Does your favorite web service have a crappy interface? Make your own with Python, Python-Requests and BeautifulSoup!

There are some pretty useful sites out there, but some interfaces are just plain annoying.
Take pof.com for example: they have millions of users, but haven’t touched their interface since the beginning; if you get lots of messages, going through them all quickly becomes a pain.
I figured it might be easier to use my hacking skills to create my own interface.
Step 1: Login
First take a look at the source code of the form on the login page:

<form action="https://www.pof.com/processLogin.aspx" method="post" id="frmLogin" name="frmLogin" class="form right">
	<div id="login-box">
		<input name="url" id="url" class="title" type="hidden">
		<input name="username" id="username" class="title input" type="text" value="l33tman">
        <label class="headline txtBlue size12 label username" for="username">Username</label>
		<input name="password" id="password" class="title input" type="password">
		<label class="headline txtBlue size12 label password" for="password">Password</label>
        <script type="text/javascript">
            var nowt = new Date(),
                tempt_F = nowt.getTimezoneOffset();
            document.write('<input type=\'hidden\' value=\'' + tempt_F + '\' name=\'tfset\'/>');
        </script><input type="hidden" value="300" name="tfset">
		<input name="login" id="login" class="button norm-blue submit" type="submit" value="Check Mail!">
        <input name="callback" id="callback" type="hidden" value="http%3a%2f%2fwww.pof.com%2fstart.aspx">
        <input name="sid" id="sid" type="hidden" value="wcqugtcmwbpb2rvn345x4mxk">
	</div>
    <script type="text/javascript">
        if (document.getElementsByTagName("html").lang == undefined || document.getElementsByTagName("html").lang == null) {
            var html = document.getElementsByTagName("html")[0];
            html["lang"] = "en";
        }
    </script>
</form>

We will use python-requests to make all our requests with a simulated user session. See http://docs.python-requests.org/en/latest/ for more details.
We will start by passing in all those input values to requests.post:

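import requests

username = "your_username"  # placeholder, use your own login
password = "your_password"  # placeholder, use your own password

# A session object keeps cookies between requests.
session = requests.session()
payload = dict(username=username,
               password=password,
               tfset="300",
               callback="http%3a%2f%2fwww.pof.com%2finbox.aspx",
               sid="wcqugtcmwbpb2rvn345x4mxk")
response = session.post("https://www.pof.com/processLogin.aspx", data=payload)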

By using session.post instead of the plain requests.post, we retain all the cookie information necessary to simulate an actual logged-in user.
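
The sid value above was copied by hand from one page load, and it is presumably tied to that visit (the full script further down uses a different one). Here is a minimal sketch of reading the hidden fields out of the form instead. It uses the standard library's html.parser as a stand-in for BeautifulSoup, the HiddenInputCollector class and the sample HTML are made up for illustration, and it is written for Python 3:

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'input' and attrs.get('type') == 'hidden' and attrs.get('name'):
            # Inputs without a value attribute are stored as an empty string.
            self.fields[attrs['name']] = attrs.get('value') or ''


# Trimmed-down copy of the login form shown above.
sample_form = '''
<form action="https://www.pof.com/processLogin.aspx" method="post">
    <input name="url" id="url" type="hidden">
    <input name="username" id="username" type="text" value="l33tman">
    <input type="hidden" value="300" name="tfset">
    <input name="sid" id="sid" type="hidden" value="wcqugtcmwbpb2rvn345x4mxk">
</form>
'''

collector = HiddenInputCollector()
collector.feed(sample_form)
# Merge the scraped hidden fields with the credentials (placeholders here).
payload = dict(collector.fields, username='your_username', password='your_password')
print(payload)
```

In a real run the HTML would come from a session.get of the login page, so that the cookies and the sid belong to the same session.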
Step 2: Collect the message links
BeautifulSoup makes parsing html extremely simple. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/ for the docs.
Say we have an html string and we’d like to find all the html elements with the “message” class. Here’s how we would do that with BeautifulSoup:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html)
for message in soup.find_all('a', 'message'):
    pass  # process your message here

In step 1 we logged into pof.com and got a response object. We can pass the html contents of this object to BeautifulSoup to begin parsing.
For our case we need the next_page link and the links to the messages (the html code of POF is terrible, so some hackery was necessary to get the elements properly):

import re

soup = BeautifulSoup(response.text)
next_page = soup.find('a', text='Next Page').attrs['href']
message_links = []
for message_link in soup.find_all(attrs={'href': re.compile('viewallmessages.*')}):
    message_links.append(message_link.attrs['href'])
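
The href values collected here are presumably relative paths, since the full script below prefixes them with the site URL. Here is a small sketch of turning them into absolute URLs with the standard library's urljoin. It is written for Python 3, and the sample hrefs are invented:

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.pof.com/'

# Hypothetical hrefs, shaped like the ones the regex above would match.
hrefs = ['viewallmessages.aspx?id=1', '/viewallmessages.aspx?id=2']

# urljoin handles hrefs both with and without a leading slash.
absolute_links = [urljoin(BASE_URL, href) for href in hrefs]
print(absolute_links)
```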

Step 3: Collect the content
We now need to go to each link and fetch the message content and user data.
Continuing to use the session object for all requests, we get:

def parse_all_messages(links):
    messages = []
    for link in links:
        comment_page = session.get(link)
        soup = BeautifulSoup(comment_page.text)
        for message in soup.find_all(attrs={'style': re.compile('width:500px.*')}):
            user = soup.find('span', 'username-inbox')
            user_image_url = soup.find('td', attrs={'width':"60px"}).img.attrs['src']
            messages.append(dict(user_username=clean_string(user.text),
                                 user_url=pof_url(user.a.attrs['href']),
                                 user_image_url=user_image_url,
                                 date=user.parent.find('div').text,
                                 message=clean_string(message.text)))
    return sorted(messages, key=lambda m: to_date(m['date']), reverse=True)
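
The sort on the last line relies on a to_date helper (defined in the full script below) that parses the date text. Sorting the raw strings would not work, because as plain text '12/31/2012 ...' compares greater than '1/5/2013 ...'. Here is a self-contained sketch of the parse-and-sort step; the sample messages are invented, and the date format is the one the full script assumes:

```python
from datetime import datetime


def to_date(date_string):
    # Same format string as the full script: month/day/year, 12-hour clock.
    return datetime.strptime(date_string, '%m/%d/%Y %I:%M:%S %p')


# Made-up messages for illustration.
messages = [
    {'date': '1/5/2013 9:15:00 AM', 'message': 'older'},
    {'date': '1/5/2013 9:15:00 PM', 'message': 'newest'},
    {'date': '12/31/2012 11:59:59 PM', 'message': 'oldest'},
]

newest_first = sorted(messages, key=lambda m: to_date(m['date']), reverse=True)
print([m['message'] for m in newest_first])
```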

Step 4: Pretty Print the data
I have opted to use Jinja2 to render the html, but this is not at all necessary. Jinja2 is a simple templating library that is used in many python web frameworks. See http://jinja.pocoo.org/docs/ for a more in depth tutorial.
It’s fairly simple to use:

>>> from jinja2 import Template
>>> template = Template('Hello {{ name }}!')
>>> template.render(name='John Doe')
u'Hello John Doe!'

Be careful to properly encode your strings when using Jinja2. POF has some malformed characters which required cleaning the strings with string.encode('ascii', 'ignore').
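
Here is a quick illustration of what that cleanup does, using an invented sample string. One assumption to note: on Python 3, encode returns bytes, so this sketch decodes back to text, which the Python 2 script in this post does not need to do:

```python
def clean_string(text):
    # Drop every character that has no ASCII representation.
    return text.encode('ascii', 'ignore').decode('ascii')


# The accented e and the dash are both non-ASCII and get removed.
sample = u'caf\u00e9 \u2013 hello'
print(repr(clean_string(sample)))
```

The characters are dropped silently, so this is only appropriate when losing them is acceptable.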
Step 5: Run it!
Below is the script in its entirety.

####################################################################
# my_pof_messages.py
#
# A simple script to scrape your pof messages and
# print them to single html file. Also outputs to json.
#
# Usage:
# sudo pip install beautifulsoup4 requests jinja2
# python my_pof_messages.py <username> <password> <output_prefix>
# firefox output_prefix.html
#
# Author:
# Ramin Rahkhamimov
# ramin32@gmail.com
# http://raminrakhamimov.com
#####################################################################
import requests
from bs4 import BeautifulSoup
import re
from jinja2 import Template
import json
import sys
from datetime import datetime
pof_url = lambda x: "https://www.pof.com/%s" % x
session = requests.session()
def append_message_links(e, links):
    soup = BeautifulSoup(e.text)
    for a in soup.find_all(attrs={'href': re.compile('viewallmessages.*')}):
        links.append(pof_url(a.attrs['href']))
    next_page = soup.find('a', text='Next Page')
    return next_page and pof_url(next_page.attrs['href'])
def get_all_message_links(username, password):
    links = []
    payload = dict(username=username,
                   password=password,
                   tfset="300",
                   callback="http%3a%2f%2fwww.pof.com%2finbox.aspx",
                   sid="ikdnixh1pblvis1dlqaa0mb3")
    e = session.post(pof_url("processLogin.aspx"), data=payload)
    next_page = append_message_links(e, links)
    while next_page:
        e = session.get(next_page)
        next_page = append_message_links(e, links)
    return set(links)
def clean_string(string):
    return string.encode('ascii', 'ignore')
def to_date(date_string):
    return datetime.strptime(date_string, '%m/%d/%Y %I:%M:%S %p')
def parse_all_messages(links):
    messages = []
    for link in links:
        comment_page = session.get(link)
        soup = BeautifulSoup(comment_page.text)
        for message in soup.find_all(attrs={'style': re.compile('width:500px.*')}):
            user = soup.find('span', 'username-inbox')
            user_image_url = soup.find('td', attrs={'width':"60px"}).img.attrs['src']
            messages.append(dict(user_username=clean_string(user.text),
                                 user_url=pof_url(user.a.attrs['href']),
                                 user_image_url=user_image_url,
                                 date=user.parent.find('div').text,
                                 message=clean_string(message.text)))
    return sorted(messages, key=lambda m: to_date(m['date']), reverse=True)
def save_messages(messages, prefix):
    template = Template("""
    <html>
    <head>
        <style>
            .user, .message, .date {
                display: inline-block;
                vertical-align: top;
            }
            .message {
                width: 500px;
                padding-left: 10px;
            }
        </style>
    </head>
    <body>
        <ol>
        {% for message in messages %}
            <li>
            <a href="{{message.user_url}}" class="user">
            <img src="{{message.user_image_url}}"/>
            <div>
                {{message.user_username}}
            </div>
            </a>
            <div class="message">
                {{message.message}}
            </div>
            <div class="date">
                {{message.date}}
            </div>
            </li>
        {% endfor %}
        </ol>
    </body>
    </html>
    """)
    with open('%s.html' % prefix, 'w') as f:
        f.write(template.render(messages=messages))
    with open('%s.json' % prefix, 'w') as f:
        f.write(json.dumps(messages))
if __name__ == '__main__':
    if len(sys.argv) != 4:
        print "Usage: my_pof_messages.py <username> <password> <output_prefix>"
        sys.exit(1)
    links = get_all_message_links(sys.argv[1], sys.argv[2])
    messages = parse_all_messages(links)
    save_messages(messages, sys.argv[3])

Install requests, beautifulsoup4 and jinja2 and run with python. Depending on your inbox size, this may take a couple of minutes. Once the script is done running, open the newly created html file with your favorite browser:

sudo pip install requests beautifulsoup4 jinja2
python my_pof_messages.py your_username your_password output
firefox output.html

This script can be easily tweaked to be used with your favorite service provider.

Push notifications with Flask, EventSource and Redis

  1. Save the file below as push_notifications.py
  2. Run python push_notifications.py
  3. Open a bunch of browsers pointing to http://localhost:5000/
  4. Open up each browser’s console window.
  5. Hit submit in one, the rest should update.

push_notifications.py:

import flask, redis
app = flask.Flask(__name__)
red = redis.StrictRedis(host='localhost', port=6379, db=0)
def event_stream():
    pubsub = red.pubsub()
    pubsub.subscribe('notifications')
    for message in pubsub.listen():
        print message
        yield 'data: %s\n\n' % message['data']
@app.route('/post', methods=['POST'])
def post():
    red.publish('notifications', 'Hello!')
    return flask.redirect('/')
@app.route('/stream')
def stream():
    return flask.Response(event_stream(),
        mimetype="text/event-stream")
@app.route('/')
def index():
    return '''
<html>
<head>
    <script>
        var source = new EventSource('/stream');
        source.onmessage = function (event) {
             console.log(event.data);
        };
    </script>
</head>
<body>
    <form method="POST" action="/post">
        <input type="submit"/>
    </form>
</body>
</html>
'''
if __name__ == '__main__':
    app.debug = True
    app.run(threaded=True)

You need to have Flask and redis installed:

sudo apt-get install python-pip redis-server
sudo pip install flask redis
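
The server-sent events wire format that EventSource reads is line-based: each event is one or more "data:" lines followed by a blank line. Here is a small sketch of a formatter that also copes with multi-line payloads; format_sse is a hypothetical helper name, not part of Flask or redis:

```python
def format_sse(data):
    """Format one server-sent event (hypothetical helper)."""
    # Each line of the payload needs its own "data:" prefix.
    lines = ['data: %s' % line for line in (str(data).splitlines() or [''])]
    # A blank line marks the end of the event.
    return '\n'.join(lines) + '\n\n'


print(repr(format_sse('Hello!')))
print(repr(format_sse('line one\nline two')))
```

A generator like event_stream could yield format_sse(message['data']) instead of building the string by hand.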

Post to Facebook on Behalf of Your User with Python/Django

Posting on facebook through python is fairly simple: you need to acquire your user’s facebook access token, which you then use to post to the user’s facebook wall.
Step 1: Create the facebook app
First, create a new facebook app on https://developers.facebook.com/apps and note down the APP_ID and APP_SECRET. Store this information in your django settings.py file.
Step 2: Acquire an oauth code
To acquire an oauth code, simply redirect your user to the following url:
https://graph.facebook.com/oauth/authorize along with these arguments:

'client_id': Your facebook app id
'redirect_uri': The url of your web app which will handle the redirect back from facebook
'scope': types of privileges to request from the user

We get:

def acquire_facebook_oauth_code(request):
    redirect_uri = 'http://your_cool_web_app/handle_facebook_redirect/'
    attrs = {'client_id': settings.FACEBOOK_APP_ID,
             'redirect_uri': redirect_uri,
             'scope':'offline_access,publish_stream,manage_pages'}
    code_url = 'https://graph.facebook.com/oauth/authorize?%s'  % urllib.urlencode(attrs)
    return redirect(code_url)
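
To see what the redirect URL looks like outside of Django, here is a standalone sketch of the same URL building. It is written for Python 3, where urlencode lives in urllib.parse. build_authorize_url is a hypothetical helper, and the app id is a placeholder:

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorize_url(app_id, redirect_uri, scope):
    # urlencode percent-escapes the redirect URI and the commas in scope.
    query = urlencode({'client_id': app_id,
                       'redirect_uri': redirect_uri,
                       'scope': scope})
    return 'https://graph.facebook.com/oauth/authorize?%s' % query


url = build_authorize_url('1234567890',  # placeholder app id
                          'http://your_cool_web_app/handle_facebook_redirect/',
                          'offline_access,publish_stream,manage_pages')
print(url)
# Round-trip check: the original values come back out of the query string.
print(parse_qs(urlparse(url).query)['redirect_uri'][0])
```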

Step 3: Handle the redirect from facebook
Upon successful signin and approval of your user, facebook will redirect back to the ‘redirect_uri’ used above, with an additional ‘code’ argument in the following format:
http://your_cool_web_app/handle_facebook_redirect/?code=……………
We can now use this code to acquire the access_token:

def handle_facebook_redirect(request):
    code = request.GET.get('code')
    attrs = {'client_id': settings.FACEBOOK_APP_ID,
             'client_secret': settings.FACEBOOK_APP_SECRET,
             'code': code }
    access_token_url = 'https://graph.facebook.com/oauth/access_token?%s' % urllib.urlencode(attrs)
    r = requests.get(access_token_url)
    access_token = urlparse.parse_qs(r.content)['access_token'][0]
    # save the access_token for your user
    user_profile = request.user.get_profile()
    user_profile.facebook_access_token = access_token
    user_profile.save()
    return redirect('facebook_succeeded_page')
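
The parse_qs call treats the response body as a query string. Here is a sketch of that step on an invented response body. The token and expiry values are placeholders, and the real response is whatever facebook returns for your app. It is written for Python 3, where parse_qs lives in urllib.parse:

```python
from urllib.parse import parse_qs

# Invented response body, shaped like a URL-encoded token response.
body = 'access_token=EXAMPLE_TOKEN&expires=5183999'

fields = parse_qs(body)
# parse_qs maps each key to a list of values, hence the [0].
access_token = fields['access_token'][0]
print(access_token)

# Indexing a missing key raises KeyError; .get() returns None instead.
print(fields.get('error'))
```

If the code has expired or is wrong, the body will not contain access_token, so the view above would benefit from handling that case before saving anything.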

Step 4: Final step, use the access token to share on the facebook wall
Now that we have the access token saved, we can update the user’s status by sending a post request to https://graph.facebook.com/feed:

@login_required
def share_on_facebook(request):
    payload = dict(message="Your fancy facebook wall status",
                   access_token=request.user.get_profile().facebook_access_token)
    req = requests.post('https://graph.facebook.com/feed', data=payload)
    return redirect('facebook_share_succeeded_page')

Make sure to import urllib, urlparse and requests when using the above code. The first two are built into python but requests can be installed with:

sudo pip install requests

Crop your images with CSS!


Instead of tracking down your language’s fancy image libraries, you can crop images very quickly with css.
Simply use negative absolute positioning!
HTML:

<div class="crop">
    <img src="https://www.google.com/images/srpr/logo3w.png" alt="" />
</div>

CSS:

.crop  {
    height: desiredImageHeight;
    width: desiredImageWidth;
    overflow: hidden;
    position: relative;
}
.crop img{
    position: absolute;
    top: -pixelsToRemoveFromTop;
    left: -pixelsToRemoveFromLeft;
}

If you don’t know the size of the image beforehand, you can use some javascript to compute the height and width:

// Assumes img is inside a div with same width and height and overflow: hidden
function centerCrop(img, width, height) {
    function repositionImgHelper() {
        var top = (this.height - height)/2;
        var left = (this.width - width)/2;
        if ( top > 0 ) {
            img.css('top', -top+'px');
        }
        if (left > 0) {
            img.css('left', -left+'px');
        }
        return true;
    }
    function repositionImg(imgPath) {
        var myImage = new Image();
        myImage.name = imgPath;
        myImage.onload = repositionImgHelper;
        myImage.src = imgPath;
    }
    repositionImg(img.attr('src'));
}
$(function () {
    var jqueryImageObject = $('.crop img');
    centerCrop(jqueryImageObject, desiredWidth, desiredHeight);
});
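
The arithmetic in repositionImgHelper is easy to check in isolation. Here is a sketch of the same calculation in Python, used only to verify the numbers. center_crop_offsets is a hypothetical name and the image sizes are arbitrary examples:

```python
def center_crop_offsets(img_width, img_height, crop_width, crop_height):
    """Return (top, left) CSS offsets in pixels that center the crop window."""
    top = (img_height - crop_height) / 2.0
    left = (img_width - crop_width) / 2.0
    # Like the JavaScript above, only shift when the image is larger than the crop.
    return (-top if top > 0 else 0, -left if left > 0 else 0)


# A 275x95 image cropped to 100x50.
print(center_crop_offsets(275, 95, 100, 50))
# An image smaller than the crop window is left where it is.
print(center_crop_offsets(50, 50, 100, 100))
```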

Setting up iOS Push Notifications (APNS) with Python/Django through pyapns


Working with iOS for the first time can be a bit frustrating. For me, the most frustrating part was working with developer.apple.com.
Here I hope to help make push notifications with Python as painless as possible.

Step 1: Enable for Apple Push Notification service and generate a certificate

  1. To get started, log in to your developer account.
  2. Click on iOS Provisioning Portal
  3. Click on App IDs
  4. If you already have an app id, click on “configure” otherwise create a new one, and then, click “configure”.
  5. Check “Enable for Apple Push Notification service”, then click the “Configure” button under “Development Push SSL Certificate”.
  6. Launch Keychain Access  (Command + Space + “Keychain Access”)
  7. Within the Keychain Access drop down menu, select Certificate Assistant, then select Request a Certificate from a Certificate Authority
  8. In the Certificate Information window, enter the following information:
    1. In the User Email Address field, enter your email address
    2. In the Common Name field, create a name for your private key
    3. In the Request is group, select the “Saved to disk” option
    4. Click Continue within Keychain Access to complete the CSR generating process
    5. Upload the CSR file you just created in the new developer.apple.com dialog window.
  9. Apple will now generate an SSL certificate for you to download. Download and open the newly generated certificate.
  10. This should open up Keychain Access, in which you will see “Apple Development Push Services …”; right click on this item and select “Export”. Save it as “apns-dev-cert.p12”.
  11. Now, expand “Apple Development Push Services …”. Right click on the private key. Export this one as well, as “apns-dev-key.p12”.

Step 2: Convert the certificate to PEM format

  1. Load up Terminal (Command + Space + “Terminal”), and cd into the directory where you saved your *.p12 files.
  2. Convert the p12 files to PEM format with:
    openssl pkcs12 -clcerts -nokeys -out apns-dev-cert.pem -in apns-dev-cert.p12
    openssl pkcs12 -nocerts -out apns-dev-key.pem -in apns-dev-key.p12
    
  3. Remove the password with:
    openssl rsa -in apns-dev-key.pem -out apns-dev-key-noenc.pem
    
  4. Combine the key and cert files into apns-dev.pem, which we will use when connecting to APNS with Python:
    cat apns-dev-cert.pem apns-dev-key-noenc.pem > apns-dev.pem
    
  5. Upload apns-dev.pem to the server that will be sending out the push notifications.
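
Before uploading, it can be worth checking that the combined file really contains both a certificate and a private key. This is a rough sketch that only looks at the PEM "BEGIN" markers. The sample text is a stand-in with elided bodies, not real key material, and the exact private key label may differ between OpenSSL versions:

```python
import re

PEM_BLOCK = re.compile(r'-----BEGIN ([A-Z0-9 ]+)-----')


def pem_block_types(pem_text):
    """List the block types found in a PEM file, in order."""
    return PEM_BLOCK.findall(pem_text)


# Stand-in for the contents of apns-dev.pem.
sample = (
    '-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n'
    '-----BEGIN RSA PRIVATE KEY-----\nMIIE...\n-----END RSA PRIVATE KEY-----\n'
)

types = pem_block_types(sample)
print(types)
# True when both a certificate and some kind of private key are present.
print('CERTIFICATE' in types and any(t.endswith('PRIVATE KEY') for t in types))
```

For the real file, pass open('apns-dev.pem').read() instead of the sample string.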

Step 3: Set up your iOS app to receive notifications

  1. In your AppDelegate.m file, override the following methods:
    - (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
    {
        UIRemoteNotificationType allowedNotifications = UIRemoteNotificationTypeAlert
        | UIRemoteNotificationTypeSound
        | UIRemoteNotificationTypeBadge;
        [[UIApplication sharedApplication] registerForRemoteNotificationTypes:allowedNotifications];
        return YES;
    }
    - (void)application:(UIApplication *)application
    didRegisterForRemoteNotificationsWithDeviceToken:(NSData *)deviceToken
    {
        NSString * tokenAsString = [[[deviceToken description]
                                     stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"<>"]]
                                     stringByReplacingOccurrencesOfString:@" " withString:@""];
        NSURL* url = [NSURL URLWithString:[NSString stringWithFormat:@"http://your_server.com/add_device_token/%@/", tokenAsString]];
        NSMutableURLRequest* request = [[NSMutableURLRequest alloc] initWithURL:url];
        [NSURLConnection connectionWithRequest:request delegate: self];
    }
    

    The first method enables push notifications; the second one sends each device’s unique device token to your django app. Make sure to update the url to your django server.
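
For reference, here is the string cleanup from the second method, sketched in Python with an invented token. It assumes [deviceToken description] prints the bytes as hex groups inside angle brackets, which is what the Objective-C code above relies on:

```python
def clean_device_token(description):
    """Mimic the Objective-C cleanup: trim <> from the ends, drop spaces."""
    return description.strip('<>').replace(' ', '')


# Invented 32-byte token in the "<xxxxxxxx xxxxxxxx ...>" description format.
raw = '<740f4707 bebcf74f 9b7c25d4 8e335894 5f6aa01d a5ddb387 462c7eaf 61bb78ad>'
token = clean_device_token(raw)
print(token)
print(len(token))
```

The result is a plain lowercase hex string, which is the shape the server-side URL route expects.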

Step 4: Add the necessary code to your Django app to store new iOS device tokens

  1. Add the following model to your models.py file:
    class DeviceToken(models.Model):
        token = models.CharField(max_length=255, unique=True)
        def __unicode__(self):
            return self.token
    
  2. Add the following url route to your Django urls file:
    url(r'^add_device_token/(?P<token>[0-9a-z]+)/$', 'add_device_token'),
    
  3. Add the following to your views.py file:
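    from django.db import IntegrityError
    from django.http import HttpResponse

    def add_device_token(request, token):
        try:
            device_token = models.DeviceToken(token=token)
            device_token.save()
        except IntegrityError:
            # The unique constraint on token rejects duplicates.
            return HttpResponse('Token already added!')
        return HttpResponse('success')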
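
The route pattern only accepts tokens made of digits and lowercase letters. Here is a quick standalone check with the re module, assuming the named group is called token to match the view's argument:

```python
import re

ROUTE = re.compile(r'^add_device_token/(?P<token>[0-9a-z]+)/$')

match = ROUTE.match('add_device_token/740f4707bebcf74f/')
print(match.group('token'))

# Uppercase characters or a missing trailing slash do not match.
print(ROUTE.match('add_device_token/740F4707/'))
print(ROUTE.match('add_device_token/740f4707'))
```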
    

Step 5: Set up pyapns

  1. Install pyapns with:
    sudo pip install pyapns
    
  2. Ensure that pyapns is properly installed by loading up the python interpreter and running:
    $ python
    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pyapns
    >>>
    
  3. Start the pyapns server (you should also add this to your server startup script):
    twistd -r epoll web --class=pyapns.server.APNSServer --port=7077
    
  4. Create a util.py file in your project, if it doesn’t already exist, and add the following class into it:
    import pyapns
    import xmlrpclib
    class PyapnsWrapper(object):
        def __init__(self, host, app_id, apns_certificate_file, mode='sandbox'):
            self.app_id = app_id
            pyapns.configure({'HOST': host})
            pyapns.provision(app_id,
                             open(apns_certificate_file).read(),
                             mode)
        def notify(self, token, message):
            try:
                pyapns.notify(self.app_id,
                              token,
                              {'aps':{'alert': message}})
            except xmlrpclib.Fault, e:
                print e
    
  5. Add the correct APNS variables to your settings.py file:
    APP_ID = 'your_app' # MAKE SURE THIS DOESN'T CONTAIN ANY PERIODS!
    APNS_HOST = 'http://localhost:7077/'
    APNS_CERTIFICATE_LOCATION = '/path/to/apns-dev.pem' # Created in step 2
    
  6. Next in the file where you would like to fire push notifications add the following:
    from django.conf import settings
    import util
    from models import DeviceToken
    pyapns_wrapper = util.PyapnsWrapper(settings.APNS_HOST,
                                        settings.APP_ID,
                                        settings.APNS_CERTIFICATE_LOCATION)
    def send_notifications(message):
        for token in DeviceToken.objects.all():
            pyapns_wrapper.notify(token.token, message)
    
  7. Now just call send_notifications("your lovely message"). This should make all your iOS devices magically receive the message! Note that only those devices whose serial numbers have been added to the development provisioning profile will receive messages.
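
The dictionary passed to pyapns.notify is the APNS payload, and the 'aps' entry can carry more than the alert text. Here is a sketch of building a payload with a badge and a sound, plus a rough size estimate. build_payload and payload_size are hypothetical helpers, and the size is only an estimate, since how pyapns serializes the payload internally is not something this sketch knows. APNS limits the payload size, so check Apple's documentation for the limit that applies to you:

```python
import json


def build_payload(message, badge=None, sound=None):
    """Build an APNS payload dict like the one passed to pyapns.notify."""
    aps = {'alert': message}
    if badge is not None:
        aps['badge'] = badge
    if sound is not None:
        aps['sound'] = sound
    return {'aps': aps}


def payload_size(payload):
    # Compact JSON encoded as UTF-8, as an estimate of the size on the wire.
    return len(json.dumps(payload, separators=(',', ':')).encode('utf-8'))


payload = build_payload('your lovely message', badge=1, sound='default')
print(payload)
print(payload_size(payload))
```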

Step 6: Enjoy!

How to Make a Bootable Linux USB on Mac OS X


I tried using Disk Utility and didn’t get anywhere. The dd command, familiar from Linux and also available in the OS X Terminal, seems to work best.
Step 1.
Open up the Terminal app and enter in:
diskutil list
Figure out what your usb device is called – mine was called /dev/disk1.
Step 2.
Unmount the disk:
diskutil unmountDisk /dev/disk1
Step 3.
Copy the iso to the USB:
dd if=image.iso of=/dev/disk1 bs=8192
Step 4.
Eject the disk:
diskutil eject /dev/disk1

Threading Example in Python

Running multiple threads in Python is actually quite simple:

import threading
def print_items(items, thread_name=None):
    print "Thread %s started..." % thread_name
    print items
    print "Thread %s ended..." % thread_name
for i in xrange(10):
    items = range(10)
    # Create a new thread which will run print_items, passing in args and kwargs. Starts immediately.
    threading.Thread(target=print_items, args=[items], kwargs={'thread_name': i}).start()

Running it shows the output of the ten threads interleaved, since they all write to stdout at the same time:

ramins-Mac-Pro:~ ramin$ python threading_example.py
Thread 0 started...
[Thread 1 started...
[Thread 2 started...
00, , 1, 1, [0Thread 3 started...
, 22, 3[0Thread 4 started...
, , 3[0, 1, 14Thread 5 started...
, 41, , , , 52, 3, 4, , 5[0, 56, , 67, , 7, , 22, Thread 6 started...
8, 1, 6, 3[08, 3, , 4, 42, , 59, 3, 46, 7, , Thread 7 started...
]
, 597, 6]
[0, , 8, , 85, , 6Thread 1 ended...Thread 0 ended..., 9, Thread 8 started...
]
9]
1, Thread 3 ended...1
, , 2, Thread 2 ended...2
, 3[0
7, 7, 3, 48, 9], ,
, 15, 46, 7, Thread 4 ended...8
, , 28, 3, , 4, , 55, 9],
6, 9Thread 6 ended...
7, 8, 9]6
, 7]
, 8 Thread 9 started...
Thread 5 ended...
, Thread 7 ended...
9][0,
Thread 8 ended...1, 2, 3, 4
, 5, 6, 7, 8, 9]
Thread 9 ended...
ramins-Mac-Pro:~ ramin$
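
The output is jumbled because nothing coordinates the writes to stdout. Here is a sketch of the same example with a threading.Lock around the prints and a join at the end, so each thread's lines stay together and the main thread waits for all of them. The finished list is an addition for illustration only:

```python
import threading

print_lock = threading.Lock()
finished = []


def print_items(items, thread_name=None):
    # Holding the lock keeps one thread's output together.
    with print_lock:
        print("Thread %s started..." % thread_name)
        print(items)
        print("Thread %s ended..." % thread_name)
        finished.append(thread_name)


threads = [threading.Thread(target=print_items,
                            args=[list(range(10))],
                            kwargs={'thread_name': i})
           for i in range(10)]
for t in threads:
    t.start()
# join() blocks until each thread has finished.
for t in threads:
    t.join()
print("All %d threads done" % len(finished))
```

The lock serializes the whole function body here, which defeats the point of threading for this toy case; in real code you would hold it only around the shared resource.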

Painless Django Settings.py Deployments

Instead of constantly changing settings.py for different environments, automate the process with some simple python:

import os
import socket
ROOT = os.path.dirname(os.path.realpath(__file__)) # Use this wherever an absolute path is necessary.
if socket.gethostname().startswith('your_production_hostname'):
    SITE_URL = 'http://...'
    DEBUG = False
    ENV = 'PRODUCTION'
else:
    SITE_URL = 'http://localhost:8000'
    DEBUG = True
    ENV = 'DEBUG'
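
One way to make the hostname switch testable is to move it into a function that takes the hostname as an argument. Here is a sketch under that assumption; settings_for is a hypothetical helper, and the production URL is a placeholder:

```python
import socket


def settings_for(hostname, production_prefix='your_production_hostname'):
    """Return the environment-specific settings for a hostname."""
    if hostname.startswith(production_prefix):
        # Placeholder URL, replace with your real site.
        return {'SITE_URL': 'https://example.com', 'DEBUG': False, 'ENV': 'PRODUCTION'}
    return {'SITE_URL': 'http://localhost:8000', 'DEBUG': True, 'ENV': 'DEBUG'}


print(settings_for('your_production_hostname-01'))
print(settings_for(socket.gethostname())['ENV'])
```

In settings.py the result could be applied with globals().update(settings_for(socket.gethostname())).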

The Python or Operator Shortcut

If statements are one of the most basic concepts used in all programming languages. Unfortunately they also make code ugly, unreadable, and sometimes even inefficient. Check out this google talk on if statements: http://www.youtube.com/watch?v=4F72VULWFvc
Using the or operator we can avoid ifs when checking for None before assignment:
Traditional if:

if foo is not None:
    x = foo
else:
    x = 'some default value'

Using the or operator:

x = foo or 'some default value'
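
One caveat: the two forms are not strictly equivalent. The or operator falls back to the default for every falsy value (0, an empty string, an empty list), not just for None. Here is a short sketch comparing the two, written for Python 3; the function names are made up for illustration:

```python
def with_or(foo, default='some default value'):
    return foo or default


def with_none_check(foo, default='some default value'):
    return foo if foo is not None else default


# The two agree for None and 'hello' but differ for 0, '' and [].
for value in [None, 'hello', 0, '', []]:
    print(repr(value), '->', repr(with_or(value)), 'vs', repr(with_none_check(value)))
```

So the shortcut is a good fit when any falsy value should trigger the default, and the explicit check is safer when 0 or an empty string is a legitimate value.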