Python, web forms and cookies
- Magnus Therning
Just the other day I finally got around to something that I’ve wanted to play around with for a fairly long time: posting web forms using Python. As an added bonus I also took a look at dealing with cookies in Python.
For posting forms there is of course a module that makes things a lot easier, mechanize, but I wanted first of all to understand how to do it myself, and secondly to avoid using anything but the standard Python modules. It turns out there isn’t much to understand. Say we have a very simple login form containing two text entries:
<form method="post" action="/login">
<label for="user_name">username</label>
<input type="text" name="user_name" id="user_name" value="" />
<label for="password">password</label>
<input type="password" name="password" id="password" class="sized" />
<input type="submit" class="button" name="login" value="log in" />
</form>
One way to post this form would be the following:
import urllib
import urllib2
login_data = urllib.urlencode({'user_name' : 'foo', 'password' : 'bar'})
resp = urllib2.urlopen('http://url.for.my.site/login', login_data)
Simple enough, I’d say. urllib2.urlopen automatically switches from GET to POST when it is given some data.
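The code in this post targets Python 2. As a rough sketch of the same idea in Python 3, where the functions live in urllib.parse and urllib.request, the switch from GET to POST can be observed without sending anything over the network. The URL and credentials are the placeholder values from the example above, not a real site.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL and credentials, as in the example above.
URL = 'http://url.for.my.site/login'

# urlencode builds the form body; Python 3 wants the body as bytes.
login_data = urlencode({'user_name': 'foo', 'password': 'bar'}).encode('ascii')

# Nothing is sent here; we only ask each request which method it would use.
get_request = Request(URL)
post_request = Request(URL, data=login_data)

print(login_data)
print(get_request.get_method())
print(post_request.get_method())
```

The request built without data reports GET and the one built with data reports POST, which is the behaviour urlopen relies on.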
On most sites a cookie is used to track whether a user is logged in or not. Extending the example above to deal with this and enable subsequent requests to the site as a logged-in user leads us to the CookieJar:
import urllib
import urllib2
import cookielib
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'user_name' : 'foo', 'password' : 'bar'})
resp = opener.open('http://url.for.my.site/login', login_data)
After this cj will hold all the cookies returned in the response. You can iterate over them like this:
for c in cj:
    print c.name, c.value
Making requests with a cookie c is simple as well, just add c to the cookie jar before making the request:
cj.set_cookie(c)
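When there is no earlier response to take c from, a cookie can also be built by hand. The following is a minimal sketch for Python 3, where cookielib is named http.cookiejar. The make_cookie helper, the cookie name and value, and the domain are all made up for illustration; only Cookie, CookieJar and set_cookie come from the standard library.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path='/'):
    """Hypothetical helper: build a plain session cookie for one domain."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


cj = CookieJar()

# set_cookie stores the cookie unconditionally, without consulting the policy.
cj.set_cookie(make_cookie('session_id', 'abc123', 'url.for.my.site'))

# Iterating over the jar yields the Cookie objects it holds.
for c in cj:
    print(c.name, c.value)
```

An opener built around this jar would then send the cookie along with matching requests to that domain.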
The cookie jar also has a policy object and a method, set_cookie_if_ok, that will set a cookie for a specific request only if the policy allows it. That is, it seems fairly simple to make sure there is no cookie leakage when making requests to multiple sites. I’ll leave playing with that for another day though.
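As a rough sketch of how that policy check might be used, here is a Python 3 example (http.cookiejar is the Python 3 name for cookielib) with a DefaultCookiePolicy that only accepts cookies for a single domain. The make_cookie helper, the cookie names and the second domain are assumptions for illustration, and no request is actually sent.

```python
from http.cookiejar import Cookie, CookieJar, DefaultCookiePolicy
from urllib.request import Request


def make_cookie(name, value, domain):
    """Hypothetical helper: a cookie as if set without a Domain attribute."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path='/', path_specified=False,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


# Only cookies for this one domain are acceptable to the policy.
policy = DefaultCookiePolicy(allowed_domains=['url.for.my.site'])
cj = CookieJar(policy=policy)

# Accepted: the cookie's domain is on the allowed list.
wanted = make_cookie('session_id', 'abc123', 'url.for.my.site')
cj.set_cookie_if_ok(wanted, Request('http://url.for.my.site/login'))

# Silently rejected: the domain is not on the allowed list.
unwanted = make_cookie('tracker', 'xyz', 'other.example.com')
cj.set_cookie_if_ok(unwanted, Request('http://other.example.com/'))

for c in cj:
    print(c.name, c.value)
```

Only the session_id cookie ends up in the jar. The same policy object is also consulted when the jar decides which cookies to send back with a request.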