Wednesday, November 24, 2010

Coding bat - Post Solutions

I am not sure whether this raises an intellectual property issue, but these are the exercises that I solved on coding bat, a site by Nick Parlante (who was an inspiration for me to look seriously at Python).

OK, I have been trying for an hour to hack my way into coding bat with my own user name and password, using the Python urllib2 module, but there seems to be something that keeps me out of their pages.

I am just trying to download their tutorials and post them on my blog (the actual reason for creating this particular blog post), so I have downloaded a Python module called mechanize, which looks very promising. I have to try it out and see whether it works...

[EDIT:]
And it works!!! This is the download link for mechanize:

http://wwwsearch.sourceforge.net/mechanize/
And here is the code that I used to get access to the web site and download the content:

#!/usr/bin/env python
import sys, os, re
from urllib2 import HTTPError

import mechanize
assert mechanize.__version__ >= (0, 0, 6, "a")
mech = mechanize.Browser()
mech.set_handle_robots(False)

try:
    mech.open("http://codingbat.com")
except HTTPError, e:
    sys.exit("%d: %s" % (e.code, e.msg))

mech.select_form(nr=0)
mech["uname"] = "your registered email id"
mech["pw"] =  "password"
mech.submit()

#s = mech.retrieve('http://codingbat.com','d:/aa.html')
#[EDIT]
# The URL below may return an error page stating that the user or tag is expired.
# If it does, log in to the web site in a browser and paste the link for the "DONE" page here.
mech.retrieve('http://codingbat.com/done','d:/url.html')

content = open('d:/url.html','r').readlines()
i=0
for line in content:
    # collect every href value found on this line
    urls = re.findall(r'href=[\'"]([^\'" >]+)', line)
    for url in urls:
        # keep only the links that point to problem pages
        if url[0:4] == '/pro':
            i+=1
            print "Processing",'http://codingbat.com'+url
            mech.retrieve('http://codingbat.com'+url,'d:/myhack/'+str(i)+'.html')

# Make sure you create the folder d:\myhack beforehand; it will be used to save the files.
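
The link-scraping loop above can also be tried out on its own, without logging in anywhere. What follows is only a rough sketch of that one step, not the exact code I ran: the function names `problem_links` and `output_path`, the sample HTML, and the `/prob/p1`-style links are all made up for illustration, and unlike the loop above it skips links it has already seen.

```python
import os
import re

BASE_URL = "http://codingbat.com"

# Captures the value of an href attribute quoted with ' or ".
HREF_RE = re.compile(r'href=[\'"]([^\'" >]+)')


def problem_links(html, prefix="/pro"):
    """Return absolute URLs for hrefs starting with prefix, in page order.

    Duplicate links are dropped (an addition; the original loop keeps them).
    """
    seen = set()
    links = []
    for href in HREF_RE.findall(html):
        if href.startswith(prefix) and href not in seen:
            seen.add(href)
            links.append(BASE_URL + href)
    return links


def output_path(folder, index):
    """Build the numbered file name used to save a downloaded page."""
    return os.path.join(folder, "%d.html" % index)


# Hypothetical page content, only here to exercise the functions.
sample = (
    '<a href="/prob/p1">one</a> '
    "<a href='/prob/p2'>two</a> "
    '<a href="/about">about</a> '
    '<a href="/prob/p1">one again</a>'
)

for n, link in enumerate(problem_links(sample), 1):
    print("%s -> %s" % (link, output_path("myhack", n)))
```

Using os.path.join for the file names avoids hard-coding the doubled slashes, and keeping the scraping in a function makes it easy to check against a saved copy of the page before pointing mechanize at the real site.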
