I got a little bored this afternoon and decided to learn a bit of regex, then to see if I could use it to fetch the recent posts section from good ol' ez. It seems to work.
It prints the topic, who posted it, and the section it was posted in, with the URL on the line below. On *nix, many terminal emulators let you right-click the URL and open it in your browser.
Example:
Re: How to find the owner name and details of a mobile number? by boriswc (General discussion)
http://evilzone.org/general-discussion/how-to-find-the-owner-name-and-details-of-a-mobile-number/msg73088/
Re: Best free proxy? by boriswc (Hacking and Security)
http://evilzone.org/hacking-and-security/best-free-proxy/msg73087/
Feel free to remove the colors if they hurt your eyes: just drop the red.format() and blue.format() calls and print the strings directly.
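For anyone wondering how the color strings work: each one wraps the text in an ANSI escape sequence, and the trailing \033[00m resets the terminal afterwards. Here is a minimal sketch of a switch that turns the colors off without deleting any calls. It is written for Python 3 (the script below is Python 2), and `colorize` and `use_color` are names I made up for illustration.

```python
# Same escape templates as the script: bold red and bold cyan.
RED = "\033[01;31m{0}\033[00m"
BLUE = "\033[1;36m{0}\033[00m"


def colorize(template, text, use_color=True):
    """Wrap text in an ANSI color template, or return it unchanged."""
    return template.format(text) if use_color else text


# Colored title line, plain URL line.
print(colorize(BLUE, "Re: Best free proxy? by boriswc (Hacking and Security)"))
print(colorize(RED, "http://evilzone.org/hacking-and-security/best-free-proxy/msg73087/",
               use_color=False))
```

Setting use_color=False everywhere gives the plain output shown in the example above.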
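# Python 2 (uses urllib.urlopen and the print statement)
import urllib
import re

# ANSI escape templates: bold red and bold cyan (the "blue"); \033[00m resets the color.
red = "\033[01;31m{0}\033[00m"
blue = "\033[1;36m{0}\033[00m"

url = 'http://www.evilzone.org/'
sock = urllib.urlopen(url)
ch = sock.read()
sock.close()

# Only search from the start of the recent posts block onwards.
x = ch.find('<dl id="ic_recentposts" class="middletext">')
patingr = re.compile(r'<strong><a.+?>.+?</a></strong> by <a.+?>.+?</a> \(<a.+?>.+?</a>\)')
posts = patingr.findall(ch, x)

print
for s in posts:
    # Strip the tags to get "topic by poster (section)".
    print blue.format(re.sub(r'<[^>]*>', '', s))
    links = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', s)
    for l in links:
        # Skip profile links; keep the topic link and cut off the session ID and ?topicseen.
        if "profile" not in l and ";PHPSESSID" in l:
            print red.format(''.join((l.split(';', 1)[0]).split('?topicseen', 1)))
print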
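The URL cleanup in the inner loop is the least obvious part, so here it is pulled out on its own. This is a small sketch in Python 3; `clean_url` is a made-up helper name, and the sample link is my guess at what a raw link looks like (the session ID is invented).

```python
def clean_url(url):
    """Drop everything from the first ';' on, then remove '?topicseen'."""
    base = url.split(';', 1)[0]
    # Same effect as ''.join(base.split('?topicseen', 1)) in the script.
    return base.replace('?topicseen', '', 1)


# Hypothetical raw link with a made-up session ID.
sample = ("http://evilzone.org/hacking-and-security/best-free-proxy/"
          "msg73087/?topicseen;PHPSESSID=abc123")
print(clean_url(sample))
# prints http://evilzone.org/hacking-and-security/best-free-proxy/msg73087/
```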
Also, I'm relatively new to regex, so keep that in mind. I know .+? is probably not the most efficient approach, but it gets the job done. That said, I'd very much like to know if you guys would do it differently.
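One possible alternative to .+?, offered as a sketch rather than the right answer: negated character classes such as [^<]+ and [^>]* cannot run past the end of a tag, and named groups pull out the fields directly, so the tag-stripping step goes away. This is Python 3. The HTML in SAMPLE is made up to match the shape the pattern above expects (the profile link in particular is invented), and `POST` and `parse_posts` are hypothetical names. The real page markup may differ.

```python
import re

# Made-up snippet; only its shape is taken from the pattern in the script.
SAMPLE = (
    '<dl id="ic_recentposts" class="middletext">'
    '<dt><strong><a href="http://evilzone.org/hacking-and-security/best-free-proxy/msg73087/">'
    'Re: Best free proxy?</a></strong> by '
    '<a href="http://evilzone.org/profile/?u=1">boriswc</a> '
    '(<a href="http://evilzone.org/hacking-and-security/">Hacking and Security</a>)</dt>'
    '</dl>'
)

# [^>]* stays inside one tag, [^<]+ stays inside one text node.
POST = re.compile(
    r'<strong><a[^>]*href="(?P<url>[^"]+)"[^>]*>(?P<topic>[^<]+)</a></strong>'
    r' by <a[^>]*>(?P<author>[^<]+)</a>'
    r' \(<a[^>]*>(?P<board>[^<]+)</a>\)'
)


def parse_posts(html):
    """Return one dict per recent post found in html."""
    return [m.groupdict() for m in POST.finditer(html)]


for post in parse_posts(SAMPLE):
    print("{topic} by {author} ({board})".format(**post))
    print(post["url"])
```

Because the topic link is captured by name, there is no need to filter out the profile link afterwards. The pattern would still break on titles containing a literal < or on markup changes, so an HTML parser is the sturdier option if this ever needs to be reliable.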
Enjoy