I got a little bored this afternoon and decided to learn a bit of regex. Then I wanted to see if I could fetch the recent posts section from good ol' ez using said method. I guess it worked.
It prints the topic, who posted, and in which section. Below each one, it prints the URL. If you are running *nix, you should be able to right-click the URL in your terminal and open it in your browser.
Example output:
Re: How to find the owner name and details of a mobile number? by boriswc (General discussion)
Re: Best free proxy? by boriswc (Hacking and Security)
Feel free to remove the colors if they hurt your eyes. Just drop the red.format() and blue.format() calls and print the strings directly.
# Python 2 (print statement, urllib.urlopen)
import urllib
import re

# ANSI escape sequences for colored terminal output
red = "\033[01;31m{0}\033[00m"
blue = "\033[1;36m{0}\033[00m"

url = 'http://www.evilzone.org/'
sock = urllib.urlopen(url)
ch = sock.read()
sock.close()

# Start searching at the recent posts block
x = ch.find('<dl id="ic_recentposts" class="middletext">')
patingr = re.compile(r'<strong><a.+?>.+?</a></strong> by <a.+?>.+?</a> \(<a.+?>.+?</a>\)')
posts = patingr.findall(ch, x)
for s in posts:
    # Strip the tags to get "topic by user (section)"
    print blue.format(re.sub(r'<[^>]*>', '', s))
    urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', s)
    for l in urls:
        # Keep only the topic link; drop the session ID and the ?topicseen marker
        if "profile" not in l and ";PHPSESSID" in l:
            print red.format(''.join((l.split(';', 1)[0]).split('?topicseen', 1)))
Also, I am relatively new to regex, so keep that in mind. I know using .+? is probably not the most efficient way, but it gets the job done. That said, I would very much like to know if you guys would do it a different way.
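
For comparison, here is a rough sketch of one possible alternative to .+?: negated character classes such as [^<]+ and [^"]*, which cannot run past the next tag or closing quote, combined with named groups so the URL comes out of the same match instead of a second regex. It is written for Python 3 and runs against a made-up sample string rather than the live page. The SAMPLE markup, the example.org URLs, and the names POST_RE and recent_posts are all assumptions for illustration, and it assumes href is the first attribute of the topic link, which the real page may not guarantee.

```python
import re

# Hypothetical sample markup, shaped like what the script's pattern expects.
# The real page's HTML may differ.
SAMPLE = (
    '<dl id="ic_recentposts" class="middletext">'
    '<dt><strong><a href="http://example.org/index.php?topic=1.msg2;topicseen#new">'
    'Re: Best free proxy?</a></strong> by '
    '<a href="http://example.org/index.php?action=profile;u=7">boriswc</a> '
    '(<a href="http://example.org/index.php?board=3.0">Hacking and Security</a>)</dt>'
    '</dl>'
)

# [^"]* stops at the closing quote and [^<]+ stops at the next tag,
# so a match cannot spill across tag boundaries the way .+? can.
POST_RE = re.compile(
    r'<strong><a href="(?P<url>[^"]*)"[^>]*>(?P<topic>[^<]+)</a></strong>'
    r' by <a[^>]*>(?P<author>[^<]+)</a>'
    r' \(<a[^>]*>(?P<board>[^<]+)</a>\)'
)


def recent_posts(html):
    """Return one dict (url, topic, author, board) per matched post."""
    return [m.groupdict() for m in POST_RE.finditer(html)]


for post in recent_posts(SAMPLE):
    print("{topic} by {author} ({board})".format(**post))
    print(post["url"])
```

Since each field is captured by name, there is no need to strip tags afterwards or filter out the profile link. For anything beyond a quick script, a real HTML parser would likely hold up better than any regex when the markup changes.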