Author Topic: [Python] Google information harvesting program  (Read 1374 times)


Offline bonewagon

  • NULL
  • Posts: 2
  • Cookies: 4
[Python] Google information harvesting program
« on: May 09, 2013, 11:22:11 pm »
I've written a small program that harvests data from Google. I named it "Pygore" (Python Google Regular Expressions).




It is like a web crawler, but a little unique, so I'll try to explain what it does:

  • To start with, it makes a user-defined Google query. The user can enter ordinary search terms or use Google operators ("Google hacking") to improve the results of the search.
  • A user-defined number of URLs is extracted from the results of the query. If the query produced fewer URLs than the number the user specified, Pygore will simply extract all of them.
  • These URLs are iterated through, and Pygore visits each of them (as a web client) and downloads the HTML source for each of the web pages.
  • Pygore then searches through the HTML for a user-defined regular expression. The matches are then extracted and dumped. Optionally, the user can also dump the URLs that the matches were found at, right beside the matches themselves. Pygore can dump the results to the terminal, a line-by-line text file, or an HTML file.
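The fetch-and-match steps above can be sketched roughly as follows. This is not Pygore's actual code: it is a minimal Python 3 sketch that assumes the URLs have already been pulled from the search results, and the names `fetch_html` and `harvest` are made up for illustration.

```python
import re
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a page as text, like a simple web client would."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def harvest(urls, pattern, limit, fetch=fetch_html):
    """Return (url, match) pairs found in the first `limit` URLs.

    `fetch` is a hypothetical hook so the download step can be swapped out.
    """
    regex = re.compile(pattern)
    results = []
    # Slicing past the end is safe, so if there are fewer URLs than
    # `limit`, all of them are simply used.
    for url in urls[:limit]:
        try:
            page = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to download
        for match in regex.finditer(page):
            results.append((url, match.group(0)))
    return results
```

Something like `harvest(urls, r"[\w.]+@[\w.]+", 20)` would then collect e-mail-looking strings from the first 20 URLs, keeping the URL next to each match.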
(I've written it using Tkinter for the GUI, and it makes use of the xgoogle library to implement the Google searching.
Pygore is split into several modules rather than being a single .py script, so I uploaded it as an attachment rather than posting the source code directly. The attachment contains the source code, although the xgoogle modules were not written by me.)
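For anyone curious how the three output options (terminal, line-by-line text file, HTML file) might look in code, here is a rough Python 3 sketch. It is not taken from the attachment: it assumes the results are held as (url, match) pairs, and the function names and the `target` values are hypothetical.

```python
import html
import sys


def format_lines(results, show_urls=False):
    """One line per match; optionally put the source URL beside it."""
    if show_urls:
        return ["{}\t{}".format(match, url) for url, match in results]
    return [match for _url, match in results]


def format_html(results, show_urls=False):
    """Render the matches as a minimal HTML list, escaping page content."""
    items = []
    for url, match in results:
        item = html.escape(match)
        if show_urls:
            safe_url = html.escape(url, quote=True)
            item += ' (<a href="{0}">{0}</a>)'.format(safe_url)
        items.append("<li>{}</li>".format(item))
    return "<html><body><ul>\n{}\n</ul></body></html>".format("\n".join(items))


def dump(results, target="terminal", path=None, show_urls=False):
    """Write results to the terminal, a text file, or an HTML file."""
    if target == "html":
        text = format_html(results, show_urls)
    else:
        text = "\n".join(format_lines(results, show_urls))
    if target == "terminal":
        sys.stdout.write(text + "\n")
    else:
        # `path` is only needed for the two file targets.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text + "\n")
```

Escaping the matches matters for the HTML output, since they come straight out of other people's pages and may contain markup.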
I think something like this can be useful, although I myself am probably never going to use it. Let me know what you think.
« Last Edit: May 10, 2013, 01:26:24 am by bonewagon »

Offline relax

  • Sir
  • ***
  • Posts: 562
  • Cookies: 114
  • The one and only
Re: [Python] Google information harvesting program
« Reply #1 on: May 10, 2013, 12:09:03 am »
Thanks, will try it out when I have time.




edit: actually, if this works, this is an awesome tool, because I have noticed that Google sometimes ignores results when you are too specific. But I know Google has had restrictions on their APIs for search results. I don't know about xgoogle though.
« Last Edit: May 10, 2013, 12:16:40 am by relax »

Offline Ragehottie

  • Knight
  • **
  • Posts: 313
  • Cookies: -9
  • Hack to learn, not learn to hack.
Re: [Python] Google information harvesting program
« Reply #2 on: May 10, 2013, 01:39:22 am »
Very cool. Love seeing this much Python on EZ!


Just a suggestion, but when the option to write directly to the terminal is selected, gray out the path message box.
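Something along these lines would do it in Tkinter. This is only a sketch, not based on Pygore's actual GUI code: the widget layout, the `target` values, and the function names are all assumptions.

```python
def path_entry_state(output_target):
    """Return the Tkinter state the path box should have for this output."""
    return "disabled" if output_target == "terminal" else "normal"


def build_demo():
    """Build a tiny window with output radio buttons and a path box."""
    # Imported here so the helper above can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    target = tk.StringVar(master=root, value="terminal")
    path_entry = tk.Entry(root)

    def refresh():
        # A disabled Entry cannot be edited and is typically drawn grayed out.
        path_entry.config(state=path_entry_state(target.get()))

    options = (("Terminal", "terminal"), ("Text file", "text"), ("HTML file", "html"))
    for label, value in options:
        tk.Radiobutton(root, text=label, variable=target, value=value,
                       command=refresh).pack(anchor="w")
    path_entry.pack(fill="x")
    refresh()  # apply the initial state
    return root
```

Running `build_demo().mainloop()` shows the path box switching between disabled and editable as the radio buttons are clicked.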
Blog: rexmckinnon.tumblr.com

Offline kenjoe41

  • Symphorophiliac Programmer
  • Administrator
  • Baron
  • *
  • Posts: 990
  • Cookies: 224
Re: [Python] Google information harvesting program
« Reply #3 on: May 10, 2013, 11:56:38 am »
Gonna try to look at the code, since I suck at GUIs in Python.
If you can't explain it to a 6 year old, you don't understand it yourself.

Offline proxx

  • Avatarception
  • Global Moderator
  • Titan
  • *
  • Posts: 2803
  • Cookies: 256
  • ФФФ
Re: [Python] Google information harvesting program
« Reply #4 on: May 10, 2013, 12:02:05 pm »
Nice tool.

However, I don't think Google is gonna like this unless you use an API key.
There are other solutions though.
Wtf were you thinking with that signature? - Phage.
This was another little experiment *evillaughter - Proxx.
Evilception... - Phage

xC

  • Guest
Re: [Python] Google information harvesting program
« Reply #5 on: May 10, 2013, 05:06:55 pm »
Yeah, looks nice. However, the new Google API only allows 100 queries a day. I don't know about the old one. +1.