Author Topic: Searching bulk URL's for specific content  (Read 1873 times)


Offline radner

  • /dev/null
  • *
  • Posts: 6
  • Cookies: 0
    • View Profile
Searching bulk URL's for specific content
« on: May 07, 2013, 04:45:01 pm »
I'm not sure how to best ask this question.

A website that frequently posts webinars that I watch doesn't really have any security to lock their webinars down. Instead they just email out a link to each webinar. You don't have to be signed in to watch them.


http://media.beaconlive.com/viewitem?bckt=99865


I noticed that if I change the number at the end of the URL (99866), I find other pages. Once in a while, I'll find another webinar that I want to watch from the same company.

Is there a program or script that would allow me to search between http://media.beaconlive.com/viewitem?bckt=99800 and http://media.beaconlive.com/viewitem?bckt=99999? The search could use the .jpg file that's in the header of each page to yield a positive result. If the specific .jpg appears on one of the pages, that's a webinar I'd like to watch, since it is a unique identifier of the company.

Sounds confusing, but thanks for the help!

Offline proxx

  • Avatarception
  • Global Moderator
  • Titan
  • *
  • Posts: 2803
  • Cookies: 256
  • ФФФ
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #1 on: May 07, 2013, 04:55:23 pm »
Hello, please introduce yourself first.

Also, it would be easy to code something in Python with urllib.
Look for simple web scraper examples and write it yourself.
Not really that hard.
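
For readers following along, here is a minimal sketch of what such a urllib-based script might look like. It assumes Python 3 (where the module is `urllib.request`), the URL pattern from the first post, and the logo file name `nsc-logo.jpg` that appears in the batch script later in this thread. The function names `fetch` and `find_matches` are made up for illustration.

```python
import urllib.request
from urllib.error import URLError

# URL pattern from the first post; {} is replaced by the item number.
BASE_URL = "http://media.beaconlive.com/viewitem?bckt={}"
# Assumed marker: the logo file name used in the batch script in this thread.
MARKER = "nsc-logo.jpg"


def fetch(url, timeout=10):
    """Return the page body as text, or an empty string if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (URLError, OSError):
        return ""


def find_matches(start, end, marker=MARKER, fetcher=fetch):
    """Return the URLs numbered start..end (inclusive) whose source contains marker."""
    matches = []
    for item_id in range(start, end + 1):
        url = BASE_URL.format(item_id)
        if marker in fetcher(url):
            matches.append(url)
    return matches
```

Calling `find_matches(99800, 99999)` would request each page in turn and return the list of URLs whose source mentions the logo. The `fetcher` parameter is only there so the loop can be tried out with a stand-in function instead of real network requests.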

Offline radner

  • /dev/null
  • *
  • Posts: 6
  • Cookies: 0
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #2 on: May 07, 2013, 05:08:28 pm »
Sorry about that. Intro posted. Thanks!

Offline Kulverstukas

  • Administrator
  • Zeus
  • *
  • Posts: 6627
  • Cookies: 542
  • Fascist dictator
    • View Profile
    • My blog
Re: Searching bulk URL's for specific content
« Reply #3 on: May 07, 2013, 05:13:06 pm »
Yes, this would be very easy with python. Just google around. If you start this project and have questions then you can ask us, we won't do it for you :)

Offline radner

  • /dev/null
  • *
  • Posts: 6
  • Cookies: 0
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #4 on: May 07, 2013, 05:14:24 pm »
I'm already finding some great tips on Google. This tutorial looks pretty helpful:

http://www.newthinktank.com/2010/11/python-2-7-tutorial-pt-13-website-scraping/


Just wasn't sure what the right terminology was to use. Thanks!

Offline Rav3n

  • Serf
  • *
  • Posts: 30
  • Cookies: 5
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #5 on: May 07, 2013, 05:26:48 pm »
If you are using Windows, it will be a lot easier using batch and VBScript.
This will create a log file on your desktop containing all the links to your desired webpages.

Just make sure to leave at least one empty line at the end of the batch file, after the last line of code, or it will not work.

Code:
@Echo OFF
Color 0E

Rem Setting Variables Here
SET "IMGname=nsc-logo.jpg"                                // Word To Search For in the HTML Source
SET "RangeStart=99800"                                    // Start Range Number
SET "RangeEnd=99999"                                      // End Range Number
SET "LogFile=%userprofile%\Desktop\beaconlive.log"        // Log file That hold the Valid URLs
SET "OutFile=%temp%\HTMLsource.txt"                       // Temp File Where we save the source

Rem Delete Files from Previous Run
IF Exist "%Temp%\Download.vbs" DEL /F /Q "%Temp%\Download.vbs" >NUL
IF Exist "%OutFile%" DEL /F /Q "%OutFile%" >NUL
IF Exist "%LogFile%" DEL /F /Q "%LogFile%" >NUL

Echo.&Echo.&Echo  Downloading ...
Rem Getting The Number Sequence and Passing the URL to The Download Script
SETLOCAL EnableDelayedExpansion
For /L %%A In ( %RangeStart% 1 %RangeEnd% ) Do (
    Rem Download URL
    CALL :Download "http://media.beaconlive.com/viewitem?bckt=%%A"
    Rem Search In the Source for the IMGname, if found log it in the log file
    For /F "delims=" %%z In (' Findstr "%IMGname%" "%OutFile%" ') Do (
        IF NOT "%%z" == "" Echo Found At: http://media.beaconlive.com/viewitem?bckt=%%A>>"%LogFile%"
    )
    Rem Delete OutFile
    Del /F /Q "%OutFile%" >NUL
)

Rem All Done, End Message
Color 0A
Echo.&Echo.&Echo.&Echo.&Echo.&Echo.&Echo.&Echo.&Echo.&Echo.
Echo                                   All Done!
Echo                              Check Your Log File
Pause >Nul
Exit /B

:Download <URL>
IF NOT EXIST "%Temp%\Download.vbs" (
   (For /F "tokens=1*" %%A In ('FINDstr "^:Download: " ^< "%~F0"') DO Echo.%%B)>"%Temp%\Download.vbs"
)
CScript //nologo "%Temp%\Download.vbs" "%~1" "%OutFile%"
GOTO :EOF
:Download: Set objArgs = WScript.Arguments
:Download: url = objArgs(0)
:Download: pix = objArgs(1)
:Download: With CreateObject("MSXML2.XMLHTTP")
:Download:  .open "GET", url, False
:Download:  .send
:Download:  a = .ResponseBody
:Download:  End With
:Download:  With CreateObject("ADODB.Stream")
:Download:  .Type = 1 'adTypeBinary
:Download:  .Mode = 3 'adModeReadWrite
:Download:  .Open
:Download:  .Write a
:Download:  .SaveToFile pix, 2 'adSaveCreateOverwrite
:Download:  .Close
:Download:  End With

REM ===================================================
REM  IMPORTANT : Must Leave Empty Line After This Line
REM ===================================================




Offline Deque

  • P.I.N.N.
  • Global Moderator
  • Overlord
  • *
  • Posts: 1203
  • Cookies: 518
  • Programmer, Malware Analyst
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #6 on: May 07, 2013, 07:26:41 pm »
So you just spoonfeed radner. How should he learn anything from that?

Offline Rav3n

  • Serf
  • *
  • Posts: 30
  • Cookies: 5
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #7 on: May 07, 2013, 07:37:24 pm »
??? I didn't take that into consideration :-[
However, I learned VBScript from existing examples.

BTW, radner, this is an interactive course for learning Python:
http://interactivepython.org/courselib/static/thinkcspy/index.htm
Also, try Codecademy.com; it covers many programming languages, including Python.
« Last Edit: May 07, 2013, 07:37:44 pm by Rav3n »

Offline radner

  • /dev/null
  • *
  • Posts: 6
  • Cookies: 0
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #8 on: May 11, 2013, 01:34:52 am »
No worries. I'd rather take the time to learn Python. Thanks for the code though! Super helpful. I've been studying it and it has given me a good "road map".


Thanks!

Offline Evilone

  • /dev/null
  • *
  • Posts: 16
  • Cookies: 14
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #9 on: May 11, 2013, 09:48:46 pm »
Forget Python; take the time to learn Linux and Bash. You could code this up as a simple Bash one-liner and learn about regular expressions and Linux commands in one fell swoop.

Offline radner

  • /dev/null
  • *
  • Posts: 6
  • Cookies: 0
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #10 on: May 12, 2013, 05:17:41 am »
Thanks! Do you have any recommended reading for Linux or BASH?

Offline vezzy

  • Royal Highness
  • ****
  • Posts: 771
  • Cookies: 172
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #11 on: May 12, 2013, 05:55:09 am »
Shell scripting tutorials are all over the Internet and are only a search away. I'm pretty sure Evilzone had a brief tut on it, too...

More specifically, he's referring to Unix command-line utilities such as sed and awk, which can parse, extract and otherwise manipulate text through regexes and general commands. awk is a mini-language in and of itself, but it should be simple to pick up. Look them up.

Don't scrap Python, though. It's a very versatile language that can be a lifesaver for smaller projects and general scripting work, and it's a useful skill to have.
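
As a rough illustration of the regex-based text extraction mentioned above, here is a small Python sketch that pulls image file names out of a page's HTML. The pattern is a simplification rather than a full HTML parser, and the sample markup and the name `image_names` are invented for the example.

```python
import re

# Rough pattern for the src attribute of <img> tags; not a full HTML parser.
IMG_SRC = re.compile(r'<img[^>]*\bsrc=["\']([^"\']+)["\']', re.IGNORECASE)


def image_names(html):
    """Return the file name (last path segment) of every image referenced in html."""
    return [src.rsplit("/", 1)[-1] for src in IMG_SRC.findall(html)]


# Hypothetical page fragment, made up for this example.
sample = '<div><img class="logo" src="/images/nsc-logo.jpg"><IMG SRC=\'a/b/other.png\'></div>'
print(image_names(sample))  # prints ['nsc-logo.jpg', 'other.png']
```

Checking whether a particular logo is in that list is then a plain membership test, which is the same idea the sed/awk route expresses with a pattern match on each line.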



Offline Rav3n

  • Serf
  • *
  • Posts: 30
  • Cookies: 5
    • View Profile
Re: Searching bulk URL's for specific content
« Reply #12 on: May 12, 2013, 09:09:47 am »
This is where I learned Bash scripting; it's the easiest tutorial I have ever tried: http://lowfatlinux.com/

If you are using Windows, install Cygwin; it will let you learn Bash while still running Windows.
« Last Edit: May 12, 2013, 09:13:54 am by Rav3n »