Author Topic: Email harvesting. (Read 1870 times)

hack3rcon · « **on:** September 22, 2015, 12:02:38 pm »

Hello.
I used "The Harvester" to find the email addresses on a web site, but it doesn't work very well and sometimes returns fake addresses. For example, "www.example.com" has the administrator's email on its front page, but the program can't find it.
Any better tools?

Thank you.

gray-fox · « **Reply #1 on:** September 22, 2015, 07:15:06 pm »

Quote from: hack3rcon on September 22, 2015, 12:02:38 pm

Any better tools?

Make yourself one that suits your needs. Could be nice little project.

proxx · « **Reply #2 on:** September 22, 2015, 07:43:51 pm »

Quote from: gray-fox on September 22, 2015, 07:15:06 pm

Make yourself one that suits your needs. Could be nice little project.

Agreed, also quite an easy one to hop on board.

ram1r3z0 · « **Reply #3 on:** September 22, 2015, 07:53:40 pm »

The Harvester is pretty good when you know what you are doing. Another interesting tool is Foca.
It parses the web page and tries to find anything interesting, including emails.
https://www.elevenpaths.com/labstools/foca/index.html

khofo · « **Reply #4 on:** September 22, 2015, 08:38:20 pm »

Quote from: ram1r3z0 on September 22, 2015, 07:53:40 pm

The Harvester is pretty good when you know what you are doing. Another interesting tool is Foca.
It parses the web page and tries to find anything interesting, including emails.
https://www.elevenpaths.com/labstools/foca/index.html

Foca is a very nice GUI-based app, though I think it's Windows only. It also grabs documents, finds similar domains, etc. It's fairly outdated though, with no recent updates.

kenjoe41 · « **Reply #5 on:** September 23, 2015, 12:52:48 am »

The tool will only get what is already there, right? So just look with your eyes. If you are tired and want to automate the process, write a quick little script to grab all the absolute URLs [URLs with that domain in them], recursively run a regex on the text in there, and you have your damn emails.
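
A minimal sketch of the regex step described above. The function name extract_emails and the file name in the usage comment are made up for illustration, and the pattern is deliberately loose, so expect some false positives.

```shell
#!/bin/bash

# extract_emails: read text on stdin and print each email-looking string once.
# The pattern is an assumption, not a full address validator; it will also
# match strings that are not real mailboxes.
extract_emails() {
    grep -E -o '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' | sort -u
}

# Usage sketch: scan a page that was saved to disk earlier (hypothetical file).
# extract_emails < page.html
```

Piping through sort -u is just a convenience so that an address repeated across the page shows up once.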

Advise me if they use any other techniques, except probably looking for more pages to scrape or something.

The script could be ready in a matter of minutes if you are a seasoned programmer, or 30 minutes tops if you are googling this stuff. That's less time than it takes to get a constructive answer from the forums, since we live in different time zones and are in different moods at certain times of the day.

Next time, a howto answer might get better results than a feed the trolls question.
Now get coding and show some seriousness. We don't learn by being lazy but by accepting advice plus criticism then we move on as we learn.

Where is that script you have written for the community?

hack3rcon · « **Reply #6 on:** September 25, 2015, 05:53:51 pm »

Can I do it via telnet to server or something like that?

gray-fox · « **Reply #7 on:** September 25, 2015, 09:14:40 pm »

I'll be nice and give you a start.
Didn't test it much but basically it should find links on given site and scrape emails from every link.

Code: (bash) [Select]

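#!/bin/bash

# Collect href="..." targets from the start page (sed keeps only the last href on each line).
links=$(curl --silent http://example.com | sed -n 's/.*href="\([^"]*\).*/\1/p')

# Fetch every link and print anything that matches the email pattern.
while IFS= read -r link; do
    curl --silent "$link" | grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"
done <<< "$links"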

Figure out how it works if you can't tell directly. Expand it and make it work better. For example, figure out how to pass the starting site as an argument, add some URL validation so it won't scrape outside the base URL, etc.
Not saying bash is in any way the optimal solution, but you can carry the idea over to Python or anything else you want. So do what Kenjoe41 said: do some coding and post what you make for the community to see.
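
One way to approach the two suggestions above (starting site as an argument, staying inside the base URL), as a rough sketch. The function name same_site_links is made up for illustration, and it only handles absolute and root-relative links; other relative forms are simply dropped.

```shell
#!/bin/bash

# same_site_links BASE: read links on stdin (one per line) and print only
# those that stay on BASE. Root-relative links ("/about") are turned into
# absolute ones; everything else (other hosts, mailto:, #anchors) is dropped.
same_site_links() {
    local base=${1%/}   # strip one trailing slash so joins do not produce "//"
    local link
    while IFS= read -r link; do
        case $link in
            "$base"|"$base"/*) printf '%s\n' "$link" ;;
            /[!/]*)            printf '%s\n' "$base$link" ;;
        esac
    done
}

# Usage sketch: take the starting site from the first script argument.
# base=${1:?usage: $0 http://example.com}
# curl --silent "$base" | grep -o 'href="[^"]*"' | cut -d'"' -f2 | same_site_links "$base"
```

Requiring a slash right after the base means a look-alike host such as http://example.com.evil.org/ is not treated as part of http://example.com.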

Edit: About that telnet. Basically yes, I guess, but it's hard to use in a script. Netcat works a bit differently in this case, and with nc you can just do

Code: [Select]

printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' | nc example.com 80

to get the source.

hack3rcon · « **Reply #8 on:** September 26, 2015, 05:33:03 pm »

Thank you.
I changed it to:

#!/bin/bash
links=$(curl --silent evilzone.org | sed -n 's/.*href="\([^"]*\).*/\1/p')
while IFS= read -r link; do
    curl "$link" 2>&1 | grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"
done <<< "$links"

but no output?

gray-fox · « **Reply #9 on:** September 26, 2015, 06:01:53 pm »

Did you figure out exactly how the script works? No output means it didn't find any matches for the email regex on evilzone.org or on any of the links found there (meaning the linked URLs, not the whole site's content). Change evilzone.org to, for example, http://pastebin.com/XLnHKAR5 (the first thing I found with lots of email addresses) to see the difference. You could also make it echo "$link" so you can see which link it's checking.

You should have figured out that it doesn't do what a normal scraper does and search from link to link; instead it only checks the links on the main page. Making it more advanced is up to you, I'm not going to do it for you. As I said in my earlier post, the script was mostly just to demonstrate the basic idea of how quickly you can do something like this, so you may not want to do it in bash. But you can apply the same core idea in different languages.

I know I'm starting to be guilty of a bit of spoonfeeding here, but I hope you, OP, surprise me this time and show some proof that you've done some research and tried to write some code on your own.
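
For reference, the link-to-link idea mentioned above could be sketched roughly as below. This is only an illustration under stated assumptions: the names fetch and crawl, the page cap and the email pattern are all made up here, it needs bash 4+ for the associative array, and it follows only absolute links that start with the base URL.

```shell
#!/bin/bash

# fetch URL: print the page body. Kept as its own function so it can be
# swapped out (for example to read saved pages while testing).
fetch() {
    curl --silent --max-time 10 "$1"
}

# crawl BASE [MAX_PAGES]: breadth-first walk over pages under BASE, printing
# every email-looking string found (sorted, duplicates removed). Only absolute
# links that start with BASE are followed, and MAX_PAGES (default 20) caps
# the number of pages fetched.
crawl() {
    local base=${1%/} max=${2:-20}
    [ -n "$base" ] || { echo "usage: crawl BASE [MAX_PAGES]" >&2; return 1; }
    local -A seen=()
    local queue=("$base") count=0 url page link

    while [ "${#queue[@]}" -gt 0 ] && [ "$count" -lt "$max" ]; do
        url=${queue[0]}
        queue=("${queue[@]:1}")                 # pop the front of the queue
        [ -n "${seen[$url]:-}" ] && continue    # skip pages already visited
        seen[$url]=1
        count=$((count + 1))

        page=$(fetch "$url")
        grep -E -o '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' <<< "$page" || true

        # Queue every link on this page that stays under BASE.
        while IFS= read -r link; do
            case $link in
                "$base"|"$base"/*) queue+=("$link") ;;
            esac
        done < <(grep -o 'href="[^"]*"' <<< "$page" | cut -d'"' -f2)
    done | sort -u
}

# Usage sketch (the URL is a placeholder):
# crawl http://example.com 50
```

The visited set and the page cap are there so the loop terminates even when pages link back to each other.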
