EvilZone

: [Perl]Grab Page With Socket
: lucid February 05, 2014, 03:57:21 AM
Any particular reason why this doesn't return anything? It doesn't print any errors either.

: (perl)
#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket;

my $socket = IO::Socket::INET->new(
   PeerAddr => 'www.google.com',
   PeerPort => 'http(80)',
   Proto    => 'tcp',
) or die 'Unable to get site';

print $socket "GET / HTTP/1.1";

Don't really have anything else for you, sorry.
: Re: [Perl]Grab Page With Socket
: vezzy February 05, 2014, 05:54:50 AM
You're not sending a proper HTTP GET request.

As per RFC 2616, section 5 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html):

5 Request

A request message from a client to a server includes, within the first line of that message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use.

        Request    = Request-Line              ; Section 5.1
                        *(( general-header        ; Section 4.5
                         | request-header         ; Section 5.3
                         | entity-header ) CRLF)  ; Section 7.1
                        CRLF
                        [ message-body ]          ; Section 4.3

5.1 Request-Line

The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

        Request-Line   = Method SP Request-URI SP HTTP-Version CRLF

Therefore the working code would be:

: (Perl)
#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket;

my $socket = IO::Socket::INET->new(
   PeerAddr => 'www.google.com',
   PeerPort => 'http(80)',
   Proto    => 'tcp',
) or die 'Unable to get site';

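# The Request-Line and every header line end with CRLF.
print $socket "GET / HTTP/1.1\r\n";
# HTTP/1.1 requests must carry a Host header (RFC 2616, section 14.23).
print $socket "Host: www.google.com\r\n";
# An empty line marks the end of the request headers.
print $socket "\r\n";
print my $rx_line = <$socket>;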

The final line prints only the first line of the server's response: the status line, e.g. HTTP/1.1 200 OK.
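To pull down the whole response rather than just the status line, something along these lines should work. It's only a sketch: build_request and fetch_page are made-up helper names, and it assumes plain HTTP on port 80. The "Connection: close" header asks the server to hang up once it's done, so reading until end-of-file picks up the full response. Keep in mind that an HTTP/1.1 server is allowed to send the body with chunked transfer encoding, which this sketch does not decode.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: build a minimal HTTP/1.1 GET request.
# Every line ends with CRLF and an empty line terminates the headers.
sub build_request {
    my ($host, $path) = @_;
    return join "\r\n",
        "GET $path HTTP/1.1",
        "Host: $host",
        "Connection: close",
        "", "";
}

# Hypothetical helper: send the request and return the raw response
# (status line, headers and body) as a single string.
sub fetch_page {
    my ($host, $path) = @_;

    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,          # assumption: plain HTTP, no TLS
        Proto    => 'tcp',
    ) or die "Unable to connect to $host: $@";

    print {$socket} build_request($host, $path);

    local $/;                    # slurp mode: read until the server closes
    my $response = <$socket>;
    close $socket;

    return $response;
}
```

Calling print fetch_page('www.google.com', '/'); would then dump the status line, the headers and the page source in one go.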
: Re: [Perl]Grab Page With Socket
: techb February 05, 2014, 06:06:24 AM
From a drunken stupor with memes on my mind, I've been wanting to do this for a while now:
(http://i.imgur.com/nGqmQXX.jpg)


Lol, I don't have anything else to give this thread. So you went with Perl eh Lucid?
: Re: [Perl]Grab Page With Socket
: lucid February 05, 2014, 06:43:56 AM
I knew I was missing something. I know there's tons of this code on the web, but I wanted to do it without looking at anyone else's code. Otherwise it'd be too hard not to copy, and I wouldn't learn anything.

I see I was misunderstanding what this was supposed to do, however. I was attempting to grab the home page, like this code does:

: (perl)
#!/usr/bin/perl

use strict;
use warnings;
use LWP::Simple;

my $page = get('http://evilzone.org') or die 'Unable to get site';
print $page;

I see I need to read more about using sockets with Perl. Thanks for your input, vezzy.
: Re: [Perl]Grab Page With Socket
: vezzy February 05, 2014, 06:53:24 AM
If your intention is web scraping, then yeah, plain TCP sockets are too minimalistic.
: Re: [Perl]Grab Page With Socket
: lucid February 05, 2014, 06:56:32 AM
Indeed they would be, it seems. Not impossible though, I imagine. Still, it's obviously unnecessary. Thanks vezzy.
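For a rough idea of the extra work the socket route involves: once the raw response is in a string, it still has to be taken apart by hand. A minimal sketch, assuming the whole response has already been read into one scalar and the body is not chunked (split_response is a made-up name, not part of any module):

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: split a raw HTTP response into the status code,
# a hash of headers (names lowercased) and the body.
# Repeated headers simply overwrite each other in this sketch.
sub split_response {
    my ($raw) = @_;

    # Headers and body are separated by the first empty line.
    my ($head, $body) = split /\r\n\r\n/, $raw, 2;
    my ($status_line, @header_lines) = split /\r\n/, $head;

    # Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase
    my ($code) = $status_line =~ m{^HTTP/\d\.\d (\d{3})};

    my %headers;
    for my $line (@header_lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{lc $name} = $value;
    }

    return ($code, \%headers, $body);
}
```

Redirects, chunked bodies, compression and HTTPS would all need handling on top of this, which is the part LWP::Simple takes care of.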