Author Topic: [C++] Urlparse  (Read 2611 times)

Offline daxda

  • Peasant
  • *
  • Posts: 114
  • Cookies: 112
  • Not the guy you're looking for
    • View Profile
    • Daxda on Github
[C++] Urlparse
« on: April 17, 2014, 01:56:55 pm »
I've taken an interest in C++ but didn't have a project idea, so I thought I'd start with a port
of Python's urlparse library. The functionality is pretty simple; I'll list the methods and their purpose below.


Library contents
ParseResult UrlParse::parse(string url)
    Returns a ParseResult object made up of five parts: scheme, netloc, path,
    query and fragment.
    An example program:
    [gist]Daxda/54b543947916d5192fe5[/gist]   
    The output:
        Scheme: https
        Netloc: evilzone.org
        Path: /le/path/page.php
        Query: id=1&evil=1
        Fragment: important
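
    For readers without access to the gist, here is a minimal sketch of how such a five-way split can be done with std::string::find. It is not the library's actual code: UrlParts and split_url are hypothetical names, and the sketch assumes that a missing part is simply left empty and that a scheme is always followed by "://".

```cpp
#include <string>

// Hypothetical stand-in for the library's ParseResult; the field names
// follow the five parts listed above.
struct UrlParts {
    std::string scheme, netloc, path, query, fragment;
};

// Split a URL into its five parts. Assumption: a missing part stays empty.
UrlParts split_url(const std::string& url) {
    UrlParts parts;
    std::string rest = url;

    // The fragment is everything after the first '#'.
    std::string::size_type pos = rest.find('#');
    if (pos != std::string::npos) {
        parts.fragment = rest.substr(pos + 1);
        rest.erase(pos);
    }

    // The query is everything after the first '?' in what remains.
    pos = rest.find('?');
    if (pos != std::string::npos) {
        parts.query = rest.substr(pos + 1);
        rest.erase(pos);
    }

    // The scheme ends at "://"; the netloc then runs up to the next '/'.
    pos = rest.find("://");
    if (pos != std::string::npos) {
        parts.scheme = rest.substr(0, pos);
        rest.erase(0, pos + 3);
        pos = rest.find('/');
        if (pos != std::string::npos) {
            parts.netloc = rest.substr(0, pos);
            rest.erase(0, pos);
        } else {
            parts.netloc = rest;
            rest.clear();
        }
    }

    // Whatever is left is the path (possibly empty).
    parts.path = rest;
    return parts;
}
```

    Calling split_url("https://evilzone.org/le/path/page.php?id=1&evil=1#important") fills the struct with the same five values as the output above. Stripping the fragment and query first keeps the later searches from tripping over a '/' or "://" inside them.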


string ParseResult::getUrl()
string UrlParse::getUrl(ParseResult parsed_url)
    Reassembles the parsed URL into a single string. An example of its usage:
    [gist]Daxda/2eb81d6c3dac78c4036e[/gist]
    The output:
         https://evilzone.org/le/path/page.php?id=1&evil=1#important
         https://evilzone.org/le/path/page.php?id=1&evil=1#important
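
    Going the other way is mostly string concatenation. The sketch below is an illustration and not the library's implementation: UrlParts and build_url are hypothetical names, and it assumes that empty parts are skipped together with their delimiter.

```cpp
#include <string>

// Hypothetical stand-in for the library's ParseResult.
struct UrlParts {
    std::string scheme, netloc, path, query, fragment;
};

// Rebuild a URL string from its parts.
// Assumption: an empty part is omitted along with its delimiter.
std::string build_url(const UrlParts& p) {
    std::string url;
    if (!p.scheme.empty())   url += p.scheme + ":";
    if (!p.netloc.empty())   url += "//" + p.netloc;
    url += p.path;
    if (!p.query.empty())    url += "?" + p.query;
    if (!p.fragment.empty()) url += "#" + p.fragment;
    return url;
}
```

    Filling the struct with https, evilzone.org, /le/path/page.php, id=1&evil=1 and important reproduces the URL printed above.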


string UrlParse::join(string url, string location)
    Joins a URL with the specified location. An example:
    [gist]Daxda/743e51c4d970d9c6019d[/gist]
    The output:
        https://evilzone.org/level1/level3/page.php
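
    Joining is the trickiest of the four operations, because "." and ".." segments have to be collapsed. The following is a simplified sketch and not the library's code: join_url and remove_dot_segments are hypothetical names, the base is assumed to be absolute (scheme://netloc/...), and network-path locations such as //host/x are not handled. A location without a path (only "?query" or "#fragment") keeps the base path but drops the base's query.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collapse "." and ".." segments in a path that starts with '/'.
// Empty segments (from "//") are dropped as well.
std::string remove_dot_segments(const std::string& path) {
    std::vector<std::string> kept;
    std::istringstream in(path);
    std::string seg, last;
    while (std::getline(in, seg, '/')) {
        last = seg;
        if (seg.empty() || seg == ".") continue;
        if (seg == "..") {
            if (!kept.empty()) kept.pop_back();
        } else {
            kept.push_back(seg);
        }
    }
    // A path ending in "/", "/." or "/.." names a directory: keep the slash.
    const bool is_dir = !path.empty() &&
        (path.back() == '/' || last == "." || last == "..");
    std::string out;
    for (const std::string& s : kept) out += "/" + s;
    if (kept.empty() || is_dir) out += "/";
    return out;
}

std::string join_url(const std::string& base, const std::string& location) {
    const std::string::size_type npos = std::string::npos;
    if (location.empty()) return base;
    // Assumption: anything containing "://" is already absolute.
    if (location.find("://") != npos) return location;
    const std::string::size_type scheme_end = base.find("://");
    if (scheme_end == npos) return location;  // sketch: base must be absolute

    // root is "scheme://netloc"; base_path is the base's path only.
    const std::string::size_type auth_end =
        base.find_first_of("/?#", scheme_end + 3);
    const std::string root = base.substr(0, auth_end);
    std::string base_path = "/";
    if (auth_end != npos && base[auth_end] == '/') {
        const std::string::size_type path_end = base.find_first_of("?#", auth_end);
        base_path = base.substr(auth_end,
                                path_end == npos ? npos : path_end - auth_end);
    }

    // Separate the location's path from any "?query#fragment" suffix.
    const std::string::size_type suffix_pos = location.find_first_of("?#");
    const std::string loc_path = location.substr(0, suffix_pos);
    const std::string suffix =
        suffix_pos == npos ? std::string() : location.substr(suffix_pos);

    std::string merged;
    if (loc_path.empty()) {
        merged = base_path;                      // keep the base path
    } else if (loc_path[0] == '/') {
        merged = loc_path;                       // absolute path replaces it
    } else {
        // Relative path: replace everything after the base's last '/'.
        merged = base_path.substr(0, base_path.rfind('/') + 1) + loc_path;
    }
    return root + remove_dot_segments(merged) + suffix;
}
```

    The inputs in the gist are not shown here, but as one illustration, join_url("https://evilzone.org/level1/level2/", "../level3/page.php") yields the same URL as the output above.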


void UrlParse::defrag(string url, string &new_url, string &fragment)
   Splits the URL at the fragment indicator (#) into two strings, new_url and fragment.
   Example program:
   [gist]Daxda/3845fbdd30538cb32e0b[/gist]
   Output of the program:
        Our URL without fragment: https://evilzone.org/page.php?evil=1
        The removed fragment: overview
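
   The same split can be sketched in a few lines with std::string::find. This is an illustration under assumptions, not the library's code: split_fragment is a hypothetical name, and a URL without '#' is assumed to produce an empty fragment.

```cpp
#include <string>

// Split url at the first '#'. new_url receives the part before it,
// fragment the part after it (empty if there is no '#').
void split_fragment(const std::string& url,
                    std::string& new_url,
                    std::string& fragment) {
    const std::string::size_type pos = url.find('#');
    if (pos == std::string::npos) {
        new_url = url;
        fragment.clear();
        return;
    }
    // Copy both halves before assigning, so the call still works if an
    // output parameter refers to the same string as url.
    const std::string head = url.substr(0, pos);
    const std::string tail = url.substr(pos + 1);
    new_url = head;
    fragment = tail;
}
```

   Passing "https://evilzone.org/page.php?evil=1#overview" gives the two strings shown in the output above.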


Download
Get the full source of the library and an example program from my GitHub repo.


Feedback and critique
Please provide constructive feedback and critique if you have any. I do my best to improve my code, and
I'd love to get multiple opinions about this project.
   
« Last Edit: April 23, 2014, 08:07:40 pm by daxda »

Offline Traitor4000

  • Knight
  • **
  • Posts: 191
  • Cookies: 8
    • View Profile
Re: [C++] Urlparse
« Reply #1 on: April 28, 2014, 03:20:11 am »
Nice job! I looked over the source to get some inspiration to keep learning C++. Thanks.
The most vulnerable part of an impenetrable system is those who believe it to be so.

Offline kenjoe41

  • Symphorophiliac Programmer
  • Administrator
  • Baron
  • *
  • Posts: 990
  • Cookies: 224
    • View Profile
Re: [C++] Urlparse
« Reply #2 on: April 28, 2014, 09:27:41 pm »
Just cloning the repo now. I'm sure this can be useful in a lot of ways, so thanks.
I'll look at the source whenever I get the time and twist your hand with whatever feedback I have.

Have some Ichuraki ramen with menma on top.
If you can't explain it to a 6 year old, you don't understand it yourself.

Offline Stackprotector

  • Administrator
  • Titan
  • *
  • Posts: 2515
  • Cookies: 205
    • View Profile
Re: [C++] Urlparse
« Reply #3 on: May 02, 2014, 09:24:59 am »
Good work !!!! :DDDDD might use it in mah webserver one day.
~Factionwars

Offline bluechill

  • Cybermancer
  • Royal Highness
  • ****
  • Posts: 682
  • Cookies: 344
  • I am the existence in these walls
    • View Profile
Re: [C++] Urlparse
« Reply #4 on: May 03, 2014, 04:11:54 pm »
A couple of things. Never use -1 instead of string::npos when checking the result of find and the like. Also, this could have been done in a couple of lines with C++11 regex, if you want to look into it. Also, your code will fail on URLs such as www.google.com and http://www.google.com#fragment. A URL parser should not fail on those, just saying.
I have dreamed a dream, but now that dream has gone from me.  In its place now exists my own reality, a reality which I have created for myself by myself.
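
To illustrate both suggestions, here is a minimal sketch, not taken from the library. It compares a find result against std::string::npos and splits a URL with the regular expression given in RFC 3986, Appendix B. UrlParts, has_fragment and regex_split are hypothetical names; std::regex needs a standard library with working C++11 regex support.

```cpp
#include <regex>
#include <string>

// Hypothetical result type with the five parts discussed in this thread.
struct UrlParts {
    std::string scheme, netloc, path, query, fragment;
};

// find() returns std::string::npos when nothing is found. Comparing
// against npos states the intent and avoids signed/unsigned surprises
// that a literal -1 can cause.
bool has_fragment(const std::string& url) {
    return url.find('#') != std::string::npos;
}

// Split a URL with the regular expression from RFC 3986, Appendix B.
// Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
bool regex_split(const std::string& url, UrlParts& out) {
    static const std::regex pattern(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    if (!std::regex_match(url, m, pattern)) return false;
    out.scheme   = m[2].str();
    out.netloc   = m[4].str();
    out.path     = m[5].str();
    out.query    = m[7].str();
    out.fragment = m[9].str();
    return true;
}
```

With this pattern, http://www.google.com#fragment splits into scheme http, netloc www.google.com and fragment "fragment", with an empty path and query. A bare www.google.com ends up entirely in the path, since it has neither a scheme nor a leading "//".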

Offline daxda

Re: [C++] Urlparse
« Reply #5 on: May 03, 2014, 05:38:40 pm »
Thanks for your feedback, bluechill. I've been hoping to get some constructive brain juice. Will fix that soon :)

Edit: I tried parsing www.google.com but encountered no problems with it. Can you elaborate on what's wrong with it?
« Last Edit: May 03, 2014, 05:43:00 pm by daxda »

Offline bluechill

Re: [C++] Urlparse
« Reply #6 on: May 04, 2014, 01:45:33 am »
Maybe it's just the one with the fragment. Looking at your code, a URL with a fragment but no query should fail.

Offline daxda

Re: [C++] Urlparse
« Reply #7 on: May 04, 2014, 06:47:42 am »
Alright, I fixed some bugs I encountered. I tested the lib with about 8 different URLs, and so far all were parsed correctly.
The update has been pushed to my repo; feel free to git pull or git clone it now for the improved version.

Offline ArkPhaze

  • Peasant
  • *
  • Posts: 136
  • Cookies: 20
  • null terminated
    • View Profile
Re: [C++] Urlparse
« Reply #8 on: May 06, 2014, 04:39:42 am »
If you're using Windows, I believe WinInet already provides functions that cover the basics. Nice manual work though. :)
« Last Edit: May 06, 2014, 04:43:42 am by ArkPhaze »
sig=: ArkPhaze

[ J/ASM/.NET/C/C++ - Software Engineer ]