Author Topic: [python] write better code  (Read 1049 times)

Offline nrael

  • Peasant
  • *
  • Posts: 66
  • Cookies: -7
    • View Profile
[python] write better code
« on: April 02, 2014, 09:47:22 pm »
Hi,

As I'm learning Python and have some smaller projects, I want to ask you to help me write better code...

first project: linkchecker.py
Code: [Select]
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import re

f = open('urls.txt','r')
a = open('checked.txt','w')


# matches 4xx or 5xx status codes
regex = re.compile('[45][0-9]{2}')

for line in f:
  if not line.startswith('http'):
      print line
      a.write(line)
  else:
    try:
        r = requests.head(line.strip())
        print r.status_code
        print line
        if not regex.match(str(r.status_code)):
            a.write(line)
    except requests.exceptions.ConnectionError as e:
        print("error exception...")
        pass

f.close()
a.close()
« Last Edit: April 11, 2014, 09:21:07 pm by nrael »

Offline kenjoe41

  • Symphorophiliac Programmer
  • Administrator
  • Baron
  • *
  • Posts: 990
  • Cookies: 224
    • View Profile
Re: [python] write better code
« Reply #1 on: April 02, 2014, 10:00:00 pm »
Blotting in C++, I had forgotten how easy Python is.
I am guessing you are using Windows because you didn't put a shebang at the start of the code. It is wiser to add one for portability, since it saves *nix a lot of complaints and tells it where to find the Python binary.
Your variables could be more self-explanatory than 'a' and 'f', though I am okay with 'r' for the requests response; I tend to use 'res'.
I am not sure which mode you opened 'f' in: read, write or both? Careful.
Do you realise you never closed either of the files you opened? That is bad news.
It is good practice to print your exceptions, as that makes debugging easier.

Also, assuming this is Python 3, write more code and read more.
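
A minimal sketch of what printing the exception could look like, assuming Python 3. The raised ConnectionError is only a stand-in for a failing requests.head() call, and describe_failure and the URL are made-up names for illustration:

```python
def describe_failure(url, exc):
    # Include the exception type and its message, not just a fixed string.
    return "error checking %s: %s: %s" % (url, type(exc).__name__, exc)


try:
    # Stand-in for a network call that fails (assumption: no real request is made).
    raise ConnectionError("connection refused")
except ConnectionError as e:
    message = describe_failure("http://example.invalid", e)
    print(message)
```

With the type and message in the output, it is much easier to tell a DNS failure from a refused connection or a timeout.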
« Last Edit: April 02, 2014, 10:04:24 pm by kenjoe41 »
If you can't explain it to a 6 year old, you don't understand it yourself.

Offline nrael

Re: [python] write better code
« Reply #2 on: April 02, 2014, 11:36:02 pm »
Hey, thanks for the answers and your help!

I added #!/usr/bin/env python and the close() for both files (I hadn't realised...).

The default file open mode is read, that's why I hadn't added the 'r' option.
It does support https: it checks for 'http', and 'https' also starts with 'http'. Or what do you mean?

About the HTTP status codes, I'll think about it and edit later.
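
For what it's worth, the prefix check can be tried in isolation. A small sketch, assuming Python 3 (the sample strings are made up); str.startswith also accepts a tuple of prefixes if the schemes should be spelled out:

```python
urls = ["http://example.com", "https://example.com", "httpfoo", "ftp://example.com"]

# 'https://...' begins with 'http', so the plain check accepts both schemes,
# but it also lets through anything else that merely starts with 'http'.
loose = [u for u in urls if u.startswith("http")]

# Stricter variant: require the full scheme separator.
strict = [u for u in urls if u.startswith(("http://", "https://"))]

print(loose)
print(strict)
```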

Offline kenjoe41

Re: [python] write better code
« Reply #3 on: April 03, 2014, 01:02:39 am »
I would suggest that you take a step back and look at this in another way. Don't look at it as solving today's problem and that being the end of it, because you aren't going to come here every day for us to search for your errors and point them out for you.
If it is regex, go read about it and you will probably never have to ask about it again. If it is I/O operations, read about them and their different attributes, and you may never have to ask about those either.
We do it today, but tomorrow you will have to be your own man, and that is going to come from reading and coding a lot, searching, and, when all else fails, asking.

Offline nattewasbeer

  • /dev/null
  • *
  • Posts: 5
  • Cookies: 1
  • h4xh4x
    • View Profile
Re: [python] write better code
« Reply #4 on: April 10, 2014, 08:10:33 pm »
Quote from: nrael
> Hey, thanks for the answers and your help!
>
> I added #!/usr/bin/env python and the close() for both files (I hadn't realised...).

You need to close them before the for loop starts. As it stands, the close() calls don't really change anything: the files would've been closed at the end of the script anyway.
« Last Edit: April 10, 2014, 08:30:56 pm by nattewasbeer »
~~~~~

Offline techb

  • Soy Sauce Feeler
  • Global Moderator
  • King
  • *
  • Posts: 2350
  • Cookies: 345
  • Aliens do in fact wear hats.
    • View Profile
    • github
Re: [python] write better code
« Reply #5 on: April 10, 2014, 08:19:59 pm »
With handling files, it's better to use the "with" statement. That way you don't have to worry about closing handles, even if something crashes, it'll handle closing them.

Example:
Code: (python) [Select]
with open("somefile.txt", "w") as f:
    #do file stuff
    f.write("Hello, World!")
>>>import this
-----------------------------

Offline nrael

Re: [python] write better code
« Reply #6 on: April 11, 2014, 09:27:11 pm »
Updated code...

I've now added the regex check and had to fix the \n issue with line.strip().

About the close(): I can't close them before the for loop, because I want to use them... or how should it look?

And about "with open()...", I'll use it next time I handle files. Would I still need a for loop here to read line by line with "with open"?
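
On that last question: the for loop simply goes inside the "with" block. A rough sketch, assuming Python 3; it writes a small throwaway input file first so it can run on its own, and the temp directory and sample lines are only for illustration:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "urls.txt")
dst_path = os.path.join(workdir, "checked.txt")

# Create a small input file so the example is self-contained.
with open(src_path, "w") as f:
    f.write("http://example.com\nnot a url\n")

# Both files are opened in one statement and closed automatically,
# even if the loop body raises an exception.
with open(src_path) as src, open(dst_path, "w") as dst:
    for line in src:  # still a plain for loop, one line at a time
        dst.write("seen: %s\n" % line.strip())

with open(dst_path) as f:
    result = f.read()
print(result)
```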

Offline nattewasbeer

Re: [python] write better code
« Reply #7 on: April 12, 2014, 06:22:47 am »
I'm guessing you want to check if the urls from urls.txt are working.

Also this code doesn't allow for checking things like 500s, the whole class of 400 error codes, etc. You'd be better off checking against a regex of [23][0-9]{2} to catch everything from 200 to 399, all of which are successful links. Also, why no support for https?

Dafaq?
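
A side note on regexes for status codes: inside square brackets, | is a literal character rather than "or", so a class written as [4|5] would also match a pipe. A quick check, assuming Python 3 (the sample strings are made up):

```python
import re

with_pipe = re.compile('[4|5][0-9]{2}')
without_pipe = re.compile('[45][0-9]{2}')

# Both accept real 4xx/5xx codes.
print(bool(with_pipe.match('404')), bool(without_pipe.match('503')))

# Only the first accepts a leading '|', which is never intended.
print(bool(with_pipe.match('|04')), bool(without_pipe.match('|04')))

# match() only anchors at the start, so trailing characters slip through;
# fullmatch() (Python 3.4+) requires the whole string to match.
print(bool(without_pipe.match('4040')), bool(without_pipe.fullmatch('4040')))
```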

About the code...

+1 for using requests.

Code: (python) [Select]
    except requests.exceptions.ConnectionError as e:
        print("error exception...")
        pass
There is no reason for a pass statement here. It would've passed anyway.

Code: (python) [Select]
f = open('urls.txt','r')
a = open('checked.txt','w')
You are right about not being able to close variable 'a' since you need to write to it. 'f' can be closed straightaway however.
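
A small sketch of what closing 'f' straight away could look like, assuming Python 3: read every line into a list inside a "with" block, and the file is already closed by the time the list is processed. The temp file stands in for urls.txt so the snippet runs on its own:

```python
import os
import tempfile

# Throwaway stand-in for urls.txt (assumption for the example).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("http://example.com\noidnh\n")

with open(path) as f:
    lines = [line.strip() for line in f]

# The handle is closed here; only the in-memory list is needed from now on.
print(f.closed)
print(lines)

os.remove(path)
```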

Code: (python) [Select]
for line in f:
  if not line.startswith('http'):
      print line
      a.write(line)
  else:
:/

Code: (python) [Select]
if not regex.match(str(r.status_code)):
            a.write(line)
A big WTF here. r.status_code is an integer. Why convert it to a string to check whether it is between 400 and 599?
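
Since the status code is already an integer, a plain comparison does the job. A minimal sketch (the helper name is_error_status is made up for illustration):

```python
def is_error_status(status_code):
    # 4xx are client errors, 5xx are server errors.
    return 400 <= status_code <= 599


for code in (200, 301, 404, 503):
    print(code, is_error_status(code))
```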

I came up with this:
Code: (python) [Select]
#!/usr/bin/python
import requests

# Read all lines up front so the input file can be closed straight away.
with open('urls.txt', 'r') as f:
    lines = [z.strip() for z in f]

misformed_uris = [z for z in lines if not z.startswith('http')]
good_uris = [z for z in lines if z.startswith('http')]

headers = {'User-Agent': 'Bla', 'Referer': 'http://www.google.com'}

with open('checked.txt', 'w') as a:
    for uri in misformed_uris:
        a.write('misformed: %s\n' % uri)

    for uri in good_uris:
        try:
            res = requests.head(uri, timeout=4,
                                headers=headers,
                                allow_redirects=True,
                                verify=False)
            # status_code is an integer, so compare it directly.
            if res.status_code == 200:
                a.write('ok: %s\n' % res.url)
            else:
                a.write('err: %s\n' % res.url)
        except requests.exceptions.RequestException:
            # Covers connection errors as well as timeouts.
            a.write('err: %s\n' % uri)

Works fine. If your urls.txt is this
Code: [Select]
oidnh
http://cnn.com
egmewpogh
http://blsdgiosdgh.com
wpoe3
http://evilzone.org

checked.txt will be this:
Code: [Select]
misformed: oidnh
misformed: egmewpogh
misformed: wpoe3
ok: http://edition.cnn.com/
err: http://blsdgiosdgh.com
ok: https://evilzone.org/
« Last Edit: April 12, 2014, 10:00:42 pm by nattewasbeer »
~~~~~