Scripting Languages / Re: [python] write better code
« on: April 12, 2014, 06:22:47 am »
I'm guessing you want to check if the urls from urls.txt are working.
Also this code doesn't allow for checking things like 500s, the whole class of 400 error codes, etc. You'd be better off checking against a regex like [23][0-9]{2} to catch everything from 200 to 399, all of which are working links. Also, why no support for https?
Dafaq?
About the code...
+1 for using requests.
Code: (python)
except requests.exceptions.ConnectionError as e:
    print("error exception...")
    pass
There is no reason for a pass statement here. It would've passed anyway.
Code: (python)
f = open('urls.txt','r')
a = open('checked.txt','w')
You are right about not being able to close 'a' yet, since you still need to write to it. 'f', however, can be closed as soon as its lines have been read.
Code: (python)
for line in f:
    if not line.startswith('http'):
        print line
        a.write(line)
    else:
:/
Code: (python)
if not regex.match(str(r.status_code)):
    a.write(line)
A big WTF here. r.status_code is an integer. Why convert it to a string just to check whether it is between 400 and 599?
I came up with this:
Code: (python)
#!/usr/bin/python
import requests

# Read every line up front so urls.txt can be closed right away.
with open('urls.txt', 'r') as f:
    lines = [z.strip() for z in f]

misformed_uris = [z for z in lines if not z.startswith('http')]
good_uris = [z for z in lines if z.startswith('http')]
headers = {'User-Agent': 'Bla', 'Referer': 'http://www.google.com'}

with open('checked.txt', 'w') as a:
    for z in misformed_uris:
        a.write('misformed: %s\n' % z)
    for uri in good_uris:
        try:
            # HEAD is enough to read the status; verify=False skips TLS certificate checks.
            res = requests.head(uri, timeout=4,
                                headers=headers,
                                allow_redirects=True,
                                verify=False)
            if res.status_code == 200:
                a.write('ok: %s\n' % res.url)
            else:
                # Anything other than 200 after redirects is reported as an error.
                a.write('err: %s\n' % res.url)
        except requests.exceptions.RequestException:
            # Base class of ConnectionError and Timeout, so both are covered.
            a.write('err: %s\n' % uri)
Works fine. If your urls.txt is this:
Code:
oidnh
http://cnn.com
egmewpogh
http://blsdgiosdgh.com
wpoe3
http://evilzone.org
checked.txt will be this:
Code:
misformed: oidnh
misformed: egmewpogh
misformed: wpoe3
ok: http://edition.cnn.com/
err: http://blsdgiosdgh.com
ok: https://evilzone.org/
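
On the status code point: since r.status_code is already an integer, the 200 to 399 range from the quoted suggestion can be checked with a plain comparison instead of a regex. A rough sketch of that idea follows. The helper name link_is_alive is made up for illustration, and counting every 2xx and 3xx response as a working link is an assumption taken from the quote, not something requests defines.

```python
def link_is_alive(status_code):
    """Return True for 2xx and 3xx responses, compared as plain integers."""
    # Hypothetical helper: the 200-399 range is the one suggested in the quoted post.
    return 200 <= status_code < 400


if __name__ == '__main__':
    # No network needed, just a few representative codes.
    for code in (200, 301, 404, 500):
        print('%d -> %s' % (code, link_is_alive(code)))
```

With something like this, the check in the loop could become link_is_alive(res.status_code), with no str() conversion and no regex module involved.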