EvilZone
Programming and Scripting => Scripting Languages => : DeXtreme September 12, 2013, 08:37:06 PM
-
I wrote this script to bruteforce MD5 hashes using multiple processes to lighten the load on each one and on the main process. It works very well for words under five letters, after which the memory of the main process skyrockets and it is killed by the system. Can anyone tell me what I did wrong, or better yet, how to fix it? ;D ;D
P.S. Alert me if you find any redundant code.. debugging was a bitch
# import needed modules
from multiprocessing import Process,Queue,current_process
from itertools import product
import hashlib

# global variables
charset="abcdefghijklmnopqrstuvwxyz"
procnum=3
maxlength=11
word=[]

# initializing the queue and the list of spawned processes
stopq=Queue()
processes=[]

# main function for hashing and comparing
def crackthread(swords,q):
    global hashx
    for i in swords:
        cword="".join(i)
        if hashlib.md5(cword).hexdigest()==hashx:
            print("Found:%s" %(cword))
            q.put("found")
            break
    print current_process().name," ending"

# generates and splits the list of possible permutations
def calc(length):
    global processes
    global stopq
    global procnum
    processes=[]
    cwords=product(charset,repeat=length)
    cwords=list(cwords)
    ports=int(len(cwords)/procnum)
    z=0
    for i in range(0,procnum):
        try:
            p=Process(target=crackthread,args=(cwords[z:z+ports],stopq))
            print "Created",p.name
            processes.append(p)
            z=z+ports
        except:
            p=Process(target=crackthread,args=(cwords[z:-1],stopq))
            processes.append(p)
            break

hashx=raw_input("Hash:")
print "Working"
for i in range(1,maxlength):
    if stopq.empty():
        calc(i)
        print "Starting"
        for p in processes:
            p.start()
        print "Joining"
        for p in processes:
            p.join()
    else:
        break
for p in processes:
    print "Ending"
    p.terminate()
if stopq.empty():
    print "Not Found"
-
Don't ever use threads in Python to speed up a CPU-bound task. It won't work because of the global interpreter lock: no matter how many threads there are, only one of them executes Python bytecode at a time. The only thing you gain is the extra bookkeeping needed to manage the threads.
I don't know if this is the issue you are talking about, but I am reluctant to look at your code, because you need to remove the threading anyway, which might solve the problem if it is related to the thread usage. Rewrite your code, and if you still have the problem, write again.
-
I know threads are slow... that's why I spawned processes instead :/
-
Having weird deja vu with this thread.
-
Alright. Look at what you are doing here:
cwords=product(charset,repeat=length)
cwords=list(cwords)
product creates a generator, which is fine. But right after, you turn it into a list, forcing it to calculate all of the words and hold them in memory at once. And that is a lot of words.
You can get the number of words with a bit of math about the size of the Cartesian product, instead of producing every word beforehand just to count them.
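For example (a minimal sketch of the counting idea): the Cartesian product of a charset repeated `length` times has exactly len(charset) ** length members, so you never need to build the list just to know its size.

```python
# Counting candidates arithmetically: the Cartesian product of a
# 26-letter charset repeated `length` times has 26 ** length members.
charset = "abcdefghijklmnopqrstuvwxyz"

for length in range(1, 6):
    total = len(charset) ** length
    print("length %d: %d candidate words" % (length, total))
# length 5 alone gives 11881376 candidates -- far too many to hold
# in a list, but trivial to count.
```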
--------------------------------------
Edit: I looked it up and yes, it seems processes are not affected by the GIL. Still, you should measure the time, because processes are more heavyweight, and the speed gain might not make up for the overhead of managing them.
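A rough way to measure it (a sketch, single-process baseline only; your multiprocess timing would be compared against this, and the numbers will vary by machine):

```python
import hashlib
import time
from itertools import product

charset = "abcdefghijklmnopqrstuvwxyz"
# hash of "zz" -- the worst case for length 2, so the whole space is scanned
target = hashlib.md5("zz".encode()).hexdigest()

start = time.time()
for combo in product(charset, repeat=2):
    word = "".join(combo)
    if hashlib.md5(word.encode()).hexdigest() == target:
        break
print("single process found %s in %.4f seconds" % (word, time.time() - start))
```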
-
Yeah, thanks.. I realised that too. However, I'm stumped on how to make the generator available to each process.
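One way around it (a sketch, not a drop-in fix for your exact script; the names worker and crack are made up for illustration): don't pass the generator at all. Pass each process a (start, stop) index range, and let each process rebuild product() itself and skip to its own slice with itertools.islice. Nothing big ever gets pickled or held in memory.

```python
from itertools import product, islice
from multiprocessing import Process, Queue
import hashlib

charset = "abcdefghijklmnopqrstuvwxyz"

def worker(length, start, stop, target, q):
    # Rebuild the generator inside the process and slice out our own
    # index range. Note islice still steps through the first `start`
    # items one by one, so skipping costs time -- but no memory.
    for combo in islice(product(charset, repeat=length), start, stop):
        word = "".join(combo)
        if hashlib.md5(word.encode()).hexdigest() == target:
            q.put(word)
            return

def crack(length, target, procnum=3):
    total = len(charset) ** length   # size of the search space, by math
    chunk = total // procnum + 1     # candidates per process
    q = Queue()
    procs = []
    for n in range(procnum):
        lo = n * chunk
        hi = min(lo + chunk, total)
        procs.append(Process(target=worker, args=(length, lo, hi, target, q)))
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return q.get() if not q.empty() else None
```

Calling crack(3, hashlib.md5("cat".encode()).hexdigest()) should spawn three processes and return "cat". On Windows (and any spawn-based platform) the call needs the usual if __name__ == "__main__": guard.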