It's great that you solved it!
However, just for clarity, I would also like to second (and strengthen) white-knight's suggestion of using oclHashCat (cudaHashCat in your case), as GPUs can crack hashes orders of magnitude faster than CPUs.
Do note, however, that the syntax of cudaHashCat, although very similar, is not exactly identical to that of standard HashCat, so certain changes will have to be made.
For instance, 3c439fe413fee66d50d46846b56c022c is the MD5 hash of "AString".
Cracking this with cudaHashCat (this is from Windows, but the syntax is the same), assuming certain knowledge (such as that the original string is 7 characters long), would look like this:
cudaHashcat64.exe -m 0 -w 3 -a 3 680635dee5365c3a0aa55c6dc7bc86db -1 ?l?u?d ?1?1?1?1?1?1?1
-m 0: A value of 0 means the hash is MD5. A table of the values for each type of hash can be found in the hashcat documentation.
-w 3: Maximum performance out of the GPU. With this value my desktop PC starts stuttering badly in Windows (even mouse movements lag) while cracking; 99% of resource usage goes to cudaHashCat.
-a 3: Use a mask attack.
680635dee5365c3a0aa55c6dc7bc86db: The hash to crack. Self-explanatory.
-1 ?l?u?d: Defines custom charset 1. You can set up to 4 custom charsets, and each charset can occupy one or more positions in the mask. As the documentation also shows, ?l stands for lowercase letters, ?u stands for uppercase letters and ?d stands for digits. How this is used is explained below.
?1?1?1?1?1?1?1: The mask, i.e. 7 times the "1" charset. This means that the original string had 7 characters, and that each of those characters can be either a lowercase letter (?l), an uppercase letter (?u) or a digit (?d).
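To get a feel for how much work such a mask represents, here is a minimal Python sketch that counts the candidates a mask generates. The helper names (expand_charset, mask_keyspace) are my own and not part of hashcat, and the sketch only understands ?l, ?u, ?d and the custom ?1 placeholder; it does not handle hashcat's other built-ins or the ?? escape.

```python
import string

# Only the built-in charsets used in this answer (assumption: nothing else is needed).
BUILTIN = {
    "l": string.ascii_lowercase,  # ?l
    "u": string.ascii_uppercase,  # ?u
    "d": string.digits,           # ?d
}


def expand_charset(spec):
    """Expand a custom charset definition such as '?l?u?d' into a set of characters."""
    chars = set()
    i = 0
    while i < len(spec):
        if spec[i] == "?" and i + 1 < len(spec):
            chars.update(BUILTIN[spec[i + 1]])
            i += 2
        else:
            chars.add(spec[i])  # a literal character in the charset definition
            i += 1
    return chars


def mask_keyspace(mask, custom1):
    """Count the candidates produced by a mask that uses ?1 and the built-ins above."""
    custom = expand_charset(custom1)
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            total *= len(custom) if key == "1" else len(BUILTIN[key])
            i += 2
        else:
            i += 1  # a literal character in the mask contributes exactly one choice
    return total


print(mask_keyspace("?1" * 7, "?l?u?d"))  # 62**7 = 3521614606208
```

For the mask as written that is 62^7 candidates. The Progress total in the output further down (3938980639167) happens to equal 63^7, so that particular run seems to have counted one extra character in the charset.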
On my laptop (which is way, way weaker than my desktop), cracking the MD5 hash from above with the mask I mentioned (i.e. we don't iterate from 2 chars to 7 chars, we just do a mask attack on the 7-char space) gave the following results:
Session.Name...: cudaHashcat
Status.........: Cracked
Input.Mode.....: Mask (?1?1?1?1?1?1?1) [7]
Hash.Target....: 680635dee5365c3a0aa55c6dc7bc86db
Hash.Type......: MD5
Time.Started...: Thu Oct 29 11:55:36 2015 (36 secs)
Speed.GPU.#1...: 799.7 MH/s
Recovered......: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.......: 28991029248/3938980639167 (0.74%)
Rejected.......: 0/28991029248 (0.00%)
Restore.Point..: 0/15752961 (0.00%)
HWMon.GPU.#1...: 0% Util, 51c Temp, N/A Fan
Started: Thu Oct 29 11:55:36 2015
Stopped: Thu Oct 29 11:56:13 2015
For reference, the video card is an NVIDIA Quadro K2100M, so nothing too special.
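If you want to double-check that a recovered plaintext really corresponds to the hash you fed in, you can recompute the MD5 yourself. A minimal sketch using Python's standard hashlib module follows; the function names md5_hex and matches are just illustrative, and UTF-8 encoding of the string is an assumption.

```python
import hashlib


def md5_hex(text):
    """Return the hex MD5 digest of a string (assumes UTF-8 encoding)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def matches(candidate, target_hash):
    """Check whether a candidate plaintext hashes to the target (case-insensitive hex)."""
    return md5_hex(candidate) == target_hash.strip().lower()


# Print the digest of the example string so it can be compared with the target hash.
print(md5_hex("AString"))
print(matches("AString", "680635dee5365c3a0aa55c6dc7bc86db"))
```

This is also a quick way to confirm that the hash you pass on the command line is the one you actually intended to crack.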
If you want to do a more serious benchmark, crack an MD5 hash with exactly the same settings on both the GPU and the CPU, only let it iterate over the char spaces (don't fix the length, let it brute-force it), and you'll see MAJOR differences in performance. What takes minutes with the GPU can take hours with the CPU.
The command in oclHashCat, as nrael mentioned above, is basically identical, with one extra parameter:
cudaHashcat64.exe -m 0 -w 3 -a 3 680635dee5365c3a0aa55c6dc7bc86db -1 ?l?u?d ?1?1?1?1?1?1?1 --increment
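As a rough idea of what the incremental run costs compared to the fixed 7-character mask, here is a small Python sketch. It assumes the increment starts at length 1 and ends at the mask length, and that the charset has 62 characters (?l?u?d); the function names are mine, and the speed is simply the laptop figure from the output above, so treat the result as a ballpark only.

```python
def incremental_keyspace(charset_size, max_len, min_len=1):
    """Total candidates when every length from min_len to max_len is tried."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))


def worst_case_seconds(keyspace, hashes_per_second):
    """Time to exhaust the whole keyspace at a constant hash rate."""
    return keyspace / hashes_per_second


total = incremental_keyspace(62, 7)
fixed = 62 ** 7

print("candidates with --increment:", total)
print("extra work vs. fixed length: {:.2%}".format((total - fixed) / fixed))
print("worst case at 799.7 MH/s: {:.0f} minutes".format(
    worst_case_seconds(total, 799.7e6) / 60))
```

The shorter lengths add very little on top of the longest one, which is why letting it iterate is cheap when you are not sure of the exact length.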
So, as you can see, the syntax of HashCat and oclHashCat is almost identical, with small differences. Just be a bit watchful and you'll have it in no time!
Good luck!