Date: Sat, 1 Sep 2012 02:15:15 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: JtR vs. hashcat on /r/crypto

On Thu, Aug 30, 2012 at 02:32:52AM +0400, Alexander Cherepanov wrote:
> On 2012-08-28 23:21, jfoug wrote:
> >> 2. It turns out (was news to me) that hashcat added SunMD5 support
> >> recently (on CPU). According to atom, it does not use SIMD, yet is
> >> faster than ours with SIMD (JimF's unreleased code in magnum-jumbo).
> >> I've asked atom for specific speed numbers, but we might want to do our
> >> own benchmarks as well (Jim?), if we don't mind running the closed-
> >> source hashcat for that. ;-)
> >
> > I have a strong belief the coin flip logic we have (the original sun logic),
> > is where the speedup can be found. Yes, we did remove a %5 in one of the
> > loops. But there still has to be a LOT of optimization left. There is a lot
> > of temp memory usage, and memory movement.
>
> Indeed all those crazy arrays can easily be ditched. Patched is posted
> to john-dev.

Thank you! (For those not on john-dev: you wrote this gives +18%.)

Meanwhile, the mystery may have been solved. r4d1x of team hashcat has
kindly benchmarked hashcat's vs. magnum-jumbo's SunMD5 on his dual Xeon
E5645 machine (12 cores, 24 logical CPUs), JtR built as linux-x86-64i.
(This does not include the "+18%" speedup mentioned above yet.)

Here are some speed numbers, posted with r4d1x' permission (thanks, r4d1x!)

JtR using one core:

*r4d1x* 413 c/s @ 120 seconds
*r4d1x* single thread

hashcat, ditto:

*r4d1x* Speed/sec.: - plains, 103 words
*r4d1x* @ 120 seconds

Both were for length 7, lowercase letters + numbers, running against one
SunMD5 hash (the same one).

JtR MPI build (24 processes):

*r4d1x* # mpirun -n 24 ./john --test --format=sunmd5
*r4d1x* Benchmarking: SunMD5 [128/128 SSE2 intrinsics 12x x576]... (24xMPI) DONE
*r4d1x* Raw: 5276 c/s real, 5276 c/s virtual

hashcat not limited to 1 thread (thus, should be 24 threads):

*r4d1x* Speed/sec.: - plains, 1.74k words

So magnum-jumbo is about 4x faster when using one core, and about 3x
faster when using all logical CPUs (HT partially compensates for non-use
of SIMD in hashcat).

To be fair, I need to note that this is a released version of hashcat
vs. unreleased JtR code (yet publicly available via git). It is
possible that by the time we get around to including our SunMD5 code in
a release, atom puts out a new version of hashcat with similar or better
speedup. ;-) Or maybe not, because SunMD5 is pretty uncommon in the
wild. IIRC, so far we had only one person posting to john-users mention
cracking these hashes during a security audit.

Also, hashcat's built-in multi-threading works very nicely, as compared
to JtR's cumbersome MPI support (e.g., status reporting with MPI is
nastier). We need to implement OpenMP support for SunMD5 hashes (right
now it only exists for the old SunMD5 support on Solaris via the
"generic crypt(3)" format, but that's system-specific and slow). Jim? :-)

Anyhow, SIMD does provide the expected speedup. Thanks, Jim!

To me, our successful SIMD'ing of SunMD5 primarily proves that
data-dependent branching is not such a great idea to defeat this sort of
attacks (as well as GPUs). If done differently, it could actually
mitigate the attacks, but even then it would come with a risk of side
channel leaks - not a good tradeoff, in my opinion, considering that
there are other ways to defeat GPUs (and defeating SIMD is not even a
good goal; it is better to use SIMD for defense).

Alexander
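
For anyone who wants to re-check the "about 4x" and "about 3x" figures, the ratios can be recomputed from the numbers quoted in the message above. This is only a minimal sketch and not part of the original benchmark: the variable names are made up for illustration, "1.74k" is read as 1740, and hashcat's "words" per second is assumed to be comparable to JtR's c/s because both runs attacked a single hash. Note also that the one-core JtR figure comes from a 120-second cracking run while the 24-process figure comes from --test, so the scaling numbers are rough.

```python
# Speeds quoted in the message (r4d1x's dual Xeon E5645 machine).
# Names are illustrative; hashcat "words"/s is assumed comparable to
# JtR c/s because both runs targeted one and the same SunMD5 hash.
jtr_one_core = 413.0         # c/s, one core, 120-second run
hashcat_one_thread = 103.0   # words/s, single thread, 120-second run
jtr_mpi_24 = 5276.0          # c/s, mpirun -n 24 ./john --test
hashcat_all_threads = 1740.0 # words/s, "1.74k" read as 1740 (assumption)

# JtR's advantage over hashcat in each configuration.
single_core_ratio = jtr_one_core / hashcat_one_thread
all_cpus_ratio = jtr_mpi_24 / hashcat_all_threads

# How much each tool gains going from 1 thread to 24 logical CPUs.
jtr_scaling = jtr_mpi_24 / jtr_one_core
hashcat_scaling = hashcat_all_threads / hashcat_one_thread

print(f"JtR vs. hashcat, one core:      {single_core_ratio:.2f}x")
print(f"JtR vs. hashcat, 24 logical:    {all_cpus_ratio:.2f}x")
print(f"JtR scaling, 1 -> 24:           {jtr_scaling:.2f}x")
print(f"hashcat scaling, 1 -> 24:       {hashcat_scaling:.2f}x")
```

Under these assumptions hashcat scales further than JtR when going from one thread to 24 logical CPUs on a 12-core machine, which is consistent with the remark that HT partially compensates for the non-use of SIMD in hashcat.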