Date: Tue, 28 Aug 2012 21:12:11 -0400 (EDT)
From: jfoug@....net
To: john-users@...ts.openwall.com
Subject: Re: JtR vs. hashcat on /r/crypto

On Tue, Aug 28, 2012 at 5:41 PM, Solar Designer wrote:
> On Tue, Aug 28, 2012 at 02:21:07PM -0500, jfoug wrote:
>>> From: Solar Designer [mailto:solar@...nwall.com]
>>>
>>> 2. It turns out (was news to me) that hashcat added SunMD5 support
>>> recently (on CPU). According to atom, it does not use SIMD, yet is
>>> faster than ours with SIMD (JimF's unreleased code in magnum-jumbo).
>>> I've asked atom for specific speed numbers, but we might want to do
>>> our own benchmarks as well (Jim?), if we don't mind running the
>>> closed-source hashcat for that. ;-)
>>
>> I have a strong belief the coin flip logic we have (the original sun
>> logic), is where the speedup can be found. Yes, we did remove a %5 in
>> one of the loops. But there still has to be a LOT of optimization
>> left. There is a lot of temp memory usage, and memory movement. It
>> 'could' be some other factor, but I really think not. This is why I
>> was surprised by only a 3.5x improvement when going to SSE2 code. I
>> expected a much higher rate, since we modify that large buffer so
>> little.
>
> Well, it's 4.4x with XOP, but I wouldn't be surprised by a higher
> speedup (over the original Sun code or equivalent) with further
> optimizations on top of SIMD usage. What surprises me is that atom
> says he achieved greater speed "by not using SIMD".

The SIMD was 'hard', due to having to find, load, process and later
unwind the 2 different sized input buffers. However, since there was
only a 16 byte block that was modified, this wind/unwind actually is
very trivial to do (CPU cost wise).

>> Possibly there is something Atom was able to find, that busted the
>> coinflip, and found some way to compute it in a deterministic (or
>> nearly deterministic) manner.
>
> Even if so, I don't see how that alone would provide more than a ~4x
> speedup without going SIMD. Didn't the original Sun code use the
> non-SIMD MD5 code fairly optimally (well, except for wasting a little
> bit of time on the modulo division and such)?

Not in my opinion. There is a lot of array loading of temp values.
There HAS to be a much better way to work down to that 1 bit.

> Maybe I need to take a closer look at the code myself, but for now
> I'll just wait to see the performance numbers for hashcat's SunMD5.
> Perhaps someone can try it out and post in here?
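
The wind/unwind step mentioned above (placing a small 16 byte block
into SIMD lanes and pulling it back out) can be sketched roughly as
follows. This is only an illustration, not the magnum-jumbo code: it
assumes four 32-bit lanes as with SSE2 and a word-interleaved buffer
layout, and the names wind_block, unwind_block, LANES and WORDS are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4   /* assumption: 4 x 32-bit lanes, as with SSE2 */
#define WORDS 4   /* a 16 byte block is 4 x 32-bit words */

/* "Wind": copy one candidate's 16 byte block into lane `lane` of an
 * interleaved buffer, so word i of every candidate sits side by side. */
static void wind_block(uint32_t simd[WORDS * LANES],
                       const unsigned char block[16], unsigned lane)
{
    uint32_t w[WORDS];

    assert(lane < LANES);
    memcpy(w, block, sizeof(w));          /* avoids unaligned reads */
    for (unsigned i = 0; i < WORDS; i++)
        simd[i * LANES + lane] = w[i];
}

/* "Unwind": extract lane `lane` back into a flat 16 byte block. */
static void unwind_block(unsigned char block[16],
                         const uint32_t simd[WORDS * LANES], unsigned lane)
{
    uint32_t w[WORDS];

    assert(lane < LANES);
    for (unsigned i = 0; i < WORDS; i++)
        w[i] = simd[i * LANES + lane];
    memcpy(block, w, sizeof(w));
}
```

Each direction moves only four 32-bit words per candidate, which is
consistent with the remark that the cost is small when just a 16 byte
block changes between rounds.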