Date: Tue, 22 Nov 2011 13:52:49 -0600
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: SHA1 SSE2i R&D work

>>Good stuff. What's the gain for dcc2?
>>magnum
>
>Still POC. I have only made changes in the easiest of the formats
>(raw-sha1). I have not started on any other format yet.

$ ../run/john -test=5 -form=mscash2
Benchmarking: M$ Cache Hash 2 (DCC2) [SSE2i 8x]... DONE
Raw:    342 c/s

$ ../run/john -test=5 -form=mscash2
Benchmarking: M$ Cache Hash 2 (DCC2) [SSE2i type-2 8x]... DONE
Raw:    506 c/s

So, for the '.c' build, I get about a 48% improvement (342 c/s to 506 c/s).
Again, if we can speed things up with PARA=3, it may be even better.

Also note that the 32 bit .S builds run at about 650 c/s on this machine
(IIRC). However, the speed improvement of intrinsics built with ICC vs GCC
on this machine is significant, so I am very sure that a proper
sse-intrinsic-32.S will significantly outperform the sha1-mmx.S code.

Right now, mscash2 has a single line that can be commented out to switch
the build between the 80 DWORD input buffer layout and the 16 DWORD input
buffer layout. It only takes a few minutes to port a format. Once we are
happy, I figure we will drop the older 80 DWORD arrays for SSE2i, but that
is done by simply deleting a couple of lines. Most of the porting work is
needed to keep the 32 bit sha1-mmx.S code working properly.

I do NOT have any plans to port the sha1-mmx.S code, now that we have ICC
able to build a .S file that works and is comparable in speed (possibly
much faster now, for SHA1). I know it is already faster most of the time to
use SSE2i than the 32 bit asm code. It may be time to move away from that
older hand-coded .S code. A big problem with it is its use of global
variables, and thus no thread safety. SSE2i has no such problem.

Jim.