Date: Sun, 26 May 2013 16:45:20 -0400
From:  <>
Subject: Initial work on cryptsha256 - SSE

Not getting a great improvement yet.  Some, but not as much as I know I can get.  There is a TON of time wasted swapping the data out of big-endian (BE) format, then back into it, then back out again.  I need to work on this some more: set the whole crypt struct to BE 'interleaved' MMX format and simply keep it there, updating the buffers in place where the crypts are written.  I imagine there is close to a 2x improvement still to be gained by getting the data into the proper format and keeping it that way.  But at least this is a start.  I will likely still find bugs in the logic of the code that need to be worked out.  This is simply the first version where I have gotten the self-test to pass, and it comes before any 'real' optimizations.  Also, this is SSE code built with 32-bit cygwin, which only produces code running at about 65% of the speed it should (compared to decent compilers: newer 64-bit GCC, or ICC in 64-bit or 32-bit mode).
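To make the layout idea concrete, here is a minimal sketch, not the actual John the Ripper code. It assumes a little-endian host (x86), 4 candidates per 128-bit vector, and one 64-byte SHA-256 block per candidate. The names `LANES`, `be_index`, `interleave_swap`, `put_bytes` and `demo_layouts_match` are hypothetical and only serve the illustration. The slow path copies a flat buffer and byte-swaps every word; the fast path writes each byte straight to its final interleaved BE position, so no later swap is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 4   /* assumption: 4 candidates per 128-bit SSE2 vector */

/* Byte-swap one 32-bit word. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Byte offset of message byte `pos` of candidate `lane` in an interleaved
 * buffer: word w of all lanes is stored together, and within each word the
 * bytes are reversed so a little-endian load yields the big-endian value. */
static size_t be_index(size_t pos, size_t lane)
{
    return (pos / 4) * (4 * LANES) + lane * 4 + (3 - (pos % 4));
}

/* Slow path: flat per-candidate blocks -> interleaved, swapping each word. */
static void interleave_swap(uint8_t flat[LANES][64], uint32_t out[16 * LANES])
{
    for (size_t lane = 0; lane < LANES; lane++)
        for (size_t w = 0; w < 16; w++) {
            uint32_t v;
            memcpy(&v, &flat[lane][w * 4], 4);
            out[w * LANES + lane] = bswap32(v);
        }
}

/* Fast path: store bytes directly at their final interleaved BE position. */
static void put_bytes(uint8_t *buf, size_t lane, size_t off,
                      const uint8_t *src, size_t len)
{
    assert(off + len <= 64);   /* stay inside one 64-byte block */
    for (size_t i = 0; i < len; i++)
        buf[be_index(off + i, lane)] = src[i];
}

/* Returns 1 if both paths produce the same buffer (little-endian host). */
int demo_layouts_match(void)
{
    uint8_t flat[LANES][64];
    uint32_t slow[16 * LANES];
    uint32_t fast[16 * LANES];

    for (size_t lane = 0; lane < LANES; lane++)
        for (size_t i = 0; i < 64; i++)
            flat[lane][i] = (uint8_t)(lane * 64 + i);

    interleave_swap(flat, slow);
    memset(fast, 0, sizeof fast);
    for (size_t lane = 0; lane < LANES; lane++)
        put_bytes((uint8_t *)fast, lane, 0, flat[lane], 64);

    return memcmp(slow, fast, sizeof slow) == 0;
}
```

If the buffers stay in this layout between rounds, the per-round swap in `interleave_swap` drops out entirely, which is where the hoped-for gain would come from.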

$ ../run/john -test=10 -form=sha256crypt
Benchmarking: sha256crypt, sha256crypt (rounds=5000) [128/128 SSE2 intrinsics 4x]... DONE
Raw:    353 c/s real, 355 c/s virtual

$ ../run/john -test=10 -form=sha256crypt
Benchmarking: sha256crypt, sha256crypt (rounds=5000) [32/32 generic]... DONE
Raw:    209 c/s real, 210 c/s virtual

So at this time, it would likely get about 500 c/s with a better compiler for the SSE intrinsics code, and I am hoping to get it to 800 to 1000 c/s once I have a 'proper' interleaved MMX-COEF / BE format. I am not ready to check this into git just yet.  I am making modifications to sse-intrinsics.c and want to make sure I have them all before checking in.
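As a rough check of these figures, taking the two benchmark results and the 65% compiler estimate at face value: 353 / 0.65 is roughly 543 c/s, and 353 / 209 is roughly a 1.69x speedup over the generic build. The helper names below are hypothetical, and the 0.65 factor is only the estimate quoted in this message, not a measurement.

```c
/* Hypothetical helper: scale a measured rate by the fraction of full
 * speed the compiler is assumed to reach (0.65 for 32-bit cygwin here). */
double projected_cps(double measured_cps, double compiler_efficiency)
{
    return measured_cps / compiler_efficiency;
}

/* Hypothetical helper: speedup of one measured rate over a baseline. */
double speedup(double fast_cps, double base_cps)
{
    return fast_cps / base_cps;
}
```

Calling `projected_cps(353.0, 0.65)` and `speedup(353.0, 209.0)` reproduces the two numbers above.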
