Date: Sun, 1 Jul 2012 0:23:22 -0400
From: <jfoug@....net>
To: john-dev@...ts.openwall.com
Cc: magnum <john.magnum@...hmail.com>
Subject: Re: SHA2 added to bleeding

---- magnum <john.magnum@...hmail.com> wrote:
>
> ...and as implemented, this will only kick in if your OpenSSL version
> does not support SHA-2 at all, so *any* speed is good speed. So
> obviously I think this is very good stuff.

Agreed, but it would be nice to get the same speed as oSSL, because there is quite a bit to gain by exposing the lower levels of the crypt (my code does). oSSL exposes little below crypt_Update().

Having the high-level usability of the oSSL CTX model is very nice. It gives acceptable code speed with ease of implementation. However, exposing the low levels, and then doing things like properly maintaining the buffers and maintaining (or ignoring) the endianness changes, can significantly boost the speed.

A format like sha256/512crypt is a prime example of this (which I will be working on shortly). In this format, we are probably spending 50% of the time in buffer maintenance by using the high-level code of the CTX model. What we should be doing is prepare the 42 buffers (2 * 3 * 7) up front, then simply call the block_crypt for 1 or 2 buffers, take the results from that, and slam them into the next buffer in the right place. If done right, there should be no endianness switching from one iteration to the next.

Now, oSSL does not expose a low enough level to do this. The 'generic' code I am doing does, as will the SSE code (if modeled after our existing MD5/SHA1 intrinsic coding). There is as much performance gain to be had in these formats as there is in porting over to SIMD code. But before we can delve into that, we need to get the generic code running at the same speed as oSSL (or better, or almost as good) on as many different systems as we can (all, ideally).

Jim.