Date: Wed, 21 Mar 2012 06:47:28 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: SSSE3 PSHUFB (was: AMD Bulldozer and XOP)

On Thu, Mar 15, 2012 at 11:46:09AM +0200, Milen Rangelov wrote:
> Actually the quoted improvement percentage was not correct. I did some more
> improvements (like e.g using SSE3 shuffle to speed up the byte order
> reversals in SHA1 and optimizing a bit the early checks).

Regarding "using SSE3 shuffle to speed up the byte order reversals", do
you actually mean SSSE3 (not SSE3) and its PSHUFB instruction?

Is this something you'd possibly be willing to implement and contribute
for JtR as well? Sounds like a good way for you to get involved in our
project more directly. ;-)

BTW, I just realized how very powerful PSHUFB is. It's not just a
shuffle. It's 16 parallel 4-to-4 array lookups, usable e.g. for 16
parallel S-box lookups. It could even compete with bitslice DES, or
even if it'd lose to bitslice DES in terms of speed, it could allow for
a very fast non-bitslice DES or 3DES implementation, where we readily
have 8 6-to-4 S-box lookups (or 32 4-to-4 lookups) to make in just one
instance. It would be usable e.g. to encrypt just one data stream
sequentially while meeting an existing standard, where a bitslice
implementation would not be usable (we have no such task in JtR
currently, but I imagine that it'd be helpful e.g. in some IPSEC
implementation).

We could try it for DES and for Lotus5.

Also, a new KDF built upon PSHUFB would be friendly to recent CPUs with
that instruction, but GPU-unfriendly (until/unless a GPU has something
similar - which I am not aware of). So it could be used to slow down
GPU-based offline attacks.

Thanks,

Alexander
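
The two uses of PSHUFB discussed above (byte order reversal for SHA-1
and 16 parallel 4-to-4 lookups) can be illustrated with a minimal sketch.
To stay portable it uses a scalar model of the instruction rather than
the real thing; actual SSSE3 code would use the _mm_shuffle_epi8
intrinsic from <tmmintrin.h> instead. The names pshufb_model, bswap32x4,
sbox16 and DES_S1_ROW0 are made up for this illustration and are not
taken from JtR or any other code base.

```c
#include <stdint.h>
#include <string.h>

/*
 * Scalar model of SSSE3 PSHUFB (hypothetical helper, for illustration).
 * For each of the 16 byte lanes: if bit 7 of the index byte is set, the
 * result byte is 0; otherwise it is table[index & 15].
 * A temporary is used so that out may alias table or idx.
 */
static void pshufb_model(uint8_t out[16], const uint8_t table[16],
                         const uint8_t idx[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = (idx[i] & 0x80) ? 0 : table[idx[i] & 0x0f];
    memcpy(out, tmp, 16);
}

/*
 * Use 1: reverse the byte order within each of four 32-bit words, as
 * needed when loading SHA-1 message words on a little-endian CPU.
 * Here the data is the "table" and a constant mask supplies the indices.
 */
static void bswap32x4(uint8_t out[16], const uint8_t in[16])
{
    static const uint8_t mask[16] = {
        3, 2, 1, 0,  7, 6, 5, 4,  11, 10, 9, 8,  15, 14, 13, 12
    };
    pshufb_model(out, in, mask);
}

/* First row of DES S-box S1, used here only as an example 4-to-4 table. */
static const uint8_t DES_S1_ROW0[16] = {
    14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7
};

/*
 * Use 2: 16 parallel 4-to-4 lookups. Here the roles are swapped: the
 * constant S-box row is the "table" and the data (16 values in 0..15,
 * one per byte) supplies the indices.
 */
static void sbox16(uint8_t out[16], const uint8_t sbox[16],
                   const uint8_t nibbles[16])
{
    pshufb_model(out, sbox, nibbles);
}
```

A full 6-to-4 DES S-box would take four such lookups (one per row) plus
a selection step based on the remaining two input bits, which is where
the "32 4-to-4 lookups" figure comes from; that selection step is not
shown here.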