Date: Thu, 4 Jun 2015 03:33:09 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: bitslice SHA-256

On Wed, Jun 03, 2015 at 06:57:33PM -0400, Alain Espinosa wrote:
> I recall one thing. Bitslice SHA256 use only bitwise instructions, so
> it provides a big performance improvement on CPUs with only floating
> point bitwise operations, like AVX for example.

That was my thought from a few years ago, but I am not aware of CPUs
where those "floating point" bitwise operations are faster per bit than
their narrower-width "integer" equivalents.

On CPUs with AVX (as tested with our bitslice DES code on Sandy Bridge,
Haswell, and Bulldozer), 256-bit AVX runs at half the speed per
instruction and roughly the same speed per bit as 128-bit AVX. On
Pentium 3, SSE is roughly 3 times slower per instruction and 1.5 times
slower per bit than MMX.

I did not specifically test this on Ivy Bridge and Piledriver, but I
expect them to behave the same as Sandy Bridge and Bulldozer in this
respect. Someone may test to make sure.

As to Haswell, indeed we avoid the issue by using AVX2 there. When
using AVX2, it is roughly the same speed per instruction and twice as
fast per bit as 128-bit AVX, and roughly twice as fast per instruction
and twice as fast per bit as 256-bit AVX.

Alexander
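For readers unfamiliar with the point quoted above, that bitslice SHA-256 needs only bitwise instructions, here is a minimal sketch in C. It is not the John the Ripper code; all names (vec_t, bs32, bs_rotr, bs_ch, bs_add, bs_self_check) are hypothetical, and a plain uint64_t stands in for whatever vector register type (MMX, SSE, AVX, AVX2) would be used in practice. Each 32-bit SHA-256 word is stored as 32 vectors, one per bit position, so a rotate becomes index renaming, Ch is AND/XOR, and a 32-bit addition becomes a ripple-carry adder built from AND/OR/XOR.

```c
#include <stdint.h>

/* Assumption: uint64_t stands in for a wider vector register type. */
typedef uint64_t vec_t;                 /* 64 lanes, one bit per lane      */
typedef struct { vec_t w[32]; } bs32;   /* w[i] holds bit i of every lane  */

/* Store a 32-bit value into one lane (only needed for setup and checks). */
void bs_set_lane(bs32 *x, unsigned lane, uint32_t v) {
    for (unsigned i = 0; i < 32; i++) {
        vec_t bit = (vec_t)((v >> i) & 1u) << lane;
        x->w[i] = (x->w[i] & ~((vec_t)1 << lane)) | bit;
    }
}

/* Read the 32-bit value held in one lane. */
uint32_t bs_get_lane(const bs32 *x, unsigned lane) {
    uint32_t v = 0;
    for (unsigned i = 0; i < 32; i++)
        v |= (uint32_t)((x->w[i] >> lane) & 1u) << i;
    return v;
}

/* Rotate right by r: pure index renaming, no shift instructions on data. */
bs32 bs_rotr(const bs32 *x, unsigned r) {
    bs32 out;
    for (unsigned i = 0; i < 32; i++)
        out.w[i] = x->w[(i + r) % 32];
    return out;
}

/* SHA-256 Ch(e, f, g) = (e AND f) XOR (NOT e AND g), in its 3-op form. */
bs32 bs_ch(const bs32 *e, const bs32 *f, const bs32 *g) {
    bs32 out;
    for (unsigned i = 0; i < 32; i++)
        out.w[i] = g->w[i] ^ (e->w[i] & (f->w[i] ^ g->w[i]));
    return out;
}

/* Addition mod 2^32 as a ripple-carry adder using only AND/OR/XOR. */
bs32 bs_add(const bs32 *a, const bs32 *b) {
    bs32 out;
    vec_t carry = 0;
    for (unsigned i = 0; i < 32; i++) {
        vec_t x = a->w[i] ^ b->w[i];
        out.w[i] = x ^ carry;
        carry = (a->w[i] & b->w[i]) | (carry & x);
    }
    return out;
}

/* Compare all 64 lanes against ordinary scalar arithmetic; 1 on success. */
int bs_self_check(void) {
    bs32 e = {{0}}, f = {{0}}, g = {{0}};
    for (unsigned lane = 0; lane < 64; lane++) {
        bs_set_lane(&e, lane, (uint32_t)(0x9e3779b9u * (lane + 1)));
        bs_set_lane(&f, lane, (uint32_t)(0x85ebca6bu * (lane + 3)));
        bs_set_lane(&g, lane, (uint32_t)(0xc2b2ae35u * (lane + 7)));
    }
    bs32 r = bs_rotr(&e, 6);
    bs32 c = bs_ch(&e, &f, &g);
    bs32 s = bs_add(&r, &c);
    for (unsigned lane = 0; lane < 64; lane++) {
        uint32_t x = bs_get_lane(&e, lane);
        uint32_t y = bs_get_lane(&f, lane);
        uint32_t z = bs_get_lane(&g, lane);
        uint32_t want = ((x >> 6) | (x << 26)) + (z ^ (x & (y ^ z)));
        if (bs_get_lane(&s, lane) != want)
            return 0;
    }
    return 1;
}
```

Because every data operation here is AND, OR, or XOR on whole vec_t values, the same code shape works at any register width, which is why the per-bit (rather than per-instruction) speed of those operations is the figure that matters in the comparison above. Calling bs_self_check() from a small main() is one way to exercise the sketch.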