Date: Sat, 14 Mar 2015 07:03:42 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: interleaved bitslice? (was: bitslice MD*/SHA*, AVX2)

On Thu, Mar 12, 2015 at 08:37:01AM +0100, magnum wrote:
> On 2015-03-11 21:55, Solar Designer wrote:
> > solar@...l:~/md5slice$ gcc md5slice.c -o md5slice -Wall -s -O3 -fomit-frame-pointer -funroll-loops -DVECTOR -march=native
> >
> > This gave "warning: always_inline function might not be inlinable" about
> > FF(), I(), H(), F(), add32r(), add32c(), add32() - but then it built
> > fine. The speed is:
>
> Solar,
>
> While experimenting with this I noticed using a vector size of 32 but
> still compiling for AVX gave a slight boost (~5%). I assume this ends up
> similar to the interleaving we use in Jumbo, and is faster for the same
> reasons.

It might be, yes. However, when I tried that with "gcc version 4.6.3
(Ubuntu/Linaro 4.6.3-1ubuntu5)", it produced scalar code (~10x slower).
I guess this optimization occurs only with newer gcc, perhaps with same
versions of gcc that are AVX2-capable.

> When trying that with a vector size of 64, I trigger an ICE.
>
> md5slice.c: In function 'II.constprop':
> md5slice.c:331:27: internal compiler error: in emit_move_insn, at
> expr.c:3609
>  static MAYBE_INLINE3 void II(a, b, c, d, x, s, ac)
>                            ^
>
> md5slice.c:331:27: internal compiler error: Abort trap: 6
> gcc: internal compiler error: Abort trap: 6 (program cc1)
>
>
> That's with gcc-4.9.2 on OSX and it happens with -mavx2 too. I get a
> similar but not identical ICE on well. Maybe this should be reported.

Yes, it'd be good to report this. Will you, or should I?

Alexander
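
For readers who want to try the vector-size experiment discussed above, here
is a minimal sketch using gcc's generic vector extension. It is not taken
from md5slice.c: the VSIZE macro, the vtype typedef and the bs_* helper names
are all hypothetical, and the adder is an ordinary ripple-carry design that
may differ from the add32() referred to in the thread. Building it with
something like "gcc -O3 -mavx -DVSIZE=32" would mimic the "vector size of 32
but still compiling for AVX" case. How gcc lowers the 32-byte operations in
that case is not established in this thread and may vary between gcc
versions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical knob (not from md5slice.c): bytes per bitslice vector. */
#ifndef VSIZE
#define VSIZE 32
#endif
#define LANES (VSIZE / sizeof(uint32_t))

/* gcc generic vector type: VSIZE bytes viewed as 32-bit lanes. */
typedef uint32_t vtype __attribute__((vector_size(VSIZE)));

/* Fill every lane of *r with the same 32-bit pattern. */
static void bs_splat(vtype *r, uint32_t v)
{
    uint32_t tmp[LANES];
    for (size_t i = 0; i < LANES; i++)
        tmp[i] = v;
    memcpy(r, tmp, sizeof *r);
}

/*
 * Bitslice 32-bit addition: a[i], b[i] and r[i] each hold bit i of many
 * independent 32-bit values, one value per bit of the vector.
 * Plain ripple-carry adder built only from XOR, AND and OR.
 */
static void bs_add32(vtype r[32], const vtype a[32], const vtype b[32])
{
    vtype carry = {0};
    for (int i = 0; i < 32; i++) {
        vtype t = a[i] ^ b[i];
        r[i] = t ^ carry;                    /* sum bit */
        carry = (a[i] & b[i]) | (t & carry); /* carry out */
    }
}

/*
 * Self-check: place the same x and y in every bitslice instance, add them,
 * and compare each bit plane with scalar x + y. Returns 1 on match.
 */
static int bs_add32_check(uint32_t x, uint32_t y)
{
    vtype a[32], b[32], r[32];
    uint32_t lanes[LANES];
    uint32_t sum = x + y;

    for (int i = 0; i < 32; i++) {
        bs_splat(&a[i], ((x >> i) & 1) ? 0xffffffffu : 0u);
        bs_splat(&b[i], ((y >> i) & 1) ? 0xffffffffu : 0u);
    }
    bs_add32(r, a, b);
    for (int i = 0; i < 32; i++) {
        uint32_t want = ((sum >> i) & 1) ? 0xffffffffu : 0u;
        memcpy(lanes, &r[i], sizeof lanes);
        for (size_t j = 0; j < LANES; j++)
            if (lanes[j] != want)
                return 0;
    }
    return 1;
}
```

Rebuilding with -DVSIZE=16 or -DVSIZE=64 changes only the typedef, which
makes it easy to compare the generated code (for example with "gcc -S") or to
check whether a given gcc version falls back to scalar code.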