Date: Wed, 02 Oct 2013 18:56:17 +0200
From: magnum <john.magnum@...hmail.com>
To: "john-dev@...ts.openwall.com" <john-dev@...ts.openwall.com>
Subject: Re: "AVX 4x" instead of "AVX 8x"

On 2013-10-02 09:26, Dhiru Kholia wrote:
> well.openwall.net shows "AVX 8x" when bench-marking the RAKP format.
>
> However, my Haswell system running Fedora 20 shows "AVX 4x". Pretty
> weird, right?
>
> Any insights into what is going on here?

It's not weird at all :) I'll send this to john-dev in case someone else
has missed this.

In the end, the number is how many keys are processed by a single call to
the SSE2 SHA1 function. There is a chain of #ifdefs in x86-64.h (or
x86-sse.h for 32-bit) that chooses SHA1_SSE_PARA (and similar macros for
MD4 and MD5) depending on which compiler you use and which version,
because different compilers perform better at different values.

SHA1_SSE_PARA controls how much interleaving sse-intrinsics.c builds into
the SSE2/AVX/XOP SHA1 function. This interleaving hides latency if the
compiler can schedule instructions well. Intel's icc is very good at this.

So for SHA1_SSE_PARA=1 you get 4x and for SHA1_SSE_PARA=2 you get 8x
(SSE2, AVX and XOP are all 4x [we call that MMX_COEF], and 4 multiplied by
a SHA1_SSE_PARA of 2 gives 8x).

If you try a new compiler (e.g. a future gcc-4.9) you can run
"make testpara" or "make testpara-native" and see what extra #ifdefs we
should use in x86-64.h.

Future AVX2 functions might get an MMX_COEF of 8 instead of 4. Then you'll
get "AVX2 16x" provided you use SHA1_SSE_PARA=2.

magnum
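
The arithmetic behind the benchmark label described in this message can be sketched in C. This is only an illustration, not code from the John the Ripper tree: the macro names MMX_COEF and SHA1_SSE_PARA come from the message, but the default values below and the helper functions keys_per_call() and format_simd_label() are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative defaults only. In the real tree these are selected by
 * compiler-specific #ifdefs in x86-64.h (or x86-sse.h for 32-bit). */
#ifndef MMX_COEF
#define MMX_COEF 4        /* keys per vector: 4 for SSE2, AVX and XOP */
#endif
#ifndef SHA1_SSE_PARA
#define SHA1_SSE_PARA 2   /* how many vectors are interleaved per call */
#endif

/* Keys processed by a single call to the SIMD SHA1 function.
 * Hypothetical helper, not part of John the Ripper. */
static int keys_per_call(int coef, int para)
{
    assert(coef > 0 && para > 0);
    return coef * para;
}

/* Writes a label such as "AVX 8x" into buf and returns the label length.
 * Hypothetical helper, not part of John the Ripper. */
static int format_simd_label(char *buf, size_t len, const char *isa,
                             int coef, int para)
{
    return snprintf(buf, len, "%s %dx", isa, keys_per_call(coef, para));
}
```

With these assumed defaults, format_simd_label(buf, sizeof buf, "AVX", MMX_COEF, SHA1_SSE_PARA) produces "AVX 8x"; a SHA1_SSE_PARA of 1 gives "AVX 4x", and an MMX_COEF of 8 with SHA1_SSE_PARA=2 gives the 16x case mentioned for future AVX2 functions.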