Date: Wed, 1 Apr 2015 12:15:54 +0800
From: Lei Zhang <>
Subject: Re: New SIMD generations, code layout

> On Mar 31, 2015, at 12:09 AM, magnum <> wrote:
> I just made a rough experimental version of raw-sha1-ng with AVX2
> support (not committed). It's definitely worth it. But to the point, a
> question popped up. The code is now loaded with things like this:
> #if __AVX2__
>    __m256i  Z   = _mm256_setzero_si256();
>    __m256i  X   = _mm256_loadu_si256(key);
>    __m256i  B;
>    uint32_t len = _mm256_movemask_epi8(_mm256_cmpeq_epi8(X, Z));
> #else
>    __m128i  Z   = _mm_setzero_si128();
>    __m128i  X   = _mm_loadu_si128(key);
>    __m128i  B;
>    uint32_t len = _mm_movemask_epi8(_mm_cmpeq_epi8(X, Z));
> #endif

I just tried to add MIC support to rawSHA256_ng, but the file is fairly hardcoded for SSE, so I have to write "#ifdef __MIC__ {...}" blocks (like the code above) everywhere. It almost feels like rewriting the whole file: copying the original code and then replacing every occurrence of "_mm256" with "_mm512". I don't think this is the right way to go, and I guess other files that use SSE intrinsics are in more or less the same situation. I'm curious how magnum handled this when adding AVX2 support. Is there a better way that doesn't involve pseudo-intrinsics?

Maybe we can start implementing the pseudo-intrinsics now. Those used in DES_bs_b.c make a good reference, but they are not comprehensive enough. What's your opinion? I can start on this if it's appropriate.
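
To illustrate what such pseudo-intrinsics could look like, here is a minimal sketch in C. It assumes a small set of width-neutral names (vtype, VBYTES, vsetzero, vloadu, vcmpeq_epi8, vmovemask_epi8, and the helper zero_byte_mask), all of which are hypothetical and not taken from the existing tree. Only the AVX2 and SSE2 mappings from the quoted snippet are filled in, plus a plain-C fallback so the sketch compiles elsewhere; a MIC branch would need its own mapping and is not attempted here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical pseudo-intrinsics: one set of names, several widths. */
#if defined(__AVX2__)
#include <immintrin.h>
typedef __m256i vtype;
#define VBYTES 32
#define vsetzero()        _mm256_setzero_si256()
#define vloadu(p)         _mm256_loadu_si256((const __m256i *)(p))
#define vcmpeq_epi8(a, b) _mm256_cmpeq_epi8((a), (b))
#define vmovemask_epi8(a) ((uint32_t)_mm256_movemask_epi8(a))

#elif defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i vtype;
#define VBYTES 16
#define vsetzero()        _mm_setzero_si128()
#define vloadu(p)         _mm_loadu_si128((const __m128i *)(p))
#define vcmpeq_epi8(a, b) _mm_cmpeq_epi8((a), (b))
#define vmovemask_epi8(a) ((uint32_t)_mm_movemask_epi8(a))

#else
/* Plain-C fallback (an assumption, only so the sketch builds anywhere). */
typedef struct { uint8_t b[16]; } vtype;
#define VBYTES 16
static vtype vsetzero(void)
{
    vtype r;
    memset(&r, 0, sizeof r);
    return r;
}
static vtype vloadu(const void *p)
{
    vtype r;
    memcpy(&r, p, sizeof r);
    return r;
}
static vtype vcmpeq_epi8(vtype a, vtype b)
{
    vtype r;
    int i;
    for (i = 0; i < VBYTES; i++)
        r.b[i] = (a.b[i] == b.b[i]) ? 0xFF : 0x00;
    return r;
}
static uint32_t vmovemask_epi8(vtype a)
{
    uint32_t m = 0;
    int i;
    for (i = 0; i < VBYTES; i++)
        m |= (uint32_t)(a.b[i] >> 7) << i;
    return m;
}
#endif

/*
 * The quoted snippet written once, without per-width #if blocks.
 * Reads VBYTES bytes from key and returns a mask in which bit i is
 * set when byte i is zero.
 */
static uint32_t zero_byte_mask(const void *key)
{
    vtype Z = vsetzero();
    vtype X = vloadu(key);

    return vmovemask_epi8(vcmpeq_epi8(X, Z));
}
```

With a header like this, the format source would contain only the width-neutral body, and adding a new SIMD generation would mean adding one more branch to the mapping rather than touching every call site.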

