Date: Mon, 30 Mar 2015 18:09:30 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: New SIMD generations, code layout

I just made a rough experimental version of raw-sha1-ng with AVX2
support (not committed). It's definitely worth it. But to the point,
a question popped up. The code is now loaded with things like this:

#if __AVX2__
	__m256i Z = _mm256_setzero_si256();
	__m256i X = _mm256_loadu_si256(key);
	__m256i B;
	uint32_t len = _mm256_movemask_epi8(_mm256_cmpeq_epi8(X, Z));
#else
	__m128i Z = _mm_setzero_si128();
	__m128i X = _mm_loadu_si128(key);
	__m128i B;
	uint32_t len = _mm_movemask_epi8(_mm_cmpeq_epi8(X, Z));
#endif

There will eventually be another clause for AVX512/AVX3 and, I think, a
separate one for MIC. And sometimes separate details for SSE4.1, XOP and
whatnot.

Is this a sane way to go on? I'm not sure what else to do, but I reckon
we could create pseudo-intrinsics once and for all, like this:

#if __AVX2__
typedef __m256i v_uint;
#define jtr_setzero_si		_mm256_setzero_si256
#define jtr_loadu_si		_mm256_loadu_si256
#define jtr_movemask_epi8	_mm256_movemask_epi8
#define jtr_cmpeq_epi8		_mm256_cmpeq_epi8
(...)
#else
typedef __m128i v_uint;
#define jtr_setzero_si		_mm_setzero_si128
#define jtr_loadu_si		_mm_loadu_si128
#define jtr_movemask_epi8	_mm_movemask_epi8
#define jtr_cmpeq_epi8		_mm_cmpeq_epi8
(...)
#endif

...and then use these pseudo-intrinsics in the main code, which will be
the same for all variants.

This has pros and cons, and I'm not sure I like the idea. But is there
some other good way to approach this "problem", or should we just keep
adding clauses? I'm thinking now is a very good time to decide...

Any ideas or comments are welcome, including from our GSoC candidates
who will be working with this!

magnum
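
For illustration, here is a minimal sketch of how the pseudo-intrinsics idea from the message might look once the main code is written against it. The names v_uint and jtr_* come from the message. Everything else is an assumption made for this sketch and not part of the original code: the function-like macro form (used so the pointer cast lives in one place), the V_BYTES constant, the zero_byte_mask() helper, and the plain-C fallback branch that lets the sketch compile where neither AVX2 nor SSE2 is available.

```c
#include <stdint.h>
#include <string.h>

#if defined(__AVX2__)
#include <immintrin.h>
typedef __m256i v_uint;
#define V_BYTES 32 /* hypothetical: vector width in bytes */
#define jtr_setzero_si()     _mm256_setzero_si256()
#define jtr_loadu_si(p)      _mm256_loadu_si256((const __m256i *)(p))
#define jtr_cmpeq_epi8(a, b) _mm256_cmpeq_epi8((a), (b))
#define jtr_movemask_epi8(a) ((uint32_t)_mm256_movemask_epi8(a))
#elif defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i v_uint;
#define V_BYTES 16
#define jtr_setzero_si()     _mm_setzero_si128()
#define jtr_loadu_si(p)      _mm_loadu_si128((const __m128i *)(p))
#define jtr_cmpeq_epi8(a, b) _mm_cmpeq_epi8((a), (b))
#define jtr_movemask_epi8(a) ((uint32_t)_mm_movemask_epi8(a))
#else
/* Assumed scalar fallback, 16 bytes wide, mimicking the SSE2 semantics. */
typedef struct { uint8_t b[16]; } v_uint;
#define V_BYTES 16
static inline v_uint jtr_setzero_si(void)
{
	v_uint r;
	memset(r.b, 0, sizeof r.b);
	return r;
}
static inline v_uint jtr_loadu_si(const void *p)
{
	v_uint r;
	memcpy(r.b, p, sizeof r.b);
	return r;
}
static inline v_uint jtr_cmpeq_epi8(v_uint a, v_uint b)
{
	v_uint r;
	for (int i = 0; i < 16; i++)
		r.b[i] = (a.b[i] == b.b[i]) ? 0xFF : 0x00;
	return r;
}
static inline uint32_t jtr_movemask_epi8(v_uint a)
{
	uint32_t m = 0;
	for (int i = 0; i < 16; i++)
		m |= (uint32_t)(a.b[i] >> 7) << i; /* top bit of each byte */
	return m;
}
#endif

/*
 * Main code, written once for all variants. Reads V_BYTES bytes from key
 * and returns a mask with bit i set when key[i] is zero (hypothetical
 * helper, modelled on the "len" computation quoted in the message).
 */
static uint32_t zero_byte_mask(const void *key)
{
	v_uint Z = jtr_setzero_si();
	v_uint X = jtr_loadu_si(key);

	return jtr_movemask_epi8(jtr_cmpeq_epi8(X, Z));
}
```

With this layout, a new SIMD generation would add one more clause to the header part, and zero_byte_mask() itself would stay untouched. The trade-off is that any code depending on the vector width has to go through something like V_BYTES rather than assuming 16 or 32.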