Date: Tue, 31 Mar 2015 09:21:51 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: New SIMD generations, code layout


> On Mar 31, 2015, at 12:09 AM, magnum <john.magnum@...hmail.com> wrote:
> 
> #if __AVX2__
> typedef __m256i           v_uint;
> #define jtr_setzero_si    _mm256_setzero_si256
> #define jtr_loadu_si      _mm256_loadu_si256
> #define jtr_movemask_epi8 _mm256_movemask_epi8
> #define jtr_cmpeq_epi8    _mm256_cmpeq_epi8

I think this makes sense. But isn't this pattern already in good use in JtR? I see something like this in DES_bs_b.c:

 #if defined(__AVX__)
 typedef __m256 vtype;
 #define vst(dst, ofs, src)   _mm256_store_ps((float *)((DES_bs_vector *)&(dst) + (ofs)), (src))
 #define vxorf(a, b)  _mm256_xor_ps((a), (b))

These look to me like the pseudo-intrinsics you want. So I guess the problem is that, in JtR, some files use pseudo-intrinsics while others do not? If that is the case, IMHO it would be better to use one unified definition of all the intrinsics, just as you suggested. Otherwise we'll have to put that CPU detection code in every file that uses CPU-specific intrinsics.
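To make the idea concrete, here is a rough sketch of what such a shared header could look like. The names (vtype, vloadu, vcmpeq_epi8, vmovemask_epi8, VEC_BYTES, VEC_ALL_ONES, vec_block_equal) are made up for illustration and are not taken from the JtR tree; the scalar branch is only there so the sketch builds on machines without SSE2.

```c
/* Hypothetical unified pseudo-intrinsics layer (illustrative names only). */
#include <stdint.h>
#include <string.h>

#if defined(__AVX2__)
#include <immintrin.h>
typedef __m256i vtype;
#define VEC_BYTES          32
#define VEC_ALL_ONES       0xFFFFFFFFu
#define vloadu(p)          _mm256_loadu_si256((const __m256i *)(p))
#define vcmpeq_epi8(a, b)  _mm256_cmpeq_epi8((a), (b))
#define vmovemask_epi8(a)  ((uint32_t)_mm256_movemask_epi8(a))
#elif defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i vtype;
#define VEC_BYTES          16
#define VEC_ALL_ONES       0xFFFFu
#define vloadu(p)          _mm_loadu_si128((const __m128i *)(p))
#define vcmpeq_epi8(a, b)  _mm_cmpeq_epi8((a), (b))
#define vmovemask_epi8(a)  ((uint32_t)_mm_movemask_epi8(a))
#else
/* Scalar stand-in so the same calling code still compiles elsewhere. */
typedef struct { uint8_t b[16]; } vtype;
#define VEC_BYTES          16
#define VEC_ALL_ONES       0xFFFFu
static inline vtype vloadu(const void *p)
{
	vtype v;
	memcpy(v.b, p, sizeof(v.b));
	return v;
}
static inline vtype vcmpeq_epi8(vtype a, vtype b)
{
	vtype r;
	for (int i = 0; i < 16; i++)
		r.b[i] = (a.b[i] == b.b[i]) ? 0xFF : 0x00;
	return r;
}
static inline uint32_t vmovemask_epi8(vtype a)
{
	uint32_t m = 0;
	for (int i = 0; i < 16; i++)
		m |= (uint32_t)(a.b[i] >> 7) << i;
	return m;
}
#endif

/*
 * Example of calling code: compares one vector-width block of bytes.
 * Note that it contains no CPU-specific #if at all.
 */
static int vec_block_equal(const void *a, const void *b)
{
	vtype eq = vcmpeq_epi8(vloadu(a), vloadu(b));
	return vmovemask_epi8(eq) == VEC_ALL_ONES;
}
```

With something along these lines, a format file would only include the shared header and write against vtype and the v* names; picking AVX2, SSE2 or the fallback would happen in one place.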


Lei
