Date: Fri, 03 Jul 2015 23:48:16 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: extend SIMD intrinsics

On 2015-07-03 14:13, Lei Zhang wrote:
> I mean we should make it clear which intrinsics are supported
> by all archs, e.g. a list like:
> ------
> vadd
> vand
> vload
> vstore
> vsll
> vsrl
> ...
> ------
>
> Those primitive intrinsics should be available in any decent SIMD
> arch, and can be used portably.
>
> The current situation is that, without such a list, we may risk
> losing portability when writing intrinsics. Imagine that, when I
> implement a format on an AVX2 laptop, I just look at the AVX2 section
> in pseudo_intrinsics.h and find vloadu and vshuffle_epi8 to be in the
> large list of supported intrinsics, so I add them to my code.
> Unfortunately this code won't work when I port it to MIC, because
> those two intrinsics are not in MIC's list.

Ah, yes. We have a common set, of which some are more or less emulated 
on some archs. That emulation may itself use intrinsics that are not 
among the common ones so can't be used outside that header file unless 
we add emulation for all other archs. I agree we should try to make it 
clear which ones are common.

> I think the easiest way to tackle this issue is to, for each arch,
> split the list of supported intrinsics into two parts: one part
> contains primitive intrinsics, and the other contains more "advanced"
> intrinsics. The set of primitive intrinsics is the same for each
> arch and is always portable. The "advanced" intrinsics can be used
> for more optimized code, and need to be wrapped with #ifdefs in user
> code.

Agreed.
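
A rough sketch of how that split could look from the user's side. This
is not the actual pseudo_intrinsics.h: the scalar stand-in "arch", the
feature macro DEMO_HAVE_VROTL and the names vrotl() and demo_rotl() are
all made up for illustration. Only vload, vstore, vadd, vsll and vsrl
come from the list above.

```c
#include <stdint.h>
#include <string.h>

/* Scalar stand-in for one arch section of the header (hypothetical). */
#define VEC_LANES 4
typedef struct { uint32_t v[VEC_LANES]; } vtype;

/* Primitive set: every arch section would provide these. */
static inline vtype vload(const void *p)
{
	vtype r;
	memcpy(&r, p, sizeof(r));
	return r;
}

static inline void vstore(void *p, vtype a)
{
	memcpy(p, &a, sizeof(a));
}

static inline vtype vadd(vtype a, vtype b)
{
	for (int i = 0; i < VEC_LANES; i++)
		a.v[i] += b.v[i];
	return a;
}

static inline vtype vsll(vtype a, int n)
{
	for (int i = 0; i < VEC_LANES; i++)
		a.v[i] <<= n;
	return a;
}

static inline vtype vsrl(vtype a, int n)
{
	for (int i = 0; i < VEC_LANES; i++)
		a.v[i] >>= n;
	return a;
}

/*
 * Advanced set: an arch with a native rotate would define the
 * (hypothetical) macro DEMO_HAVE_VROTL and a vrotl() here.
 * This stand-in arch has none, so user code takes the portable path.
 */

/* User code: rotate each 32-bit lane left by n, for 0 < n < 32. */
static inline vtype demo_rotl(vtype x, int n)
{
#ifdef DEMO_HAVE_VROTL
	return vrotl(x, n);
#else
	/* The two shifted halves share no bits, so vadd acts as OR. */
	return vadd(vsll(x, n), vsrl(x, 32 - n));
#endif
}
```

The point is that the #ifdef lives in one small helper in user code,
and the fallback branch only touches the primitive set, so the format
still builds on an arch that lacks the advanced intrinsic.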

> I think it's no problem using pseudo-intrinsics, but some interfaces
> need redesigning. For example, x86's __m512i is
> element-type-agnostic, but AltiVec is not. 'vector int' and 'vector
> long' are different types in AltiVec, and cannot be used
> interchangeably (unless explicitly cast). Currently we use type
> (__m512i) pervasively in our code, and don't distinguish element
> types when declaring variables. This already caused me headaches when
> incorporating AltiVec intrinsics. Maybe we can define two different
> types, e.g. vtype32 and vtype64.

OK, so we should refactor vtype to vtype32, and add a vtype64. On Intel 
they will be the same. Sounds good to me.
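
Something along these lines, as a sketch only. The function names
(vload32, vadd64 and so on) are placeholders I made up here, and the
non-x86 branch is a plain scalar stand-in with two distinct struct
types, not real AltiVec code. It just mimics the situation where the
32-bit and 64-bit vector types can't be mixed without a cast.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* On x86 the register type is element-agnostic: both map to __m128i. */
typedef __m128i vtype32;
typedef __m128i vtype64;

static inline vtype32 vload32(const uint32_t *p)
{
	return _mm_loadu_si128((const __m128i *)p);
}
static inline void vstore32(uint32_t *p, vtype32 a)
{
	_mm_storeu_si128((__m128i *)p, a);
}
static inline vtype32 vadd32(vtype32 a, vtype32 b)
{
	return _mm_add_epi32(a, b);
}

static inline vtype64 vload64(const uint64_t *p)
{
	return _mm_loadu_si128((const __m128i *)p);
}
static inline void vstore64(uint64_t *p, vtype64 a)
{
	_mm_storeu_si128((__m128i *)p, a);
}
static inline vtype64 vadd64(vtype64 a, vtype64 b)
{
	return _mm_add_epi64(a, b);
}

#else

/* Stand-in for an arch with typed vectors: two distinct types. */
typedef struct { uint32_t v[4]; } vtype32;
typedef struct { uint64_t v[2]; } vtype64;

static inline vtype32 vload32(const uint32_t *p)
{
	vtype32 r;
	memcpy(r.v, p, sizeof(r.v));
	return r;
}
static inline void vstore32(uint32_t *p, vtype32 a)
{
	memcpy(p, a.v, sizeof(a.v));
}
static inline vtype32 vadd32(vtype32 a, vtype32 b)
{
	for (int i = 0; i < 4; i++)
		a.v[i] += b.v[i];
	return a;
}

static inline vtype64 vload64(const uint64_t *p)
{
	vtype64 r;
	memcpy(r.v, p, sizeof(r.v));
	return r;
}
static inline void vstore64(uint64_t *p, vtype64 a)
{
	memcpy(p, a.v, sizeof(a.v));
}
static inline vtype64 vadd64(vtype64 a, vtype64 b)
{
	for (int i = 0; i < 2; i++)
		a.v[i] += b.v[i];
	return a;
}

#endif
```

Code that declares its variables as vtype32 or vtype64 builds the same
way on both branches. In the second branch, passing a vtype64 to
vadd32() is a compile error, which is the kind of mix-up that stays
hidden while everything is one element-agnostic type.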

> I just found that some formats still use raw x86
> intrinsics and some use too advanced intrinsics to be found on
> non-x86 archs. Those all need handling in order to support non-x86
> intrinsics.

There are core files like DES, which use their own kind of pseudo 
intrinsics I think. And there are a few formats in Jumbo that use their 
own stuff, be it pseudo or just stacks of ifdefs. I think it's GOST, 
Blake, Keccak, Scrypt and Pomelo. Pomelo will be replaced with 
Agnieszka's format sooner or later and perhaps Scrypt too. We'll see if 
you get time to have a look at Keccak & co but let's start with the ones 
that use pseudo_intrinsics.h.

magnum
