Date: Thu, 02 Apr 2015 18:05:26 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New SIMD generations, code layout

On 2015-04-02 17:47, Lei Zhang wrote:
> I fixed the MIC intrinsics used in rawSHA256_ng and rawSHA512_ng. Now
> they can build and pass the self-tests.
> 
> rawSHA1_ng seems a bit troublesome because of its use of a hardcoded
> lookup table. The table for AVX2 looks cumbersome enough. I can't
> imagine what the table for MIC would look like if defined the same way.
> I tried to use bit shifts to make up the table, making it look like
> this:
> 
> #define X ((((uint128)0xFFFFFFFFFFFFFFFF)<<64) + 0xFFFFFFFFFFFFFFFF) 
> static const __aligned_simd uint128_t kUsedBytesTable[][4] = {
>     {X<<  0, X<<  0, X<<  0, X<<  0},
>     {X<<  8, X<<  0, X<<  0, X<<  0},
>     {X<< 16, X<<  0, X<<  0, X<<  0},
>     ...
> }
> 
> This looks more compact but still cumbersome. I don't know if there's
> a better way.

Yes those tables are clever but might be excessive even for AVX2. We
might want to rewrite it with some other instructions (for those archs),
even if it's a couple of cycles slower.
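
As a rough illustration of what "no table" could look like, here is a
minimal scalar sketch that builds one mask row at run time for any vector
width. It assumes that row n of kUsedBytesTable has its lowest n bytes set
to 0x00 and all remaining bytes set to 0xFF, which is what the SSE rows
quoted in this thread suggest. The helper name make_used_bytes_mask is
hypothetical, and this is not the rawSHA1_ng code; a real replacement would
more likely use vector compare or shift instructions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of rawSHA1_ng.
 * Fills `words` 32-bit lanes so that the lowest `n` bytes of the vector
 * are 0x00 and every higher byte is 0xFF (assumed table layout).
 */
static void make_used_bytes_mask(uint32_t *out, size_t words, size_t n)
{
    for (size_t w = 0; w < words; w++) {
        size_t first = 4 * w;                    /* index of this lane's lowest byte */
        size_t zero = n > first ? n - first : 0; /* zero bytes falling in this lane */

        if (zero > 4)
            zero = 4;

        /* A full 32-bit shift would be undefined, so handle zero == 4 apart. */
        out[w] = (zero == 4) ? 0 : (uint32_t)(0xFFFFFFFFu << (8 * zero));
    }
}
```

Calling it with words = 4 would cover SSE, 8 would cover AVX2 and 16 would
cover MIC, so the vector width becomes a parameter instead of a separate
hand-written table per architecture.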

> BTW, I have a question on how the lookup table is constructed. In
> kUsedBytesTable, from my observation, each subarray corresponds to a
> SIMD vector and those vectors are consecutively shifted left by one
> byte in order. But in the lower middle of the table, I find a "jump"
> that breaks my observation:
> 
> // for SSE
> static const __aligned_simd uint32_t kUsedBytesTable[][4] = {
> 	...
>         { 0x00000000, 0x00000000, 0xFF000000, 0xFFFFFFFF },
>         { 0x00000000, 0x00000000, 0x00000000, 0xFFFFFF00 },
> 	...
>     };
> 
> The lower subarray is supposed to be shifted left by one byte from
> the upper subarray, but actually it's shifted left by two bytes. I
> don't know if this is a typo or something done intentionally.
> Could you give me some explanation?

I agree it looks like a bug and I can't see any reason it would be
intended. OTOH it's strange that the Test Suite hasn't caught it (it
should test all lengths - I guess it does not). I will investigate.
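
A check along these lines would flag such a jump without waiting for a
failing hash. This is only a sketch: it assumes every row must consist of
some number of low 0x00 bytes followed only by 0xFF bytes, and that each
row has exactly one more zero byte than the row before it. The helper names
low_zero_bytes and first_nonconsecutive_row are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the low 0x00 bytes of a mask row.
 * Returns -1 if the row is not "zero bytes, then only 0xFF bytes".
 */
static int low_zero_bytes(const uint32_t *row, size_t words)
{
    int zeros = 0;
    int seen_ff = 0;

    for (size_t w = 0; w < words; w++) {
        for (int b = 0; b < 4; b++) {
            uint32_t byte = (row[w] >> (8 * b)) & 0xFF;

            if (byte == 0x00 && !seen_ff)
                zeros++;
            else if (byte == 0xFF)
                seen_ff = 1;
            else
                return -1; /* partial byte, or 0x00 above a 0xFF */
        }
    }
    return zeros;
}

/*
 * Hypothetical helper: return the index of the first row that does not
 * have exactly one more zero byte than its predecessor, or -1 if all
 * rows are consecutive.
 */
static int first_nonconsecutive_row(const uint32_t (*table)[4], size_t rows)
{
    for (size_t i = 1; i < rows; i++) {
        int prev = low_zero_bytes(table[i - 1], 4);
        int cur = low_zero_bytes(table[i], 4);

        if (prev < 0 || cur < 0 || cur != prev + 1)
            return (int)i;
    }
    return -1;
}
```

For the two rows quoted above, low_zero_bytes gives 11 and 13, so the
second row is reported as the first non-consecutive one.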

BTW I'm in the process of pseudo-fying sse-intrinsics.c too. The largest
obstacle is that there are LOTS of hard-coded vector sizes in that file :-(
I think I will commit my WIP to a topic branch soon. Everything is in a
state of flux so we can postpone merging your changes.

magnum
