john-dev - Re: SHA-1 H()

Message-ID: <d2ae63d707d0428a64389d0314b76892@smtp.hushmail.com>
Date: Thu, 03 Sep 2015 11:52:47 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: SHA-1 H()

On 2015-09-03 06:56, Solar Designer wrote:
> On Wed, Sep 02, 2015 at 09:31:34PM +0200, magnum wrote:
>> On 2015-09-02 17:52, Solar Designer wrote:
>>> On Wed, Sep 02, 2015 at 06:20:25PM +0300, Solar Designer wrote:
>>>> SHA-1's H() aka F3() is the same as SHA-2's Maj()
>>>
>>> And it turns out that while we appear to be optimally using bitselect()
>>> or vcmov() for Maj(), the fallback expressions that we use vary across
>>> source files and are not always optimal:
>>
>> Perhaps Ch() too:
>>
>> #define Ch(x, y, z) (z ^ (x & (y ^ z)))
>> #define Ch(x, y, z) ((x & y) ^ ( (~x) & z))
>>
>> This is 3 vs. 4 ops, right?
>
> On archs without AND-NOT, yes.  So it's a good find, and I'm happy you
> patched these.
>
> However, on archs with AND-NOT either is 3 ops, and the one with AND-NOT
> has some parallelism.

Maybe the AND-NOT one is better on some GPUs then? I need to test.
Apparently GCN has ANDN and NAND; I'm not sure about NVIDIA. I really hope
we don't need both a '(~x) & z' and a 'z & (~x)' version too. Optimizers are
usually fascinating but sometimes very disappointing.
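
For reference, here is a minimal sketch of the forms being compared, with a check that they compute the same function. The names CH_XOR, CH_ANDN, CH_ANDN_SWAPPED and ch_forms_agree() are made up for this illustration and are not taken from the tree. It only shows equivalence; it says nothing about which form a given compiler will turn into faster code.

```c
#include <assert.h>
#include <stdint.h>

/* XOR form: XOR, AND, XOR. 3 ops on any arch, but a serial chain. */
#define CH_XOR(x, y, z)          ((z) ^ ((x) & ((y) ^ (z))))

/* AND-NOT form: 4 ops without an AND-NOT instruction, 3 with one,
 * and the two AND terms do not depend on each other. */
#define CH_ANDN(x, y, z)         (((x) & (y)) ^ (~(x) & (z)))

/* Same as CH_ANDN with the AND-NOT operands written the other way round. */
#define CH_ANDN_SWAPPED(x, y, z) (((x) & (y)) ^ ((z) & ~(x)))

/* Returns 1 if all three forms agree on every input bit combination.
 * Ch() is bitwise, so the 8 rows of its truth table cover everything;
 * the constants below pack those 8 rows into the low 8 bits. */
static int ch_forms_agree(void)
{
    const uint32_t x = 0xF0u, y = 0xCCu, z = 0xAAu;
    const uint32_t a = CH_XOR(x, y, z);
    const uint32_t b = CH_ANDN(x, y, z);
    const uint32_t c = CH_ANDN_SWAPPED(x, y, z);

    return a == b && b == c;
}
```

Whether the swapped operand order matters at all depends on the optimizer, which is exactly what needs testing.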

> Maybe both forms of emulation need to be kept in pseudo_intrinsics.h
> with a way for us to choose one or the other.  It might happen that the
> optimal choice will vary by arch, CPU, compiler, format.

But if it varies by format, we need to decide outside pseudo_intrinsics.h.
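
One way this could look, as a rough sketch only: the shared header keeps both emulations and a format picks one by defining a macro before including it. CH_USE_ANDN and ch32() are hypothetical names invented for this example; nothing like them exists in pseudo_intrinsics.h as far as this thread says.

```c
#include <stdint.h>

/* Hypothetical knob: a format source file would define CH_USE_ANDN
 * to 1 before including the shared header; the default is the XOR form. */
#ifndef CH_USE_ANDN
#define CH_USE_ANDN 0
#endif

#if CH_USE_ANDN
/* AND-NOT form: 3 ops where AND-NOT exists, with some parallelism. */
#define Ch(x, y, z) (((x) & (y)) ^ (~(x) & (z)))
#else
/* XOR form: 3 ops on archs without AND-NOT. */
#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z))))
#endif

/* Small wrapper so the selected form can be exercised on 32-bit words. */
static uint32_t ch32(uint32_t x, uint32_t y, uint32_t z)
{
    return Ch(x, y, z);
}
```

Either setting gives the same results, so the choice could be made per format purely on benchmark numbers.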

BTW early tests indicate that 5916a57 made SHA-512 very slightly worse 
(but almost hidden by normal variations).

magnum
