Date: Thu, 03 Sep 2015 11:52:47 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: SHA-1 H()

On 2015-09-03 06:56, Solar Designer wrote:
> On Wed, Sep 02, 2015 at 09:31:34PM +0200, magnum wrote:
>> On 2015-09-02 17:52, Solar Designer wrote:
>>> On Wed, Sep 02, 2015 at 06:20:25PM +0300, Solar Designer wrote:
>>>> SHA-1's H() aka F3() is the same as SHA-2's Maj()
>>>
>>> And it turns out that while we appear to be optimally using bitselect()
>>> or vcmov() for Maj(), the fallback expressions that we use vary across
>>> source files and are not always optimal:
>>
>> Perhaps Ch() too:
>>
>> #define Ch(x, y, z) (z ^ (x & (y ^ z)))
>> #define Ch(x, y, z) ((x & y) ^ ( (~x) & z))
>>
>> This is 3 vs. 4 ops, right?
>
> On archs without AND-NOT, yes. So it's a good find, and I'm happy you
> patched these.
>
> However, on archs with AND-NOT either is 3 ops, and the one with AND-NOT
> has some parallelism.

Maybe the and-not one is better on some GPU then? I need to test.
Apparently GCN has ANDN and NAND. Not sure about nvidia.

I really hope we don't need a '(~x) & z' and a 'z & (~x)' version too?
Optimizers are usually fascinating but sometimes very disappointing.

> Maybe both forms of emulation need to be kept in pseudo_intrinsics.h
> with a way for us to choose one or the other. It might happen that the
> optimal choice will vary by arch, CPU, compiler, format.

But if it varies by format, we need to decide outside pseudo_intrinsics.h.

BTW early tests indicate that 5916a57 made SHA-512 very slightly worse
(but almost hidden by normal variations).

magnum
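
For reference, a minimal sketch of the two Ch() fallbacks discussed above, with a build-time switch between them. The names CH_XOR, CH_ANDNOT and CH_USE_ANDNOT are made up for illustration and are not taken from pseudo_intrinsics.h or any other source file in the tree. Because both forms are purely bitwise, checking one set of inputs whose bits cover all eight (x, y, z) combinations is enough to show they compute the same function.

```c
#include <assert.h>
#include <stdint.h>

/* XOR form: XOR, AND, XOR = 3 ops on any arch, but a serial dependency chain */
#define CH_XOR(x, y, z)    ((z) ^ ((x) & ((y) ^ (z))))

/* AND-NOT form: AND, NOT, AND, XOR = 4 ops without an AND-NOT instruction,
 * 3 ops with one; the two AND terms are independent of each other */
#define CH_ANDNOT(x, y, z) (((x) & (y)) ^ (~(x) & (z)))

/* Hypothetical selector (assumption, not an existing macro): a format or
 * arch header would define CH_USE_ANDNOT before this point to pick a form */
#ifdef CH_USE_ANDNOT
#define Ch(x, y, z) CH_ANDNOT(x, y, z)
#else
#define Ch(x, y, z) CH_XOR(x, y, z)
#endif

uint32_t ch_xor(uint32_t x, uint32_t y, uint32_t z)
{
	return CH_XOR(x, y, z);
}

uint32_t ch_andnot(uint32_t x, uint32_t y, uint32_t z)
{
	return CH_ANDNOT(x, y, z);
}

/* Returns 1 if both forms agree. The three constants are the classic
 * truth-table masks, so every bit position pair covers one of the eight
 * possible (x, y, z) input combinations. */
int ch_forms_agree(void)
{
	const uint32_t x = 0xF0F0F0F0u, y = 0xCCCCCCCCu, z = 0xAAAAAAAAu;

	/* Ch is "x ? y : z" per bit, i.e. truth table 0xCA */
	assert(ch_xor(x, y, z) == 0xCACACACAu);
	return ch_xor(x, y, z) == ch_andnot(x, y, z);
}
```

This only shows that the forms are interchangeable. Which one is faster on a given arch, compiler or format still has to be measured, and if the choice does vary by format, the define would have to be set by the format rather than inside the shared header.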