Date: Thu, 03 Sep 2015 01:27:05 +0200
From: magnum <>
Subject: Re: SHA-1 H()

On 2015-09-02 17:52, Solar Designer wrote:
> On Wed, Sep 02, 2015 at 06:20:25PM +0300, Solar Designer wrote:
>> SHA-1's H() aka F3() is the same as SHA-2's Maj()
> And it turns out that while we appear to be optimally using bitselect()
> or vcmov() for Maj(), the fallback expressions that we use vary across
> source files and are not always optimal:
> (...)
> As you can see, some of these use 5 operations instead of 4, and some
> use the parallelism-lacking approach with possibly emulated vcmov().
> I think we should standardize on the parallelism-enabled 4 operation
> expression for when there's no native bitselect() or vcmov() - for both
> SHA-1 and SHA-2 in the same way.

All done (I also changed some 4-op Ch() to 3-op). I mostly see slight
boosts, but some formats may show regressions (the numbers fluctuate).
I have committed this, but it needs more testing, including on hardware
other than my laptop.
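
For readers following along, here is a minimal sketch of the kind of
expressions being discussed. These are the commonly known forms of Maj()
and Ch(), not necessarily the exact ones from the elided grep output or
from the commit. The names bitselect32, maj_5op, maj_4op, maj_bitselect,
ch_4op, ch_3op and count_mismatches are hypothetical, and bitselect32()
only emulates the OpenCL bitselect() semantics in plain C.

```c
#include <stdint.h>

/* Emulated bitselect(a, b, c): bits of b where c is 1, else bits of a. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & ~c) | (b & c);
}

/* Maj() in 5 operations: three ANDs and two XORs. */
static uint32_t maj_5op(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (x & z) ^ (y & z);
}

/* Maj() in 4 operations; (x & y) and (x | y) do not depend on each other. */
static uint32_t maj_4op(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (z & (x | y));
}

/* Maj() as one XOR feeding one bitselect. */
static uint32_t maj_bitselect(uint32_t x, uint32_t y, uint32_t z)
{
	return bitselect32(x, y, z ^ x);
}

/* Ch() in 4 operations: NOT, two ANDs, XOR. */
static uint32_t ch_4op(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (~x & z);
}

/* Ch() in 3 operations: XOR, AND, XOR. */
static uint32_t ch_3op(uint32_t x, uint32_t y, uint32_t z)
{
	return z ^ (x & (y ^ z));
}

/*
 * The low 8 bits of these constants enumerate all 8 combinations of one
 * input bit from each of x, y and z, so a single evaluation acts as a
 * full truth-table check. Returns the number of disagreeing variants.
 */
int count_mismatches(void)
{
	const uint32_t x = 0xF0, y = 0xCC, z = 0xAA;
	const uint32_t maj_ref = maj_5op(x, y, z);
	int bad = 0;

	bad += (maj_4op(x, y, z) != maj_ref);
	bad += (maj_bitselect(x, y, z) != maj_ref);
	bad += (ch_3op(x, y, z) != ch_4op(x, y, z));
	return bad;
}
```

Calling count_mismatches() from a small main() should return 0 if the
variants are equivalent. It says nothing about speed, which is the part
that needs benchmarking per format and per device.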

> A curious aspect is that Maj() is invariant with respect to the ordering
> of its arguments.  We can see it in the grep output above: some of the
> expressions are the same except that they have x, y, z re-ordered in
> different ways.  We could test all 6 possible orderings in different
> contexts (SHA-1 vs. SHA-256 vs. SHA-512, and different OpenCL kernels,
> etc.) and see which is faster where (this might in fact differ).

Definitely, I've seen silly boosts/regressions just from doing that.
It's annoying that the compiler can't figure it out for us - especially
if it turns out that, e.g., different GPUs prefer different orderings.
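
As a rough illustration of the invariance mentioned in the quoted text,
the sketch below evaluates one Maj() fallback with all 6 argument
orderings and checks that the results match. The names maj and
maj_orderings_agree are hypothetical, and the expression is just one
common 4-operation form, assumed here for the example.

```c
#include <stdint.h>

/* One common 4-operation Maj() fallback (assumed for this example). */
static uint32_t maj(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (z & (x | y));
}

/* Returns 1 if all 6 orderings of (a, b, c) give the same Maj() value. */
int maj_orderings_agree(uint32_t a, uint32_t b, uint32_t c)
{
	static const int perm[6][3] = {
		{0, 1, 2}, {0, 2, 1}, {1, 0, 2},
		{1, 2, 0}, {2, 0, 1}, {2, 1, 0}
	};
	const uint32_t v[3] = { a, b, c };
	const uint32_t ref = maj(a, b, c);
	int i;

	for (i = 0; i < 6; i++) {
		if (maj(v[perm[i][0]], v[perm[i][1]], v[perm[i][2]]) != ref)
			return 0;
	}
	return 1;
}
```

The results are identical for every ordering, but the generated code
need not be: which operand is available first in a given round can
change the dependency chain, so timing each ordering per format and per
device is still an empirical exercise.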

> Attached to this message is a program I used to search for possible
> optimized expressions like this.  No new findings from it, but it did
> remind me of the issues I described in these two messages.  I was hoping
> it might find a 2 operation expression for MD5's I(), but no luck.
> It doesn't yet test two bitselect()'s per expression, though - this is
> worth adding and trying again (many possibilities to test there).

Would you care to explain what it does/outputs, or do I need to
reverse-engineer it? I don't quite get it.

