Date: Thu, 03 Sep 2015 21:29:37 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: SHA-1 H()

On 2015-09-03 20:40, Solar Designer wrote:
> On Thu, Sep 03, 2015 at 11:52:47AM +0200, magnum wrote:
>> On 2015-09-03 06:56, Solar Designer wrote:
>>> On Wed, Sep 02, 2015 at 09:31:34PM +0200, magnum wrote:
>>>> #define Ch(x, y, z) (z ^ (x & (y ^ z)))
>>>> #define Ch(x, y, z) ((x & y) ^ ( (~x) & z))
>>>>
>>>> This is 3 vs. 4 ops, right?
>>>
>>> On archs without AND-NOT, yes. So it's a good find, and I'm happy you
>>> patched these.
>> Apparently GCN has ANDN and NAND.
>
> I need to take a fresh look at the arch manual, but in the generated
> code I only see scalar ANDN, and never vector ANDN (nor NAND). They
> defined scalar ANDN presumably because it's so useful for exec masks.
>
> I see you've committed this:
>
> +#if cpu(DEVICE_INFO) || amd_gcn(DEVICE_INFO)
> +#define HAVE_ANDNOT 1
> +#endif
>
> but I think the check for amd_gcn(DEVICE_INFO) is wrong.

We currently never run vectorized on GCN anyway, unless forced by user -
if format supports it at all. But perhaps it should be
(amd_gcn(DEVICE_INFO) && (V_WIDTH < 2)) then?

> And why this change? -
>
> -#if !gpu_nvidia(DEVICE_INFO) || nvidia_sm_5x(DEVICE_INFO)
> +#if !gpu_nvidia(DEVICE_INFO)
>  #define USE_BITSELECT 1
>  #elif gpu_nvidia(DEVICE_INFO)
>  #define OLD_NVIDIA 1
>  #endif

I saw definite speedup for PBKDF2 and RAR iirc, and perhaps md5crypt. But
later I saw contradicting figures for other formats so I'm not sure about
this and things are in a state of flux. It might be that we should revert
to initially setting it (for Maxwell) in opencl_misc.h, and later
conditionally undefine it in certain formats.

Is bitselect() expected to always generate a LOP3.LUT? Even if it is, I
figure the optimizer just might be able to do better when given
bitselect-free code.

Besides all this, I see I introduced a bug: Now OLD_NVIDIA is defined for
Maxwell and that was not the intention. I'll fix that right away.

>> BTW early tests indicate that 5916a57 made SHA-512 very slightly worse
>> (but almost hidden by normal variations).
>
> On what hardware?

AVX and AVX2. My overall feeling is SHA256 got a slight boost while SHA512
did not and sometimes the latter got a very slight regression. But I
haven't really gone systematic yet. All my tests are very inconclusive as
of yet, the fluctuations are larger than the boosts/regressions.

magnum
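
For reference, here is a minimal plain-C sketch of the Ch() forms discussed
above. It is not the code from the tree. The function names (ch_xor,
ch_andnot, bitselect_emul, ch_forms_agree, ch_truth_table) are made up for
illustration, and bitselect_emul() is only a portable stand-in for OpenCL's
bitselect(a, b, c), which takes bits from b where c is set and from a
elsewhere. Because Ch() is purely bitwise, evaluating it once on the
patterns 0xF0, 0xCC and 0xAA covers all eight input combinations.

```c
#include <assert.h>
#include <stdint.h>

/* First form from the thread: XOR, AND, XOR (3 ops). */
static uint32_t ch_xor(uint32_t x, uint32_t y, uint32_t z)
{
    return z ^ (x & (y ^ z));
}

/* Second form: AND, NOT, AND, XOR (4 ops, or 3 where NOT+AND fuse
 * into a single AND-NOT instruction). */
static uint32_t ch_andnot(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) ^ (~x & z);
}

/* Hypothetical portable stand-in for OpenCL bitselect(a, b, c):
 * each result bit comes from b where c is 1, from a where c is 0. */
static uint32_t bitselect_emul(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Returns 1 if all three forms agree on every input combination.
 * 0xF0/0xCC/0xAA enumerate the 8 possible (x, y, z) bit triples. */
static int ch_forms_agree(void)
{
    const uint32_t x = 0xF0, y = 0xCC, z = 0xAA;
    const uint32_t ref = ch_xor(x, y, z);

    return ref == ch_andnot(x, y, z) &&
           ref == bitselect_emul(z, y, x);
}

/* The 8-entry truth table of Ch packed into one byte. */
static uint32_t ch_truth_table(void)
{
    return ch_xor(0xF0, 0xCC, 0xAA) & 0xFF;
}
```

So Ch(x, y, z) corresponds to bitselect(z, y, x), and its truth-table byte
is 0xCA. The point is only that Ch() is a single three-input boolean
function. Whether a given compiler emits one LOP3.LUT for it, from either
bitselect() or the explicit forms, is not something this sketch can tell
you.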