Date: Wed, 9 Sep 2015 03:00:01 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: md5crypt mmxput*()

On 2015-09-08 16:18, Solar Designer wrote:
> On Tue, Sep 08, 2015 at 01:17:14PM +0300, Solar Designer wrote:
>> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
>> Raw: 231424 c/s real, 28928 c/s virtual
>
>> I think further speedup is possible by using a switch statement to make
>> the shift counts into constants (we have an if anyway, we'll just
>> replace it with a switch) like cryptmd5_kernel.cl has.
>
> I cleaned up the code and implemented switch - patch attached.
> It turned out to cause a minor performance regression on bull (due to
> code size growth maybe?) so I am disabling it for XOP and keep the
> performance almost the same as above:
>
> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
> Raw: 231680 c/s real, 28923 c/s virtual

Code size, eh? This reminded me there is a "#pragma GCC optimize 3" in
that file that I always found slightly dubious. We should verify how
each format reacts to dropping that.

Quick test for now, on bull; Enabled the switch for XOP, dropped that
pragma:

Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
Raw: 233472 c/s real, 29184 c/s virtual

The total file size actually increased but that might be other parts
getting larger (though I'm not sure why anything would become larger?).

$ ls -lrt simd-intrinsics*o
-rw-rw-r-- 1 magnum magnum 97968 Sep 9 02:34 simd-intrinsics-bleeding.o
-rw-rw-r-- 1 magnum magnum 98096 Sep 9 02:35 simd-intrinsics-switch.o
-rw-rw-r-- 1 magnum magnum 98320 Sep 9 02:43 simd-intrinsics.o

First is current bleeding, middle is with -O3 and switch (larger), last
one is with -O2 and switch (even larger).

Then again, how about -O2 and *not* using switch?

Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
Raw: 232960 c/s real, 29120 c/s virtual

OK, that's in between.
So before having tested any other format or arch, the pragma should go
and the switch should be used for XOP too.

Apparently Jim added that pragma in 2013 while (I think) adding SHA-2,
likely because Gosney's original code had it.

I will do some testing!

magnum
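
For readers unfamiliar with the "switch to make the shift counts into constants" idea discussed in this thread, here is a minimal scalar sketch in C. It is not the attached patch and not the code in simd-intrinsics.c; the names put_var, put_switch, variants_agree and BUF_WORDS are hypothetical and chosen only for illustration. It assumes a buffer of 32-bit words into which a 32-bit value is ORed at an arbitrary byte offset. With a variable offset the shift counts are only known at run time; with a switch on the offset modulo 4, each case uses literal shift counts, which in SIMD code would presumably let the compiler pick immediate-form shift instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_WORDS 4 /* hypothetical buffer size, for illustration only */

/* Variable shift counts. The "if" is needed anyway: when the offset is
 * word-aligned, the second shift would be by 32 bits, which is
 * undefined behaviour in C. */
static void put_var(uint32_t *buf, unsigned int off, uint32_t v)
{
    unsigned int sh = (off & 3) << 3; /* bit shift within the word */
    uint32_t *p = &buf[off >> 2];     /* first word touched */

    assert(off + 4 <= BUF_WORDS * 4);
    if (sh == 0) {
        p[0] |= v;
    } else {
        p[0] |= v << sh;
        p[1] |= v >> (32 - sh);
    }
}

/* Same operation, but the "if" becomes a switch so that every shift
 * count is a compile-time constant. */
static void put_switch(uint32_t *buf, unsigned int off, uint32_t v)
{
    uint32_t *p = &buf[off >> 2];

    assert(off + 4 <= BUF_WORDS * 4);
    switch (off & 3) {
    case 0: p[0] |= v;                        break;
    case 1: p[0] |= v << 8;  p[1] |= v >> 24; break;
    case 2: p[0] |= v << 16; p[1] |= v >> 16; break;
    case 3: p[0] |= v << 24; p[1] |= v >> 8;  break;
    }
}

/* Returns 1 if both variants produce identical buffers for every
 * offset that fits in the buffer, 0 otherwise. */
static int variants_agree(void)
{
    unsigned int off;

    for (off = 0; off + 4 <= BUF_WORDS * 4; off++) {
        uint32_t a[BUF_WORDS] = {0}, b[BUF_WORDS] = {0};

        put_var(a, off, 0x64636261u);
        put_switch(b, off, 0x64636261u);
        if (memcmp(a, b, sizeof(a)) != 0)
            return 0;
    }
    return 1;
}
```

The switch trades one data-dependent branch for four short code paths, so the object code grows. That is consistent with the size and speed trade-off measured above, though whether it helps on a given CPU has to be benchmarked.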