Date: Sat, 10 Oct 2015 08:35:43 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Cc: Myrice <qqlddg@...il.com>
Subject: Re: 64-bit rotate on AMD GCN

On Sat, Oct 10, 2015 at 07:52:06AM +0300, Solar Designer wrote:
> #define ror(x, n) ((n) < 32 ? (amd_bitalign((uint)((x) >> 32), (uint)(x), (uint)(n)) | ((ulong)amd_bitalign((uint)(x), (uint)((x) >> 32), (uint)(n)) << 32)) : (amd_bitalign((uint)(x), (uint)((x) >> 32), (uint)(n) - 32) | ((ulong)amd_bitalign((uint)((x) >> 32), (uint)(x), (uint)(n) - 32) << 32)))

I've just tried introducing the above revision of ror() into myrice's
xsha512_kernel.cl, which previously used rotate(), and speed went from:

[solar@...er run]$ AMD_OCL_BUILD_OPTIONS_APPEND=-save-temps ./john -test=10 -form=xsha512-opencl -dev=2 -v=4
[...]
Local worksize (LWS) 128, global worksize (GWS) 8388608
DONE
Many salts:     278223K c/s real, 4973M c/s virtual
Only one salt:  56310K c/s real, 72389K c/s virtual

to:

[solar@...er run]$ AMD_OCL_BUILD_OPTIONS_APPEND=-save-temps ./john -test=10 -form=xsha512-opencl -dev=2 -v=4
[...]
Local worksize (LWS) 128, global worksize (GWS) 8388608
DONE
Many salts:     345265K c/s real, 5082M c/s virtual
Only one salt:  58486K c/s real, 72315K c/s virtual

So we should expect 300M+ c/s for raw-sha512 as well (this is also seen
as e.g. "310395000 rounds/s" during auto-tuning for sha512crypt), with
on-GPU mask and hash comparisons when we have those implemented for this
hash type efficiently (I think not yet?)

Similarly to sha512crypt, IL size went way up, and ISA size slightly down:

[solar@...er run]$ ls -l a b
a:
total 840
-rw-------. 1 solar solar   5733 Oct 10 08:13 _temp_0_Tahiti.cl
-rw-------. 1 solar solar   6838 Oct 10 08:13 _temp_0_Tahiti.i
-rw-------. 1 solar solar 163501 Oct 10 08:13 _temp_0_Tahiti.il
-rw-------. 1 solar solar   3445 Oct 10 08:13 _temp_0_Tahiti_kernel_cmp.il
-rw-------. 1 solar solar   5443 Oct 10 08:13 _temp_0_Tahiti_kernel_cmp.isa
-rw-------. 1 solar solar 159665 Oct 10 08:13 _temp_0_Tahiti_kernel_xsha512.il
-rw-------. 1 solar solar 506243 Oct 10 08:13 _temp_0_Tahiti_kernel_xsha512.isa

b:
total 984
-rw-------. 1 solar solar   6038 Oct 10 08:17 _temp_0_Tahiti.cl
-rw-------. 1 solar solar  11310 Oct 10 08:17 _temp_0_Tahiti.i
-rw-------. 1 solar solar 261999 Oct 10 08:17 _temp_0_Tahiti.il
-rw-------. 1 solar solar   3445 Oct 10 08:17 _temp_0_Tahiti_kernel_cmp.il
-rw-------. 1 solar solar   5443 Oct 10 08:17 _temp_0_Tahiti_kernel_cmp.isa
-rw-------. 1 solar solar 258163 Oct 10 08:17 _temp_0_Tahiti_kernel_xsha512.il
-rw-------. 1 solar solar 450253 Oct 10 08:17 _temp_0_Tahiti_kernel_xsha512.isa

[solar@...er run]$ fgrep codeLenInByte [ab]/*.isa
a/_temp_0_Tahiti_kernel_cmp.isa:codeLenInByte        = 172 bytes;
a/_temp_0_Tahiti_kernel_xsha512.isa:codeLenInByte        = 30432 bytes;
b/_temp_0_Tahiti_kernel_cmp.isa:codeLenInByte        = 172 bytes;
b/_temp_0_Tahiti_kernel_xsha512.isa:codeLenInByte        = 30140 bytes;

[solar@...er run]$ fgrep NumVgpr [ab]/*.isa
a/_temp_0_Tahiti_kernel_cmp.isa:NumVgprs             = 3;
a/_temp_0_Tahiti_kernel_xsha512.isa:NumVgprs             = 98;
b/_temp_0_Tahiti_kernel_cmp.isa:NumVgprs             = 3;
b/_temp_0_Tahiti_kernel_xsha512.isa:NumVgprs             = 106;

On a related note, sha512crypt-opencl is now almost same speed as
sha256crypt-opencl on GCN, meaning that there must be lots of room for
improvement in the latter.

sha512crypt-opencl:

gws:    262144     62827   314135000 rounds/s   4.172s per crypt_all()+
Local worksize (LWS) 256, global worksize (GWS) 262144
DONE
Speed for cost 1 (iteration count) of 5000
Raw:    55072 c/s real, 2383K c/s virtual

sha256crypt-opencl:

gws:   1048576     64687   323435000 rounds/s  16.209s per crypt_all()+
Local worksize (LWS) 64, global worksize (GWS) 1048576
DONE
Speed for cost 1 (iteration count) of 5000
Raw:    68089 c/s real, 5518K c/s virtual

Curiously, there's little speed difference between these two seen on
auto-tuning (e.g. 62827 vs. 64687 on final and best lines here), but
more of a difference on final benchmark results.  Also, optimal GWS of
1048576 is very high for a slow hash.

Alexander
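
The split 32-bit form of ror() quoted at the top of this message can be sanity-checked on a CPU before it goes into a kernel. What follows is only a minimal sketch in plain C, not code from the JtR tree: bitalign_emul() is a hypothetical host-side stand-in for amd_bitalign(), written on the assumption that the built-in returns the low 32 bits of ((src0 << 32) | src1) >> (src2 & 31), as the cl_amd_media_ops extension describes it. The names ror64_split(), ror64_ref() and count_ror_mismatches() are likewise made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for amd_bitalign(), assuming the cl_amd_media_ops
 * definition: low 32 bits of ((src0 << 32) | src1) >> (src2 & 31). */
uint32_t bitalign_emul(uint32_t src0, uint32_t src1, uint32_t src2)
{
    return (uint32_t)((((uint64_t)src0 << 32) | src1) >> (src2 & 31));
}

/* Same structure as the quoted ror() macro, written as a function:
 * two 32-bit funnel shifts, with the halves swapped when n >= 32. */
uint64_t ror64_split(uint64_t x, uint32_t n)
{
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t lo = (uint32_t)x;

    if (n < 32)
        return bitalign_emul(hi, lo, n) |
            ((uint64_t)bitalign_emul(lo, hi, n) << 32);
    return bitalign_emul(lo, hi, n - 32) |
        ((uint64_t)bitalign_emul(hi, lo, n - 32) << 32);
}

/* Plain 64-bit rotate right, used as the reference. */
uint64_t ror64_ref(uint64_t x, uint32_t n)
{
    n &= 63;
    return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Compare both versions for every rotate count 0..63 on a few sample
 * values; returns the number of (value, count) pairs that disagree. */
int count_ror_mismatches(void)
{
    static const uint64_t samples[] = {
        0x0000000000000000ULL, 0x0000000000000001ULL,
        0x8000000000000000ULL, 0x0123456789abcdefULL,
        0xffffffff00000000ULL, 0xdeadbeefcafef00dULL
    };
    int mismatches = 0;
    unsigned int i;
    uint32_t n;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        for (n = 0; n < 64; n++)
            if (ror64_split(samples[i], n) != ror64_ref(samples[i], n))
                mismatches++;
    return mismatches;
}
```

Calling count_ror_mismatches() from a small main() should return 0 if the emulation matches the real built-in. Note that the n < 32 test only costs nothing in the kernel when n is a compile-time constant, as it is for the SHA-512 rotate counts; with a run-time n the branch would stay in the generated code.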