Date: Sat, 5 Sep 2015 09:09:12 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: MD4 G()

magnum, Sayantan -

MD4 G() is the same as SHA-2 Maj(), yet we've been using unoptimized
expression for it so far.  The attached patch improves the speed for
pbkdf2-hmac-md4-opencl on Tahiti from:

Local worksize (LWS) 64, global worksize (GWS) 524288
DONE
Speed for cost 1 (iterations) of 1000
Raw:	3994K c/s real, 104857K c/s virtual

to:

Local worksize (LWS) 64, global worksize (GWS) 524288
DONE
Speed for cost 1 (iterations) of 1000
Raw:	4537K c/s real, 94371K c/s virtual

or if I let it auto-tune to higher GWS (which it previously would not):

Local worksize (LWS) 64, global worksize (GWS) 2097152
DONE
Speed for cost 1 (iterations) of 1000
Raw:	4592K c/s real, 125829K c/s virtual

On one core in FX-8120, I got improvement (with the previously posted
patch) from:

Benchmarking: Raw-MD4 [MD4 128/128 XOP 4x2]... DONE
Raw:	36863K c/s real, 36863K c/s virtual

to:

Benchmarking: Raw-MD4 [MD4 128/128 XOP 4x2]... DONE
Raw:	39233K c/s real, 39233K c/s virtual

although some of the speedup, namely to:

Benchmarking: Raw-MD4 [MD4 128/128 XOP 4x2]... DONE
Raw:	37509K c/s real, 37509K c/s virtual

came from enabling use of H2, which was previously disabled for 2x
interleaving.

The new speed of 39233K is finally better than raw-md5's, which is at
most (over several benchmark invocations):

Benchmarking: Raw-MD5 [MD5 128/128 XOP 4x2]... DONE
Raw:	37918K c/s real, 37918K c/s virtual

Yet the difference is surprisingly small, suggesting that there's still
room for speeding up our MD4 on CPU.

It may be worth experimenting with different orderings of x, y, z to
G().  Maybe some of the 6 will result in lower optimal GWS or/and better
performance than others.  (The same applies to SHA-1 and SHA-2.)

nt_kernel.cl and mscash_kernel.cl (any others?) will need separate
patches.  mscash_kernel.cl doesn't even use bitselect() for F(), and
doesn't use rotate().
They should be made to use opencl_md4.h macros.

Alexander

View attachment "john-opencl-md4g.diff" of type "text/plain" (817 bytes)
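
For readers without the attachment at hand, here is a rough sketch of why G() and Maj() can share an optimized expression, and why all 6 argument orderings are interchangeable as far as the result goes.  It is not the contents of john-opencl-md4g.diff.  The helper bitselect32() is a hypothetical plain C stand-in for OpenCL's bitselect(), and the names md4_g_ref(), md4_g_sel() and md4_g_self_check() are made up for this illustration.  The bitselect form shown is one commonly used way to write a majority function, assumed here rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C stand-in for OpenCL bitselect(a, b, c):
 * each result bit comes from b where c has a 1, and from a otherwise. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & ~c) | (b & c);
}

/* MD4 G() as written in RFC 1320: XY v XZ v YZ (bitwise majority). */
static uint32_t md4_g_ref(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (x & z) | (y & z);
}

/* Majority via one XOR and one select: where x and z agree, the
 * majority is x; where they differ, y casts the deciding vote. */
static uint32_t md4_g_sel(uint32_t x, uint32_t y, uint32_t z)
{
	return bitselect32(x, y, z ^ x);
}

/* Checks all 6 argument orderings of the select form against the
 * reference.  The functions are purely bitwise, so the 8 combinations
 * of all-zero and all-one words cover every case.  Returns the number
 * of comparisons performed. */
static int md4_g_self_check(void)
{
	int checked = 0;
	uint32_t i;

	for (i = 0; i < 8; i++) {
		uint32_t x = 0u - (i & 1);
		uint32_t y = 0u - ((i >> 1) & 1);
		uint32_t z = 0u - ((i >> 2) & 1);
		uint32_t want = md4_g_ref(x, y, z);

		assert(md4_g_sel(x, y, z) == want); checked++;
		assert(md4_g_sel(x, z, y) == want); checked++;
		assert(md4_g_sel(y, x, z) == want); checked++;
		assert(md4_g_sel(y, z, x) == want); checked++;
		assert(md4_g_sel(z, x, y) == want); checked++;
		assert(md4_g_sel(z, y, x) == want); checked++;
	}
	return checked;
}
```

Since every ordering yields the same value, picking among them only changes which inputs feed the XOR and which feed the select, so any difference would show up in instruction scheduling and register pressure rather than in the hashes computed.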