Date: Mon, 28 Jan 2013 16:26:04 -0500
From: Brian Wallace <nightstrike9809@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

I'm going to try and replace ror with rotate calls, but it seems to require
some type conversions. I'm doing a bit of reading up on OpenCL dev to fix
any issues and hopefully get more c/s.

On Mon, Jan 28, 2013 at 1:55 PM, magnum <john.magnum@...hmail.com> wrote:

> Brian,
>
> After your OpenCL patch I get these warnings from pwsafe-opencl:
>
> Build log: <program source>:282:36: warning: signed shift result
> (0x200000000) requires 35 bits to represent, but 'int' only has 32 bits
>     w = sigma1( w ) + w + sigma0( 256 );
>                           ^~~~~~~~~~~~~
> <program source>:21:21: note: expanded from macro 'sigma0'
> #define sigma0(x) ((ror(x,7)) ^ (ror(x,18)) ^ (x>>3))
>                     ^
> <program source>:16:33: note: expanded from macro 'ror'
> #define ror(x,n) ((x >> n) | (x << (32-n)))
>                               ~ ^  ~
> <program source>:615:35: warning: signed shift result (0x200000000)
> requires 35 bits to represent, but 'int' only has 32 bits
>     w = sigma1( w ) + w + sigma0( 256 );
>                           ^~~~~~~~~~~~~
> <program source>:21:21: note: expanded from macro 'sigma0'
> #define sigma0(x) ((ror(x,7)) ^ (ror(x,18)) ^ (x>>3))
>                     ^
> <program source>:16:33: note: expanded from macro 'ror'
> #define ror(x,n) ((x >> n) | (x << (32-n)))
>                               ~ ^  ~
>
> It passes self-test though. Even the Test Suite passes IIRC. So maybe this
> is harmless? But we should still get rid of the warnings.
>
> Note that in the bleeding branch, compiler warnings are always shown. In
> unstable, you need to -DREPORT_OPENCL_WARNINGS or -DDEBUG for them to show
> up (as long as there are only warnings).
>
> magnum
>
> On 28 Jan, 2013, at 2:09 , Brian Wallace <nightstrike9809@...il.com>
> wrote:
>
> When I applied the opencl optimization, I only saw minor improvements
> compared to the CUDA improvements. I found that was kind of weird, because
> it was basically the same changes to the code.
>
> On Sun, Jan 27, 2013 at 7:58 PM, magnum <john.magnum@...hmail.com> wrote:
>
>> On 28 Jan, 2013, at 1:41 , Solar Designer <solar@...nwall.com> wrote:
>> > On Sun, Jan 27, 2013 at 07:22:19PM -0500, Brian Wallace wrote:
>> >> Ok, I'll do those changes. I haven't done much cuda/ocl coding in the
>> >> past, so it might take me a short while to get up to speed on what works
>> >> best, although I have a good background in C and hash cracking
>> >> optimization. What kind of benchmarks are we getting on pwsafe-opencl vs
>> >> hashcat?
>> >
>> > Apparently, hashcat's speed is ~500k on HD 7970. hashkill is at ~480k:
>> >
>> > http://twitter.com/gat3way/status/294968226209726464/photo/1
>> >
>> > We're getting 355k:
>> >
>>
>> > (The match of OpenCL and CUDA speed is curious. I did not tune THREADS
>> > and BLOCKS in cuda_pwsafe.h, and was compiling for the default of sm_10.
>> > Perhaps better speed is possible with some tuning.)
>>
>> The OpenCL format currently only auto-tunes local work-size (THREADS) so
>> it too runs at suboptimal conditions. The global work-size defaults to the
>> same figure the CUDA format uses. It does support LWS/GWS environment
>> variables though:
>>
>> $ GWS=$((256*1024)) ../run/john -t -fo:pwsafe-opencl -plat=1
>> OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
>> Device 0: Tahiti (AMD Radeon HD 7900 Series)
>> Local worksize (LWS) 64, Global worksize (GWS) 262144
>> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
>> Raw: 362411 c/s real, 78643K c/s virtual
>>
>> No huge difference though.
>>
>> In bleeding, Claudio has added a shared function for tuning GWS. I
>> haven't had time to try it out yet.
>>
>> magnum
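
The warnings quoted in this thread come from expanding sigma0( 256 ): the
literal 256 has type int, so ror(256,7) evaluates 256 << 25 = 0x200000000 in
signed 32-bit arithmetic. One way to silence them is to force the operand to
an unsigned 32-bit type inside the macro. The following is only a minimal
sketch in plain C, assuming the operand is meant to be a 32-bit word as in
SHA-256; the names ROR32, SIGMA0 and sigma0_of are hypothetical and are not
taken from the pwsafe kernel.

```c
#include <stdint.h>

/* Rotate right on a 32-bit word. The cast keeps the shift unsigned even when
 * the argument is a plain int literal such as 256. Valid for 0 < n < 32. */
#define ROR32(x, n) ((((uint32_t)(x)) >> (n)) | (((uint32_t)(x)) << (32 - (n))))

/* SHA-256 small sigma0, same shape as the macro shown in the build log,
 * but with every use of the argument parenthesized and cast. */
#define SIGMA0(x) (ROR32((x), 7) ^ ROR32((x), 18) ^ (((uint32_t)(x)) >> 3))

/* Function wrapper (hypothetical name) so the macro is easy to check. */
static uint32_t sigma0_of(uint32_t x)
{
    return SIGMA0(x);
}
```

With these definitions sigma0_of(256) evaluates to 0x00400022, and the bits
shifted past bit 31 are discarded by well-defined unsigned arithmetic. In
OpenCL C the built-in rotate(x, n) rotates left and expects both arguments to
have the same type, so a right rotation by n could be written as
rotate(x, 32U - n) with x of type uint. That may be the type conversion
mentioned at the top of this message, though the thread does not show the
final patch.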