Date: Mon, 28 Jan 2013 01:58:48 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

On 28 Jan, 2013, at 1:41 , Solar Designer <solar@...nwall.com> wrote:
> On Sun, Jan 27, 2013 at 07:22:19PM -0500, Brian Wallace wrote:
>> Ok, I'll do those changes. I haven't done much cuda/ocl coding in the
>> past, so it might take me a short while to get up to speed on what works
>> best, although I have a good background in C and hash cracking
>> optimization. What kind of benchmarks are we getting on pwsafe-opencl vs
>> hashcat.
>
> Apparently, hashcat's speed is ~500k on HD 7970. hashkill is at ~480k:
>
> http://twitter.com/gat3way/status/294968226209726464/photo/1
>
> We're getting 355k:
>
> (The match of OpenCL and CUDA speed is curious. I did not tune THREADS
> and BLOCKS in cuda_pwsafe.h, and was compiling for the default of sm_10.
> Perhaps better speed is possible with some tuning.)

The OpenCL format currently only auto-tunes the local work-size (THREADS),
so it too runs under suboptimal conditions. The global work-size defaults
to the same figure the CUDA format uses. It does support the LWS/GWS
environment variables though:

$ GWS=$((256*1024)) ../run/john -t -fo:pwsafe-opencl -plat=1
OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
Device 0: Tahiti (AMD Radeon HD 7900 Series)
Local worksize (LWS) 64, Global worksize (GWS) 262144
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw: 362411 c/s real, 78643K c/s virtual

No huge difference though. In bleeding, Claudio has added a shared
function for tuning GWS. I haven't had time to try it out yet.

magnum
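
For anyone who wants to try other GWS values by hand, here is a minimal sketch of a sweep built around the command above. It is not the shared tuning function in bleeding. The helper names gws_candidates and sweep_gws, and the list of multipliers, are made up for illustration; only the GWS variable, the john path and the options come from the message.

```shell
#!/bin/bash
# Hypothetical helper: print candidate GWS values, each a multiple of
# 1024 work-items. 256*1024 is the value used in the message; the other
# multipliers are arbitrary picks for this sketch.
gws_candidates() {
    local k
    for k in 64 128 256 512; do
        echo $((k * 1024))
    done
}

# Hypothetical helper: run the benchmark once per candidate and print the
# GWS next to the "Raw:" line of the output. The default binary path and
# the options are copied from the command in the message.
sweep_gws() {
    local john="${1:-../run/john}" gws
    for gws in $(gws_candidates); do
        printf 'GWS=%s  ' "$gws"
        GWS="$gws" "$john" -t -fo:pwsafe-opencl -plat=1 2>/dev/null \
            | grep 'Raw:' || echo '(no Raw: line found)'
    done
}
```

Sourcing the file and calling sweep_gws (optionally with the path to john as the first argument) would print one line per candidate, which makes it easy to see whether the c/s figure moves at all with GWS.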