Date: Mon, 28 Jan 2013 01:58:48 +0100
From: magnum <>
Subject: Re: Proposed optimizations to pwsafe

On 28 Jan, 2013, at 1:41 , Solar Designer <> wrote:
> On Sun, Jan 27, 2013 at 07:22:19PM -0500, Brian Wallace wrote:
>> Ok, I'll do those changes.  I haven't done much cuda/ocl coding in the
>> past, so it might take me a short while to get up to speed on what works
>> best, although I have a good background in C and hash cracking
>> optimization.  What kind of benchmarks are we getting on pwsafe-opencl vs
>> hashcat?
> Apparently, hashcat's speed is ~500k on HD 7970.  hashkill is at ~480k:
> We're getting 355k:

> (The match of OpenCL and CUDA speed is curious.  I did not tune THREADS
> and BLOCKS in cuda_pwsafe.h, and was compiling for the default of sm_10.
> Perhaps better speed is possible with some tuning.)

The OpenCL format currently auto-tunes only the local work-size (THREADS), so it too runs under suboptimal conditions. The global work-size defaults to the same figure the CUDA format uses. It does support the LWS and GWS environment variables, though:

$ GWS=$((256*1024)) ../run/john -t -fo:pwsafe-opencl -plat=1
OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
Device 0: Tahiti (AMD Radeon HD 7900 Series)
Local worksize (LWS) 64, Global worksize (GWS) 262144
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw:    362411 c/s real, 78643K c/s virtual

No huge difference, though: about 362k c/s against the 355k quoted above.
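
For anyone who wants to try a few more sizes, here is a rough sketch of a GWS sweep around the value used above. The john path, the JOHN override variable and the list of sizes are my assumptions, not anything the format requires; -plat=1 is simply what selects the AMD platform on this box, so adjust it for yours.

```shell
#!/bin/sh
# Sketch only: benchmark pwsafe-opencl at a few global work-sizes.
# JOHN is a hypothetical override for the binary location (assumption).
JOHN=${JOHN:-../run/john}

# Print candidate GWS values, one per line: 64K, 128K, 256K and 512K work-items.
# The choice of sizes is arbitrary; 256*1024 matches the run shown above.
gws_candidates() {
    for k in 64 128 256 512; do
        echo $((k * 1024))
    done
}

for gws in $(gws_candidates); do
    echo "== GWS=$gws =="
    if [ -x "$JOHN" ]; then
        # Same invocation as above, with only GWS varied.
        GWS=$gws "$JOHN" -t -fo:pwsafe-opencl -plat=1
    else
        echo "john not found at $JOHN, skipping"
    fi
done
```

Compare the "Raw:" lines between runs. LWS can be pinned the same way if you want to take the auto-tuning out of the picture.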

In bleeding, Claudio has added a shared function for tuning GWS. I haven't had time to try it out yet.

