Date: Tue, 31 Mar 2015 04:10:04 -0700
From: <epixoip@...dshell.nl>
To: <john-dev@...ts.openwall.com>
Subject: RE: [GSoC] John the Ripper support for PHC finalists

> -----Original Message-----
> From: Jeremi M Gosney
> Sent: Tuesday, March 31, 2015 03:46
> 
> > -----Original Message-----
> > From: magnum
> > Sent: Monday, March 30, 2015 02:51
> >
> > On 2015-03-30 11:11, magnum wrote:
> > > Also, your auto-tune settings are totally wrong, possibly ending up in
> > > suboptimal work sizes. I will fix them and submit a patch.
> >
> > This patch fixes the auto-tune problems. BTW with my other suggestions
> > it also runs fine on the GTX Titan, at about same ~10K c/s speed as on
> > HD7970. And now it also runs on my Macbook GPU (1290 c/s).
> 
> 
> I had to change "char build_opts[64]" to "char build_opts[64] = {0};" in init() in
> order to get this to run, otherwise it would fail with the following error:
> 
> OpenCL error (CL_INVALID_DEVICE) in file (common-opencl.c) at line (969) -
> (Error while getting build info I)
> 
> For some reason I'm seeing half the speeds you're seeing on 290X.
> 
> epixoip@...en:~/pomelo/JohnTheRipper/run$ ./john -te -form:pomelo-opencl
> Device 0: Hawaii [AMD Radeon R9 200 Series]
> Local worksize (LWS) 64, global worksize (GWS) 1024
> Benchmarking: pomelo-opencl, POMELO [POMELO OpenCL (inefficient, development use only)]... DONE
> Raw:    5535 c/s real, 614400 c/s virtual
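
A note on the build_opts change quoted above: one plausible explanation, assuming init() appends to or reads the buffer before every byte of it has been written, is that an uninitialized stack array holds garbage and may lack a terminating NUL. The sketch below is not the actual John the Ripper code; append_build_opt() is a hypothetical helper that only illustrates why starting from "= {0}" makes appending safe.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from John the Ripper): append one compiler
 * option to a build-options buffer. It relies on buf already holding a
 * valid NUL-terminated string, which is what "= {0}" guarantees.
 */
static size_t append_build_opt(char *buf, size_t size, const char *opt)
{
    size_t used = strlen(buf);   /* undefined behavior if buf is garbage */

    if (used >= size)
        return used;             /* no room left, leave buf untouched */

    /* Separate options with a single space, except for the first one. */
    snprintf(buf + used, size - used, "%s%s", used ? " " : "", opt);
    return strlen(buf);
}
```

With "char build_opts[64] = {0};" the first strlen() call returns 0 and the options are built from an empty string. Without the initializer, the same call reads indeterminate bytes.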


(Sorry, I think I sent the previous email from an unsubscribed address.)

This looks to be an autotune issue. With some manual testing, LWS 17 GWS 1020 seems to be ideal for 290X.

epixoip@...en:~/pomelo/JohnTheRipper/run$ LWS=17 ./john -te -form:pomelo-opencl
Device 0: Hawaii [AMD Radeon R9 200 Series]
Local worksize (LWS) 17, global worksize (GWS) 1020
Benchmarking: pomelo-opencl, POMELO [POMELO OpenCL (inefficient, development use only)]... DONE
Raw:    12750 c/s real, 663000 c/s virtual
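
Note that the GWS reported here is 1020 rather than 1024: 1020 is the largest multiple of 17 that does not exceed 1024, and OpenCL 1.x requires the global work size to be evenly divisible by the local work size. A minimal sketch of that rounding, assuming this is how the work size gets adjusted (round_gws_to_lws() is a hypothetical name, not a function from the John the Ripper source):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: round a requested global work size down to the
 * nearest multiple of the local work size, never going below one
 * full work-group.
 */
static size_t round_gws_to_lws(size_t gws, size_t lws)
{
    assert(lws > 0);

    if (gws < lws)
        return lws;              /* at least one work-group */

    return (gws / lws) * lws;    /* integer division drops the remainder */
}
```

For the two runs in this thread, round_gws_to_lws(1024, 64) gives 1024 and round_gws_to_lws(1024, 17) gives 1020, matching the values john printed.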

So that's about 2.3x faster than with the autotune settings (12750 vs. 5535 c/s), but still about 3x slower than my CPU:

epixoip@...en:~/pomelo/JohnTheRipper/run$ OMP_NUM_THREADS=6 ./john -te -form:pomelo
Will run 6 OpenMP threads
Benchmarking: pomelo, Generic pomelo [Pomelo]... (6xOMP) DONE
Many salts:     40128 c/s real, 6688 c/s virtual
Only one salt:  39825 c/s real, 6704 c/s virtual

