Message-ID: <20130203065408.GA24719@openwall.com>
Date: Sun, 3 Feb 2013 10:54:08 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe
Brian, magnum -
On Wed, Jan 30, 2013 at 05:46:01AM -0500, Brian Wallace wrote:
> Device 1: Tahiti (AMD Radeon HD 7900 Series)
> Local worksize (LWS) 64, Global worksize (GWS) 57344
> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
> Raw: 472615 c/s real, 17203K c/s virtual
Now getting:
Device 1: Tahiti (AMD Radeon HD 7900 Series)
Local worksize (LWS) 64, Global worksize (GWS) 1048576
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw: 498135 c/s real, 209715K c/s virtual
> Benchmarking: Password Safe SHA-256 [CUDA]... DONE
> Raw: 129590 c/s real, 128862 c/s virtual
Device 0: GeForce GTX 570
Local worksize (LWS) 64, Global worksize (GWS) 131072
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw: 131510 c/s real, 131951 c/s virtual
Benchmarking: Password Safe SHA-256 [CUDA]... DONE
Raw: 129590 c/s real, 129590 c/s virtual
We got useful test results from atom (thanks again!):
http://pastebin.com/xCaeqBKY
Most useful is the reminder that we need to use a split kernel (OpenCL
only, since this is only relevant for AMD GPUs/drivers):
"- Had to use LWS=64 because LWS=256 created a Zombie and I was forced
to reboot :("
(I guess this could actually be a random occurrence. The problem could
also occur with LWS=64.)
magnum - Brian is going to implement a split kernel. Please help him by
answering any questions he might have, etc.
Brian - basically, individual kernel invocations should be taking no
more than 200ms, preferably much less. This means that with a large
GWS, you need to be computing only a fraction of the 2048 iterations per
kernel invocation. Please store intermediate results in global memory.
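
To illustrate the split-kernel idea described above, here is a rough host-side sketch in Python rather than actual OpenCL. It is not the pwsafe kernel: plain iterated SHA-256 from hashlib stands in for the real computation, and the names (run_chunk, hash_split, hash_unsplit) and the CHUNK value of 256 are hypothetical, chosen only for illustration. The states list plays the role of the global-memory buffer that holds intermediate results between invocations.

```python
import hashlib

ITERATIONS = 2048  # total iteration count mentioned in the message
CHUNK = 256        # hypothetical number of iterations per "kernel invocation"

def run_chunk(states, n):
    """Stand-in for one kernel invocation: advance every candidate by n iterations."""
    for i, s in enumerate(states):
        for _ in range(n):
            s = hashlib.sha256(s).digest()
        states[i] = s  # write the intermediate result back to the shared buffer

def hash_split(candidates, total=ITERATIONS, chunk=CHUNK):
    """Run the loop as several short invocations instead of one long one."""
    # 'states' stands in for the global-memory buffer of intermediate results.
    states = [hashlib.sha256(c).digest() for c in candidates]
    done = 0
    while done < total:
        n = min(chunk, total - done)  # the last chunk may be shorter
        run_chunk(states, n)
        done += n
    return states

def hash_unsplit(candidates, total=ITERATIONS):
    """Reference: the same work done in a single long invocation."""
    states = [hashlib.sha256(c).digest() for c in candidates]
    run_chunk(states, total)
    return states
```

Both functions should produce identical results for any chunk size; only the duration of each individual invocation changes. In a real implementation the chunk size would presumably be tuned so that one invocation at the chosen GWS stays well under the 200ms budget.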
Thanks all!
Alexander