Date: Tue, 25 Aug 2015 18:00:03 +0200
From: magnum <>
Subject: Re: LWS and GWS auto-tuning

On 2015-08-25 13:42, Solar Designer wrote:
> The attached patch #if 0's opencl_find_best_workgroup() (perhaps we
> need to drop it completely, and remove from common-opencl.h too), and
> revises and makes use of opencl_find_best_lws().
> The new logic is, when neither GWS nor LWS env vars are specified:
> pre-tune GWS (with a lower than usual maximum), tune LWS, and finally
> tune GWS with the tuned LWS and considering the queried number of
> compute units.  Obviously, this is far from perfect - we're trying to
> find a maximum of a function of two variables, but are adjusting only
> one at a time.  Yet it appears to work much better than the current
> approach of tuning GWS only.
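
For readers who want the three-step order above spelled out, here is a minimal sketch. It is not the patch's code and does not call opencl_find_best_lws(). The function names, the power-of-two candidate sets, the starting LWS of 64, the toy speed function, and the reading of "considering the queried number of compute units" as "try GWS values that are multiples of LWS * compute units" are all assumptions made for illustration only.

```python
import math


def powers_of_two(lo, hi):
    """Return lo, 2*lo, 4*lo, ... up to and including hi."""
    out = []
    v = lo
    while v <= hi:
        out.append(v)
        v *= 2
    return out


def auto_tune(speed, max_lws, pretune_max_gws, max_gws, compute_units,
              start_lws=64):
    """Hypothetical one-variable-at-a-time search over (GWS, LWS).

    speed(gws, lws) is a caller-supplied benchmark returning a number
    where higher is better.
    """
    # Step 1: pre-tune GWS with a lower-than-usual maximum and a fixed LWS.
    gws = max(powers_of_two(start_lws, pretune_max_gws),
              key=lambda g: speed(g, start_lws))

    # Step 2: tune LWS at the pre-tuned GWS (LWS must divide GWS).
    lws_candidates = [l for l in powers_of_two(1, max_lws) if gws % l == 0]
    lws = max(lws_candidates, key=lambda l: speed(gws, l))

    # Step 3: tune GWS again with the tuned LWS. Assumption: candidates are
    # multiples of lws * compute_units so every compute unit gets whole groups.
    step = lws * compute_units
    multipliers = powers_of_two(1, max(1, max_gws // step))
    gws = max((step * k for k in multipliers), key=lambda g: speed(g, lws))
    return gws, lws


def toy_speed(gws, lws):
    """Synthetic stand-in for a real benchmark; peaks at LWS=128, GWS=65536."""
    lws_factor = 1.0 / (1 + abs(math.log2(lws) - 7))
    gws_factor = 1.0 / (1 + abs(math.log2(gws) - 16))
    return lws_factor * gws_factor


if __name__ == "__main__":
    print(auto_tune(toy_speed, max_lws=1024, pretune_max_gws=4096,
                    max_gws=2 ** 20, compute_units=8))
```

The toy function is separable in its two arguments, so tuning one variable at a time finds its true maximum. A real kernel's speed need not be separable, which is the "far from perfect" caveat in the quoted text.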

Aye sir. The boost from this is better than I thought. The difference is 
sometimes 2x and more!

Here's my laptop top/bottom 10:

Ratio:	0.83601 real, 0.40000 virtual	sha1crypt-opencl, (NetBSD):Raw
Ratio:	0.85271 real, 0.47251 virtual	encfs-opencl, EncFS:Raw
Ratio:	0.86628 real, 0.07211 virtual	wpapsk-opencl, WPA/WPA2 PSK:Raw
Ratio:	0.87242 real, 0.78965 virtual	Raw-SHA512-opencl:Raw
Ratio:	0.87999 real, 0.75624 virtual	Raw-SHA256-opencl:Raw
Ratio:	0.89727 real, 0.23077 virtual	krb5pa-sha1-opencl, Kerberos 5 AS-REQ Pre-Auth etype 17/18:Raw
Ratio:	0.89920 real, 1.10118 virtual	mysql-sha1-opencl, MySQL 4.1+:Raw
Ratio:	0.94276 real, 0.06511 virtual	PBKDF2-HMAC-SHA1-opencl:Raw
Ratio:	0.94495 real, 0.98183 virtual	mscash-opencl, M$ Cache Hash:Only one salt
Ratio:	1.21359 real, 1.00000 virtual	md5crypt-opencl, crypt(3) $1$:Raw
Ratio:	1.27538 real, 2.72273 virtual	RAKP-opencl, IPMI 2.0 RAKP (RMCP+):Only one salt
Ratio:	1.28331 real, 0.50000 virtual	PBKDF2-HMAC-SHA512-opencl, GRUB2 / OS X 10.8+, rounds=10000:Raw
Ratio:	1.32295 real, 1.12515 virtual	zip-opencl, ZIP:Raw
Ratio:	1.33187 real, 8.22860 virtual	RAKP-opencl, IPMI 2.0 RAKP (RMCP+):Many salts
Ratio:	2.00808 real, 27.61671 virtual	krb5pa-md5-opencl, Kerberos 5 AS-REQ Pre-Auth etype 23:Many salts
Ratio:	2.35028 real, 168.85266 virtual	oldoffice-opencl, MS Office <= 2003:Many salts
Ratio:	2.51140 real, 31.75871 virtual	oldoffice-opencl, MS Office <= 2003:Only one salt
Ratio:	2.54982 real, 4.06557 virtual	lotus5-opencl, Lotus Notes/Domino 5:Raw
Ratio:	2.55546 real, 23.30411 virtual	krb5pa-md5-opencl, Kerberos 5 AS-REQ Pre-Auth etype 23:Only one salt

I'm pretty sure the worst ones are false/coincidental. I manually 
re-tested some of the best ones and it's true: they really do auto-tune 
to 2.5x the previous speed.

I'm currently testing Tahiti/Titan on super.

On another note, AMD's 15.7 driver seems to have regained speed for RAR 
and other formats (even before the autotune changes). Those formats were 
much slower under several driver versions (including 14.9, which is our 
best-tested version ever).

