Date: Thu, 27 Aug 2015 18:51:20 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: LWS and GWS auto-tuning

On 2015-08-27 17:53, magnum wrote:
> After some (well, a lot of) regression testing with nvidia and Intel CPU,
> some issues were addressed. cf6f3457f seems to be a decent version. Now
> another round of regression testing on AMD and others o.O

That new CPU driver on well looks good. Here's our CPU format with WPAPSK:

Benchmarking: wpapsk, WPA/WPA2 PSK [PBKDF2-SHA1 256/256 AVX2 8x]... (8xOMP) DONE
Raw:	13116 c/s real, 1649 c/s virtual

Here's OpenCL with -dev=0:

$ ../run/john -test -form=wpapsk-opencl -dev=0
Device 0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
Benchmarking: wpapsk-opencl, WPA/WPA2 PSK [PBKDF2-SHA1 OpenCL]... DONE
Raw:	13540 c/s real, 1699 c/s virtual

The device asked for scalar code, so we gave it that. It was then
auto-vectorized and ended up actually faster than our intrinsics.

Here's forcing 8x vector source code:

$ ../run/john -test -form=wpapsk-opencl -dev=0 -force-vector=8
Device 0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
Benchmarking: wpapsk-opencl, WPA/WPA2 PSK [PBKDF2-SHA1 OpenCL 8x]... DONE
Raw:	13320 c/s real, 1668 c/s virtual

Slightly slower than auto-vectorized in this case, but still faster than 
our intrinsics :)
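
To put rough numbers on "faster" and "slightly slower", here is a minimal sketch that computes the relative differences from the "c/s real" figures of the three runs above. The labels and the relative_gain() helper are made up for illustration and are not part of john; these are also single benchmark runs, so differences of a percent or two may be within run-to-run noise.

```python
# "c/s real" figures copied from the three benchmark runs above.
# The dictionary keys are descriptive labels only, not john format names.
results = {
    "wpapsk (AVX2 intrinsics)": 13116,
    "wpapsk-opencl (scalar source, auto-vectorized)": 13540,
    "wpapsk-opencl (-force-vector=8)": 13320,
}


def relative_gain(candidate, baseline):
    """Return how much faster candidate is than baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


baseline = results["wpapsk (AVX2 intrinsics)"]
for name, cps in results.items():
    gain = relative_gain(cps, baseline)
    print(f"{name}: {cps} c/s, {gain:+.1f}% vs intrinsics")
```

On these figures the auto-vectorized scalar kernel comes out about 3.2% ahead of the intrinsics build, and the forced 8x vector kernel about 1.6% ahead.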

magnum
