Date: Wed, 7 Nov 2012 18:12:24 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Split kernel for OpenCL WPA-PSK

On 7 Nov, 2012, at 17:43 , Lukas Odzioba <lukas.odzioba@...il.com> wrote:
>> For some reason it segfaults on the Tahiti (but not on AMDAPP/CPU). On all other devices I've tried it works fine. If we can get this straight we should implement similar changes to a bunch of other OpenCL formats that use PBKDF2-HMAC-SHA1.
> 
> On Tahiti it segfaults during selftest, testsuite or some real world cracking?

It segfaults during self-test. The debugger ends up inside the amdocl driver, so I suspect it's yet another driver bug. I have tried the 12.8 driver too, though, and that did not help.

>> These are massive changes to both host code and kernel. Some 15-20% boost is gained too btw, and device auto-tuning is implemented.
> 
> 15-20% just for nvidia or for amd too?

I have no idea because of the segfaults, but I implemented the usual bitselects, rotate and so on, so if anything it should be faster. Also, the split kernels reduce register pressure, and some other changes I made freed even more registers. On the other hand, we now depend on global memory between the loop calls. That does not stop office2007 from doing 2.1 billion SHA1/second, though that one has a smaller global memory footprint than this one.
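
To illustrate the idea, here is a minimal sketch in plain C, not the actual OpenCL code. It assumes a hypothetical per-candidate state struct (loop_state) that stands in for the global memory buffer kept between loop calls, and a toy mixing step (toy_iteration) that only stands in for one PBKDF2 iteration and is not HMAC-SHA1. rotl32 and bitselect32 are C stand-ins for the OpenCL built-ins rotate() and bitselect(). The chunk size of 500 is arbitrary. 4096 is the iteration count WPA-PSK uses.

```c
#include <stdint.h>

/* Plain-C stand-ins for the OpenCL built-ins rotate() and bitselect(). */
static uint32_t rotl32(uint32_t x, unsigned n)   /* valid for 0 < n < 32 */
{
    return (x << n) | (x >> (32 - n));
}

static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    /* Take the bit from b where c has a 1, from a where c has a 0. */
    return a ^ (c & (a ^ b));
}

/* SHA-1 Ch and Maj written with bitselect. */
static uint32_t sha1_ch(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(z, y, x);
}

static uint32_t sha1_maj(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(x, y, z ^ x);
}

/* Hypothetical state that would live in global memory between calls. */
typedef struct {
    uint32_t w[5];
} loop_state;

/* Toy mixing step, NOT HMAC-SHA1: it only stands in for one iteration. */
static void toy_iteration(loop_state *s)
{
    uint32_t t = rotl32(s->w[0], 5) + sha1_ch(s->w[1], s->w[2], s->w[3])
               + sha1_maj(s->w[1], s->w[2], s->w[3]) + s->w[4] + 0x5A827999u;
    s->w[4] = s->w[3];
    s->w[3] = s->w[2];
    s->w[2] = rotl32(s->w[1], 30);
    s->w[1] = s->w[0];
    s->w[0] = t;
}

/* "Loop kernel": run a bounded number of iterations, then return. */
static void loop_chunk(loop_state *s, unsigned iterations)
{
    for (unsigned i = 0; i < iterations; i++)
        toy_iteration(s);
}

/* Host side: cover `total` iterations with calls of at most `chunk` each. */
static void run_split(loop_state *s, unsigned total, unsigned chunk)
{
    if (chunk == 0)
        chunk = total;           /* avoid looping forever on a zero chunk */
    while (total > 0) {
        unsigned n = total < chunk ? total : chunk;
        loop_chunk(s, n);
        total -= n;
    }
}

/* Returns 1 if the split run matches one long run and the bitselect
   forms match the textbook SHA-1 definitions for one sample input. */
int split_matches_single(void)
{
    loop_state a = {{0x67452301u, 0xEFCDAB89u, 0x98BADCFEu,
                     0x10325476u, 0xC3D2E1F0u}};
    loop_state b = a;
    uint32_t x = 0xDEADBEEFu, y = 0x01234567u, z = 0x89ABCDEFu;

    loop_chunk(&a, 4096);        /* one long call */
    run_split(&b, 4096, 500);    /* 8 calls of 500, then 1 call of 96 */
    for (int i = 0; i < 5; i++)
        if (a.w[i] != b.w[i])
            return 0;
    if (sha1_ch(x, y, z) != ((x & y) | (~x & z)))
        return 0;
    if (sha1_maj(x, y, z) != ((x & y) | (x & z) | (y & z)))
        return 0;
    return 1;
}
```

The point of the sketch is only that each call is short and that everything needed to resume is carried in the state struct, which is why the real thing leans on global memory between calls.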

I'm planning to implement selective vectorizing too, like in my other recent formats. At least on CPU it should be a good thing. Perhaps on VLIW too.

magnum
