Date: Thu, 22 Mar 2012 07:12:39 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: CUDA & OpenCL status

Lukas, Milen -

On Sun, Mar 04, 2012 at 08:37:58AM +0400, Solar Designer wrote:
> I've also tried:
> 
> #define ROTATE_LEFT(x, s) rotate((x), (uint32_t)(s))
> 
> which works, but does not obviously improve performance here.

This is puzzling.  Perhaps there's some other bottleneck that we're
bumping into, making further micro-optimizations irrelevant until we
deal with that other issue.

Apparently, rotate() should be as good as amd_bitalign() now, but we may
try the latter as well anyway:

http://devgurus.amd.com/thread/158497
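
For reference, the relationship between the two forms can be modelled on the host in plain C. This is only a sketch: bitalign_model() is a hypothetical stand-in that assumes the amd_bitalign() semantics given in the cl_amd_media_ops extension (concatenate two 32-bit words, then shift right by the low 5 bits of the third argument). It is not the intrinsic itself, and it says nothing about which form a given compiler maps to the faster instruction.

```c
#include <stdint.h>

/* Portable 32-bit left rotate: what ROTATE_LEFT(x, s) is expected to compute. */
static uint32_t rotl32(uint32_t x, uint32_t s)
{
    s &= 31;
    return (x << s) | (x >> ((32 - s) & 31));
}

/*
 * Hypothetical host-side model of amd_bitalign(hi, lo, shift):
 * treat hi:lo as one 64-bit value and shift it right by (shift & 31),
 * keeping the low 32 bits.  Assumed semantics, not the real intrinsic.
 */
static uint32_t bitalign_model(uint32_t hi, uint32_t lo, uint32_t shift)
{
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> (shift & 31));
}

/* A left rotate by s is a right "align" of x:x by 32 - s. */
static uint32_t rotl32_via_bitalign(uint32_t x, uint32_t s)
{
    return bitalign_model(x, x, 32 - s);
}
```

Under that assumption, a kernel-side macro would pass (x), (x), 32 - (s) to amd_bitalign(); whether that beats rotate() would still have to be measured.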

Also relevant is this posting by Milen:

http://www.openwall.com/lists/john-users/2011/02/01/2

Milen - any additional info on that "fused SHL+ADD instruction" on
Nvidia and its use for MD5 and the like?  I don't immediately see how
such an instruction would be usable there because we actually need
rotate+ADD.
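
To make the rotate+ADD point concrete, here is a host-side C sketch of one MD5 round-1 step as defined in RFC 1321. The function names are made up for illustration. The last function shows one possible reading of how a shift-and-add instruction could help, which is an assumption here and not something confirmed in this thread: for 1 <= s <= 31 the two halves of a rotate occupy disjoint bits, so ADD can stand in for OR when combining them.

```c
#include <stdint.h>

/* Portable 32-bit left rotate. */
static uint32_t rotl32(uint32_t x, uint32_t s)
{
    s &= 31;
    return (x << s) | (x >> ((32 - s) & 31));
}

/*
 * One MD5 round-1 step: a = b + rotl(a + F(b,c,d) + x + t, s).
 * The rotate sits between two additions, hence "rotate+ADD".
 */
static uint32_t md5_step_f(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                           uint32_t x, uint32_t t, uint32_t s)
{
    uint32_t f = (b & c) | (~b & d);   /* MD5 F function */
    return b + rotl32(a + f + x + t, s);
}

/*
 * Same rotate written as shift-left plus ADD.  The shifted-left part has
 * zeros in its low s bits and the shifted-right part has zeros everywhere
 * else, so the addition never carries and equals a bitwise OR.
 * s == 0 is special-cased because x + x would not equal x.
 */
static uint32_t rotl32_shl_add(uint32_t x, uint32_t s)
{
    s &= 31;
    if (s == 0)
        return x;
    return (x << s) + (x >> (32 - s));
}
```

Even in this form a separate right shift is still needed, so any gain would depend on the instruction set and the compiler.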

Alexander
