Date: Thu, 22 Mar 2012 07:12:39 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: CUDA & OpenCL status
Lukas, Milen -
On Sun, Mar 04, 2012 at 08:37:58AM +0400, Solar Designer wrote:
> I've also tried:
>
> #define ROTATE_LEFT(x, s) rotate((x), (uint32_t)(s))
>
> which works, but does not obviously improve performance here.
This is puzzling. Perhaps there's some other bottleneck that we're
bumping into, making further micro-optimizations irrelevant until we
deal with that other issue.
Apparently, rotate() should now be as good as amd_bitalign(), but we may
as well try the latter too:
http://devgurus.amd.com/thread/158497
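For reference, here is a minimal plain-C sketch of why the two forms should be interchangeable. It assumes that amd_bitalign(hi, lo, shift) concatenates hi:lo into 64 bits, shifts right by (shift & 31), and returns the low 32 bits. The names rotl32, bitalign_emul and rotl32_via_bitalign are hypothetical helpers for illustration only. This is host-side C, not the OpenCL kernel code, and it says nothing about which form a given compiler turns into faster code.

```c
#include <stdint.h>

/* Portable rotate-left. The count is reduced mod 32 so that s == 0
 * does not lead to an undefined 32-bit shift. */
static uint32_t rotl32(uint32_t x, uint32_t s)
{
    s &= 31;
    return (x << s) | (x >> ((32 - s) & 31));
}

/* Plain-C emulation of a bitalign-style funnel shift (assumed semantics):
 * take the low 32 bits of the 64-bit value hi:lo shifted right. */
static uint32_t bitalign_emul(uint32_t hi, uint32_t lo, uint32_t shift)
{
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> (shift & 31));
}

/* Rotate left by s == funnel-shift x:x right by (32 - s). */
static uint32_t rotl32_via_bitalign(uint32_t x, uint32_t s)
{
    return bitalign_emul(x, x, 32 - s);
}
```

Under that assumption, a macro along the lines of ROTATE_LEFT(x, s) could be defined either way and give the same results, so any speed difference would come down to code generation.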
Also relevant is this posting by Milen:
http://www.openwall.com/lists/john-users/2011/02/01/2
Milen - any additional info on that "fused SHL+ADD instruction" on
Nvidia and its use for MD5 and the like? I don't immediately see how
such an instruction would be usable there because we actually need
rotate+ADD.
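To make the rotate+ADD dependency concrete, here is a rough plain-C sketch of one MD5 round 1 step. The names md5_f, md5_step and md5_step_split are hypothetical. The second variant is only an arithmetic observation: for s in 1..31 the two halves of a rotate occupy disjoint bits, so the OR can be written as an ADD. Whether this has anything to do with the instruction Milen mentioned is not established here.

```c
#include <stdint.h>

/* Rotate-left for counts in 1..31 (all MD5 rotate counts are in that range). */
static uint32_t rotl32(uint32_t x, uint32_t s)
{
    return (x << s) | (x >> (32 - s));
}

/* MD5 round 1 boolean function. */
static uint32_t md5_f(uint32_t b, uint32_t c, uint32_t d)
{
    return (b & c) | (~b & d);
}

/* One round 1 step: the addition of b happens after the rotate. */
static uint32_t md5_step(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                         uint32_t x, uint32_t t, uint32_t s)
{
    return b + rotl32(a + md5_f(b, c, d) + x + t, s);
}

/* Same step with the rotate split into two shifts combined by addition.
 * Valid only for s in 1..31, where the shifted halves do not overlap. */
static uint32_t md5_step_split(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                               uint32_t x, uint32_t t, uint32_t s)
{
    uint32_t v = a + md5_f(b, c, d) + x + t;
    return b + (v << s) + (v >> (32 - s));
}
```

The split form still needs the right shift as a separate operation, so at best a shift+add pair would cover only part of the step.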
Alexander