Date: Thu, 22 Mar 2012 07:12:39 +0400
From: Solar Designer
Subject: Re: CUDA & OpenCL status

Lukas, Milen -

On Sun, Mar 04, 2012 at 08:37:58AM +0400, Solar Designer wrote:
> I've also tried:
> #define ROTATE_LEFT(x, s) rotate((x), (uint32_t)(s))
> which works, but does not obviously improve performance here.

This is puzzling.  Perhaps there's some other bottleneck that we're
bumping into, making further micro-optimizations irrelevant until we
deal with that other issue.

Apparently, rotate() should be as good as amd_bitalign() now, but we may
try the latter as well anyway:
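
For reference, here is a minimal host-side sketch in plain C of why the two
should be interchangeable for our purposes.  It is not code from the tree:
rotl32(), bitalign_emul(), rotl32_via_bitalign() and rotate_selfcheck() are
hypothetical names, and bitalign_emul() only emulates the amd_bitalign()
semantics as I read them in the cl_amd_media_ops extension (treat the two
operands as one 64-bit value, shift right by the count mod 32, keep the low
32 bits).  Under that assumption, a left rotate by s is a bitalign of x with
itself by 32 - s:

```c
#include <assert.h>
#include <stdint.h>

/* Portable left rotate; the count is reduced mod 32 so s == 0 is defined. */
static uint32_t rotl32(uint32_t x, unsigned s)
{
    s &= 31;
    return (x << s) | (x >> ((32 - s) & 31));
}

/*
 * Host-side emulation of the assumed amd_bitalign(hi, lo, shift) semantics:
 * concatenate hi:lo into 64 bits, shift right by (shift & 31), and return
 * the low 32 bits.  This is an assumption, not the vendor's implementation.
 */
static uint32_t bitalign_emul(uint32_t hi, uint32_t lo, uint32_t shift)
{
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> (shift & 31));
}

/* Left rotate expressed through the bitalign-style operation. */
static uint32_t rotl32_via_bitalign(uint32_t x, unsigned s)
{
    return bitalign_emul(x, x, 32 - (s & 31));
}

/* Compare both forms for every rotate count 0..31; returns 1 on agreement. */
static int rotate_selfcheck(void)
{
    const uint32_t x = 0x12345678u;
    unsigned s;

    for (s = 0; s < 32; s++) {
        if (rotl32(x, s) != rotl32_via_bitalign(x, s))
            return 0;
    }
    assert(rotl32(x, 8) == 0x34567812u);
    return 1;
}
```

On the device side the equivalent macro would presumably pass (x), (x) and
32 - (s) to amd_bitalign(); whether that compiles to anything different from
rotate() is exactly what needs measuring.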

Also relevant is this posting by Milen:

Milen - any additional info on that "fused SHL+ADD instruction" on
Nvidia and its use for MD5 and the like?  I don't immediately see how
such an instruction would be usable there because we actually need

