Date: Thu, 22 Mar 2012 07:25:25 -0300
From: Claudio André
Subject: Re: CUDA & OpenCL status

I'm using rotate() now, and I see no performance gain either.


On 22-03-2012 00:12, Solar Designer wrote:
> Lukas, Milen -
> On Sun, Mar 04, 2012 at 08:37:58AM +0400, Solar Designer wrote:
>> I've also tried:
>> #define ROTATE_LEFT(x, s) rotate((x), (uint32_t)(s))
>> which works, but does not obviously improve performance here.
> This is puzzling.  Perhaps there's some other bottleneck that we're
> bumping into, making further micro-optimizations irrelevant until we
> deal with that other issue.
> Apparently, rotate() should be as good as amd_bitalign() now, but we may
> try the latter as well anyway:
> Also relevant is this posting by Milen:
> Milen - any additional info on that "fused SHL+ADD instruction" on
> Nvidia and its use for MD5 and the like?  I don't immediately see how
> such an instruction would be usable there because we actually need
> rotate+ADD.
> Alexander
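
For reference, here is a host-side C sketch of what the ROTATE_LEFT variants discussed above compute. This is not the code from the tree. rotl32(), bitalign32() and md5_step_f() are hypothetical names used only for illustration, and bitalign32() follows my understanding of amd_bitalign() semantics (a 64-bit funnel shift right with the count masked to 5 bits), which should be checked against the cl_amd_media_ops documentation. The last function shows why MD5 needs rotate+ADD rather than SHL+ADD: the addition of b comes after the rotate.

```c
#include <stdint.h>

/* Portable rotate-left. The count is masked to 0..31 so that s == 0
 * does not produce an undefined 32-bit shift. */
static inline uint32_t rotl32(uint32_t x, unsigned s)
{
    s &= 31u;
    return (x << s) | (x >> ((32u - s) & 31u));
}

/* Assumed model of amd_bitalign(hi, lo, shift): low 32 bits of the
 * 64-bit value hi:lo shifted right by (shift & 31). */
static inline uint32_t bitalign32(uint32_t hi, uint32_t lo, unsigned shift)
{
    uint64_t v = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(v >> (shift & 31u));
}

/* Two interchangeable definitions; for 1 <= s <= 31 they give the same
 * result, since a left rotate by s is a funnel shift right by 32 - s. */
#define ROTATE_LEFT(x, s)          rotl32((x), (s))
#define ROTATE_LEFT_BITALIGN(x, s) bitalign32((x), (x), 32u - (s))

/* One MD5 round-1 step: a = b + rotl(a + F(b,c,d) + x + t, s). */
static inline uint32_t md5_step_f(uint32_t a, uint32_t b, uint32_t c,
                                  uint32_t d, uint32_t x, uint32_t t,
                                  unsigned s)
{
    uint32_t f = (b & c) | (~b & d);   /* F(b, c, d) */
    return b + ROTATE_LEFT(a + f + x + t, s);
}
```

If both macros really compile to the same instruction on current drivers, switching between them would not be expected to change the speed, which would be consistent with the results reported above.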
