Date: Tue, 10 Jul 2012 05:30:27 +0400
From: Solar Designer <>
Subject: Re: Rotate and bitselect investigation

On Mon, Jul 09, 2012 at 12:42:47PM +0530, Sayantan Datta wrote:
> On Mon, Jul 9, 2012 at 12:00 PM, Solar Designer <> wrote:
> > Also, I guess this change should hurt on NVIDIA (does
> > it?), so you'll need to wrap it in some #ifdef.
> Yes I did wrap it in #ifdef.

So you use USE_LIBRARY_BITSELECT, which has to be manually (un)defined.

Maybe we should use the same approach that magnum uses in

#ifdef cl_nv_pragma_unroll
#define NVIDIA
#endif

#ifdef NVIDIA
#define F(x,y,z)	(z ^ (x & (y ^ z)))
#else
#define F(x,y,z)	bitselect(z, y, x)
#endif

This won't detect CPUs, though - where we also don't want to use
bitselect() most of the time (the instruction is only available with XOP
and is probably not used by current OpenCL SDKs since I think only
Intel's does vectorization) - but this code is mostly just for AMD and
NVIDIA GPUs now.  We have faster MSCash2 code on CPU anyway.
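
For reference, the two definitions of F above should be interchangeable: bitselect(a, b, c) takes each result bit from b where c has a 1 bit and from a where c has a 0 bit, so bitselect(z, y, x) is the same function as z ^ (x & (y ^ z)). Here is a minimal host-side sketch in plain C that checks this. The names bitselect_u32, f_xor, f_bitselect and f_forms_agree are hypothetical helpers made up for this illustration, not anything in the tree, and bitselect_u32 is only a portable stand-in for the OpenCL built-in:

```c
#include <stdint.h>

/* Portable stand-in for OpenCL's bitselect(a, b, c): each result bit is
 * taken from b where c has a 1 bit, and from a where c has a 0 bit. */
static uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & ~c) | (b & c);
}

/* The XOR/AND form of F, as used on the NVIDIA path. */
static uint32_t f_xor(uint32_t x, uint32_t y, uint32_t z)
{
	return z ^ (x & (y ^ z));
}

/* The bitselect form of F, as used on the non-NVIDIA path. */
static uint32_t f_bitselect(uint32_t x, uint32_t y, uint32_t z)
{
	return bitselect_u32(z, y, x);
}

/* Returns 1 if both forms agree on every triple drawn from a small set
 * of test words, 0 otherwise. */
static int f_forms_agree(void)
{
	static const uint32_t v[] = {
		0x00000000u, 0xffffffffu, 0x67452301u, 0xefcdab89u,
		0x98badcfeu, 0x10325476u, 0xaaaaaaaau
	};
	const int n = (int)(sizeof(v) / sizeof(v[0]));
	int i, j, k;

	for (i = 0; i < n; i++)
		for (j = 0; j < n; j++)
			for (k = 0; k < n; k++)
				if (f_xor(v[i], v[j], v[k]) !=
				    f_bitselect(v[i], v[j], v[k]))
					return 0;
	return 1;
}
```

Since the results are identical, which form to pick is purely a per-device performance question, which is why a compile-time switch is enough.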

