Date: Tue, 10 Jul 2012 06:35:36 +0400
From: Solar Designer <>
Subject: Re: Rotate and bitselect investigation

Sayantan, magnum -

On Tue, Jul 10, 2012 at 07:19:38AM +0530, Sayantan Datta wrote:
> On Tue, Jul 10, 2012 at 7:00 AM, Solar Designer <> wrote:
> > Maybe we should use the same approach that magnum uses in
> >
> > #ifdef cl_nv_pragma_unroll
> > #define NVIDIA
> > #endif
> > [...]
> > #ifdef NVIDIA
> > #define F(x,y,z)        (z ^ (x & (y ^ z)))
> > #else
> > #define F(x,y,z)        bitselect(z, y, x)
> > #endif
> >
> > This won't detect CPUs, though - where we also don't want to use
> > bitselect() most of the time (the instruction is only available with XOP
> > and is probably not used by current OpenCL SDKs since I think only
> > Intel's does vectorization) - but this code is mostly just for AMD and
> > NVIDIA GPUs now.  We have faster MSCash2 code on CPU anyway.

> So we should use manual bitselect by default.

But we don't have a trick similar to cl_nv_pragma_unroll that would let
us detect AMD GPUs.  So I am fine with us using bitselect() by default
and only disabling it on NVIDIA, unless/until we learn of a trick to
detect an AMD GPU in OpenCL (or introduce such a way by passing the info
from our C code).
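
To illustrate the last option, here is a minimal sketch of how the host
side might pick a build option from the device vendor string.  The helper
name build_options_for(), the macro USE_BITSELECT, and the vendor
substrings it matches are assumptions made for this example, not anything
from the existing code.  In real host code the vendor string would come
from clGetDeviceInfo() with CL_DEVICE_VENDOR, and the returned string
would be passed as the options argument of clBuildProgram().

```c
#include <string.h>

/*
 * Hypothetical helper: map a device vendor string to OpenCL build options.
 *
 * vendor  - string as reported for CL_DEVICE_VENDOR (assumed to contain
 *           "Advanced Micro Devices" or "AMD" on AMD devices)
 * is_gpu  - nonzero if the device type is a GPU
 *
 * Returns "-DUSE_BITSELECT" only for AMD GPUs, so the kernel could use:
 *
 *   #ifdef USE_BITSELECT
 *   #define F(x,y,z)        bitselect(z, y, x)
 *   #else
 *   #define F(x,y,z)        (z ^ (x & (y ^ z)))
 *   #endif
 */
static const char *build_options_for(const char *vendor, int is_gpu)
{
    if (!vendor || !is_gpu)
        return "";              /* CPUs and unknown devices: manual form */

    if (strstr(vendor, "Advanced Micro Devices") || strstr(vendor, "AMD"))
        return "-DUSE_BITSELECT";

    return "";                  /* NVIDIA and everything else */
}
```

With this, the default flips to the manual form, and bitselect() gets
enabled only where the host code has positively identified an AMD GPU.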

