Date: Sat, 21 Apr 2012 00:45:56 +0200
From: magnum <>
Subject: Re: cl_khr_byte_addressable_store

Then I'm afraid you lost me. Just how should I approach this? Should I
do two separate kernels or should I try some kind of bit-flipping
madness that just might work on both AMD and nvidia?


On 04/21/2012 12:23 AM, Milen Rangelov wrote:
> No. Accessing uchar4 arrays would generate a compiler error if you're not
> using the extension, e.g. __local uchar4 arr[4]; arr[1]=(uchar4)(1,2,3,4);
> would not compile without the extension. Otherwise I believe you can have
> __private uchar4 non-array variables and access them. But for the RAR kernel
> you'd have to use a ucharN array anyway.
> On Sat, Apr 21, 2012 at 12:34 AM, magnum <> wrote:
>> On 04/20/2012 09:59 PM, Milen Rangelov wrote:
>>> Well especially for RAR on AMD, I had several attempts around that idea and
>>> they ended much slower than the vectorized, bitwise magic version. But you
>>> should leave it just because 4xxx is not supported. I know sometimes it's
>>> hard and it could get VERY UGLY (my rar kernel is frightening). Nvidia may
>>> have no problems with it, but AMD is not the case..
>> Just to get things straight in my sore head: If I vectorize the lot and
>> use uchar4, I do not need byte_addressable_store, is that right?
>> magnum
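
For reference, the "bit-flipping" option asked about above can be read as
follows: keep the buffer as 32-bit words and emulate each byte store with a
read-modify-write on the containing word. Whole-word stores are not subject to
the restriction that cl_khr_byte_addressable_store lifts. The sketch below is
plain C, not a kernel from this thread; put_byte and get_byte are hypothetical
names, and little-endian byte order within each word is assumed. In an OpenCL
kernel, buf would be a __global or __local uint pointer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one byte at byte offset `index` in a buffer
 * that is only ever written as whole 32-bit words.
 * Assumes little-endian byte order within each word. */
static void put_byte(uint32_t *buf, size_t index, uint8_t val)
{
    assert(buf != NULL);
    unsigned shift = (unsigned)(index & 3u) << 3;   /* 0, 8, 16 or 24 */
    uint32_t mask = (uint32_t)0xFFu << shift;       /* selects the target byte */

    /* Clear the target byte, then merge in the new value. */
    buf[index >> 2] = (buf[index >> 2] & ~mask) | ((uint32_t)val << shift);
}

/* Hypothetical helper: read the byte at byte offset `index` back out. */
static uint8_t get_byte(const uint32_t *buf, size_t index)
{
    assert(buf != NULL);
    return (uint8_t)(buf[index >> 2] >> ((index & 3u) << 3));
}
```

The cost is an extra load plus a few shift and mask operations per byte
written, and two work-items must not update bytes of the same word
concurrently without synchronization.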
