john-dev - Re: cl_khr_byte_addressable_store

Message-ID: <20120420230321.GA29975@openwall.com>
Date: Sat, 21 Apr 2012 03:03:21 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: cl_khr_byte_addressable_store

On Sat, Apr 21, 2012 at 12:45:56AM +0200, magnum wrote:
> Then I'm afraid you lost me. Just how should I approach this? Should I
> do two separate kernels or should I try some kind of bit-flipping
> madness that just might work on both AMD and nvidia?

I can't speak for Milen, but I guess that to write a byte you need to
read a naturally aligned 4-byte word, mask out the original byte in it,
OR in your new byte value, and write that word back.  Of course, this is
non-atomic, but you should not be accessing nearby bytes from another
thread anyway.
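
A minimal sketch of that read-modify-write sequence, written as plain C
rather than as an OpenCL kernel so it can be compiled and checked on the
host. The function name put_byte and the little-endian byte order within
each word are assumptions made for illustration, not something taken
from the thread:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Store one byte into a buffer that is only ever accessed as 32-bit words.
 * Hypothetical helper; assumes little-endian byte order inside each word.
 * Not atomic: no other thread may touch the same word concurrently.
 */
static void put_byte(uint32_t *words, size_t byte_index, uint8_t value)
{
    size_t word_index = byte_index >> 2;              /* which aligned word */
    unsigned shift = (unsigned)(byte_index & 3) * 8;  /* bit offset of the byte */
    uint32_t w = words[word_index];                   /* read the whole word */

    w &= ~((uint32_t)0xff << shift);                  /* mask out the old byte */
    w |= (uint32_t)value << shift;                    /* OR in the new byte */
    words[word_index] = w;                            /* write the word back */
}
```

The same arithmetic should carry over to a kernel operating on a
__global uint pointer, but that has not been tested here.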

An obvious optimization would be to combine multiple byte writes
together such that you read/write fewer words (such as one per 4 bytes).
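
One way this could look, again as host-side C for illustration: when
four consecutive bytes cover a whole aligned word, they can be packed in
a register and stored with a single word write, with no read at all.
The name store_packed and the little-endian packing order are
assumptions of this sketch:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Write nwords * 4 bytes from src into words[first_word ...], using one
 * 32-bit store per 4 bytes instead of four read-modify-write cycles.
 * Hypothetical helper; assumes little-endian byte order inside each word.
 */
static void store_packed(uint32_t *words, size_t first_word,
                         const uint8_t *src, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        const uint8_t *b = src + 4 * i;
        /* Pack four bytes into one word, lowest address in the low bits. */
        uint32_t w = (uint32_t)b[0]
                   | (uint32_t)b[1] << 8
                   | (uint32_t)b[2] << 16
                   | (uint32_t)b[3] << 24;
        words[first_word + i] = w;  /* single word write, no read needed */
    }
}
```

Leading or trailing bytes that do not fill a whole word would still need
the masked read-modify-write approach.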

Alexander