Date: Thu, 03 May 2012 11:25:49 -0300
From: Claudio André <>
Subject: Re: cryptsha512-opencl

On 03-05-2012 07:59, magnum wrote:
> Claudio,
> Do you think you can get it working on AMD soon too?
I saw a really crazy thing on the 7970. For example:

     The code:
         output->v[3] = (i < rounds);
         output->v[4] = (i >= rounds);
         output->v[5] = rounds;
         output->v[6] = i;

     The output (after the loop, so "i" should be 4999 [decimal]):
Tahiti: 0000000000000000 0000000000000001 0000000000001388 0000000000000001 <==> (1 >= 5000)???????
CPU:    0000000000000001 0000000000000000 0000000000001388 0000000000001387

Trying some other tests, I found a condition where:
(i < rounds) == false AND (i >= rounds) == false

My guess is that mixing int and uint operands, maybe together with local
memory access, is what triggers it. Not sure.

I don't want to write a kernel whose only purpose is to work in this crazy
environment. I'm trying to find a pattern first.
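
While hunting for the pattern, a host-side sanity check on the four debug words might help flag bad runs automatically. This is only a sketch in plain C: the struct and function names are hypothetical, and it assumes the words are read back as 64-bit values in the order written above (v[3] to v[6]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side mirror of the debug words written by the kernel:
   v[3] = (i < rounds), v[4] = (i >= rounds), v[5] = rounds, v[6] = i.
   The 64-bit width is an assumption based on the 16-digit hex output. */
typedef struct {
    uint64_t lt;      /* read back from v[3] */
    uint64_t ge;      /* read back from v[4] */
    uint64_t rounds;  /* read back from v[5] */
    uint64_t i;       /* read back from v[6] */
} debug_words;

/* True when the two flags are 0/1, complementary, and agree with the
   comparison recomputed on the host from the reported i and rounds. */
static bool debug_words_consistent(const debug_words *d)
{
    bool host_lt = (d->i < d->rounds);

    if (d->lt > 1 || d->ge > 1)
        return false;               /* a flag must be 0 or 1 */
    if (d->lt == d->ge)
        return false;               /* both true or both false */
    return d->lt == (uint64_t)host_lt;
}

static void debug_words_demo(void)
{
    /* Values copied from the output quoted above. */
    const debug_words cpu    = { 1, 0, 0x1388, 0x1387 };
    const debug_words tahiti = { 0, 1, 0x1388, 0x0001 };

    assert(debug_words_consistent(&cpu));
    assert(!debug_words_consistent(&tahiti));
}
```

The same check also rejects the case where both (i < rounds) and (i >= rounds) come back false.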
> We might try to
> wrap up a Jumbo release soon. Also (or at the same time but this is
> lower prio) you might want to get rid of byte stores. The AMD will honor
> this setting:
> #ifdef cl_khr_byte_addressable_store
> #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : disable
> #endif
> This will make your format fail until you have replaced all byte pointer
> writes to macros like the PUTCHAR one in rar (it's normally as simple as
> that). Note that besides making your format work on more devices, this
> will probably also speed things up.
> BTW, that pragma disable *should* work on nvidia too but they just
> silently ignore it.
> magnum

Will do it.
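
For reference, a byte write that goes through 32-bit words could look roughly like the sketch below. It is written as plain C so it can be tested on the host; PUTCHAR_SKETCH is a hypothetical name, not the actual macro from the rar format, and it assumes little-endian packing (byte 0 in the low 8 bits of word 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the macro from the rar format: store byte `val`
   at byte offset `index` of a buffer that is only addressed as 32-bit words.
   Assumes little-endian packing. Note that `index` is evaluated several
   times, so it must not have side effects. */
#define PUTCHAR_SKETCH(buf, index, val)                                  \
    ((buf)[(index) >> 2] =                                               \
        ((buf)[(index) >> 2] & ~(0xffU << (((index) & 3) << 3))) |       \
        ((uint32_t)(uint8_t)(val) << (((index) & 3) << 3)))

static void putchar_sketch_demo(void)
{
    /* Start with garbage in word 0 to show that old bytes are cleared. */
    uint32_t words[2] = { 0xffffffffU, 0 };
    const char *s = "abcde";
    unsigned i;

    for (i = 0; i < 5; i++)
        PUTCHAR_SKETCH(words, i, s[i]);

    assert(words[0] == 0x64636261U);   /* 'd' 'c' 'b' 'a' */
    assert(words[1] == 0x00000065U);   /* 'e' */
}
```

Each store becomes a read-modify-write of one aligned word, so no byte-addressable store is needed.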
