Date: Sun, 16 Aug 2015 15:03:56 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

2015-08-16 14:48 GMT+02:00 Solar Designer <solar@...nwall.com>:
> On Sun, Aug 16, 2015 at 02:01:38PM +0200, Agnieszka Bielec wrote:
>> I have been digging into argon2d (I discovered that this bug appears after
>> commit 9e96f452350c0f2cae32b38e4a4cd1f83d51a367)
>> and before this commit the code was:
>>
>> bi = prev_block_offset = ((prev_slice * lanes + pos.lane + 1) *
>> segment_length - 1) * BLOCK_SIZE;
>> for (i = 0; i < 64; i++)
>> {
>>        prev_block[i] = *(__global ulong2 *) (&memory[bi]);
>>        bi += 16;
>> }
>>
>> The slowdown on AMD occurs when I change this code to:
>>
>> bi = prev_block_offset = ((prev_slice * lanes + pos.lane + 1) *
>> segment_length - 1) * BLOCK_SIZE / 16;
>> for (i = 0; i < 64; i++)
>> {
>>         prev_block[i] = ((__global ulong2*)memory)[bi+i];
>> }
>>
>> Does anyone see some logic here, or is this just a bug on AMD?
>
> Why do you call this a bug?  It isn't necessarily a bug when performance
> of code changes when you change the source code.
>
> Anyway, it looks like in the second code version you rely on address
> scaling by 16, and this is probably not available in the architecture
> (usually scaling by up to 8 is available), so it requires extra
> instructions (explicit left shifts).

Where do you see address scaling? bi is a uint, and what stands before
the /16 is BLOCK_SIZE, which is much bigger than 16 and divisible by 16,
so the compiler will change this to *[single value].
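
To make the comparison concrete, here is a minimal host-side C sketch (not the kernel code) that checks whether the two indexing versions touch the same bytes. It assumes BLOCK_SIZE is 1024, which is only inferred from the loop copying 64 elements of 16 bytes each; all function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1024-byte blocks, inferred from 64 copies of 16 bytes each. */
#define BLOCK_SIZE 1024u

/* Hypothetical helper: the block number both versions multiply by BLOCK_SIZE. */
static uint32_t prev_block_number(uint32_t prev_slice, uint32_t lanes,
                                  uint32_t lane, uint32_t segment_length)
{
    return (prev_slice * lanes + lane + 1) * segment_length - 1;
}

/* First version: a byte offset, advanced by 16 per element. */
static uint32_t byte_offset(uint32_t block) { return block * BLOCK_SIZE; }

/* Second version: an index in 16-byte elements, written as in the source. */
static uint32_t elem_index(uint32_t block) { return block * BLOCK_SIZE / 16; }

/* What a fully folded constant would compute: block * 64. */
static uint32_t folded_index(uint32_t block) { return block * (BLOCK_SIZE / 16); }

/* Returns 1 if both versions read the same 64 byte addresses. */
static int same_addresses(uint32_t block)
{
    uint64_t bi = byte_offset(block);
    uint64_t first = elem_index(block);
    for (uint32_t i = 0; i < 64; i++) {
        if (bi != (first + i) * 16)
            return 0;
        bi += 16;
    }
    return 1;
}
```

For example, with prev_slice = 0, lanes = 1, lane = 0 and segment_length = 32, the block number is 31, the byte offset is 31744 and the element index is 1984, so both versions read the same memory. One caveat: for 32-bit unsigned arithmetic, `block * BLOCK_SIZE / 16` and `block * 64` only agree while the first multiplication does not wrap around, so a compiler may not be able to replace one with the other unless it can prove that.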
