john-dev - Re: PHC: Argon2 on GPU

Message-ID: <CAKGDhHWv-u1YE695b4rd2GxEo6OhyF66fe8pMGTBdULrfWORXA@mail.gmail.com>
Date: Thu, 13 Aug 2015 09:52:34 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

2015-08-13 0:28 GMT+02:00 magnum <john.magnum@...hmail.com>:
> On 2015-08-12 23:51, Solar Designer wrote:
>>
>> On Wed, Aug 12, 2015 at 11:45:35PM +0200, magnum wrote:
>>>
>>> On 2015-08-12 18:32, Agnieszka Bielec wrote:
>>>>
>>>> gws:      1024        3447 c/s        3447 rounds/s 297.022ms per
>>>> crypt_all()+
>>>> Local worksize (LWS) 64, global worksize (GWS) 1024
>>>> using different password for benchmarking
>>>> DONE
>>>> Speed for cost 1 (t) of 1, cost 2 (m) of 1500, cost 3 (l) of 1
>>>> Many salts:     2925 c/s real, 307200 c/s virtual
>>>> Only one salt:  2898 c/s real, 307200 c/s virtual
>>>
>>>
>>> The benchmark figures (last two lines) are the correct ones. If you set
>>> up auto-tune correctly, that speed should be similar to the benchmark.
>>> For some formats/situations this is hard to achieve and it's just
>>> cosmetic anyway.
>>
>>
>> magnum, do you have an explanation why the best benchmark result during
>> auto-tuning is usually substantially different from the final benchmark
>> in most of Agnieszka's formats?  I'm fine with eventually dismissing it
>> as "hard to achieve" and "cosmetic anyway", but I'd like to understand
>> the cause first.  Thanks!
>
>
> Generally a mismatch could be caused by using different [cost] test vectors
> in auto-tune than the ones benchmarked, or auto-tune using just one repeated
> plaintext in a format where length matters for speed (eg. RAR), or something
> along those lines.
>
> Another reason would be incorrect setup of autotune for split kernels. For
> example, if auto-tune thinks we're going to call a split kernel 500 times
> but the real run does it 1000 times, we'll see inflated figures from
> autotune.
>
> A third reason (seen in early WPA-PSK) is when crypt_all() does significant
> post-processing on CPU where auto-tune doesn't.

None of these. I printf'ed the plaintexts that are set during the
computation of GWS and modified bench.c to set the same values, and the
result is the same.
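
For reference, here is a minimal sketch of the arithmetic behind the figures quoted above. It is not taken from the John the Ripper source: the function names, the per-call kernel time, and the way the estimate is formed are assumptions made for illustration only. It shows how the auto-tune c/s figure follows from GWS and the crypt_all() time, and how a wrong split-kernel call count (500 assumed, 1000 real, as in magnum's example) would inflate the auto-tune figure.

```python
def candidates_per_second(gws, crypt_all_seconds):
    """Speed implied by hashing `gws` candidates in one crypt_all() call."""
    return gws / crypt_all_seconds


def estimated_crypt_all_seconds(seconds_per_kernel_call, kernel_calls):
    """Hypothetical estimate: total time is per-call time times call count."""
    return seconds_per_kernel_call * kernel_calls


# Figures from the quoted auto-tune line: GWS 1024, 297.022 ms per crypt_all().
autotune_speed = candidates_per_second(1024, 0.297022)
print(int(autotune_speed))  # 3447

# Split-kernel mismatch from magnum's example: auto-tune assumes 500 calls,
# the real run makes 1000. The per-call time below is a made-up value.
seconds_per_call = 0.0003  # hypothetical
assumed = candidates_per_second(
    1024, estimated_crypt_all_seconds(seconds_per_call, 500))
real = candidates_per_second(
    1024, estimated_crypt_all_seconds(seconds_per_call, 1000))
inflation = assumed / real
print(round(inflation, 6))  # 2.0
```

Under this simple model the inflation factor depends only on the ratio of real to assumed call counts, so a mismatch of that kind would show up as a constant factor between the auto-tune and benchmark speeds.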
