Date: Thu, 13 Aug 2015 11:23:05 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Autotune speed figures (was: Re: PHC: Argon2 on GPU)

On 2015-08-13 09:52, Agnieszka Bielec wrote:
> 2015-08-13 0:28 GMT+02:00 magnum <john.magnum@...hmail.com>:
>> On 2015-08-12 23:51, Solar Designer wrote:
>>> magnum, do you have an explanation why the best benchmark result during
>>> auto-tuning is usually substantially different from the final benchmark
>>> in most of Agnieszka's formats? I'm fine with eventually dismissing it
>>> as "hard to achieve" and "cosmetic anyway", but I'd like to understand
>>> the cause first. Thanks!
>>
>>
>> Generally a mismatch could be caused by using different [cost] test vectors
>> in auto-tune than the ones benchmarked, or auto-tune using just one repeated
>> plaintext in a format where length matters for speed (eg. RAR), or something
>> along those lines.
>>
>> Another reason would be incorrect setup of autotune for split kernels. For
>> example, if auto-tune thinks we're going to call a split kernel 500 times
>> but the real run does it 1000 times, we'll see inflated figures from
>> autotune.
>>
>> A third reason (seen in early WPA-PSK) is when crypt_all() does significant
>> post-processing on CPU where auto-tune doesn't.
>
> none of these I printfed plaintexts which are set during computation
> of gws and modified benchc.c to set the same values and result is the
> same
>

Then you might want to dig into it. The autotune code should be easy to
follow. Try to establish exactly what it comes up with and how it ends
up with the figures it prints for your format.

magnum
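
To illustrate the split-kernel case quoted above, here is a minimal sketch of the arithmetic, using the 500 vs. 1000 call counts from that example. The function name, the GWS value and the per-call kernel time are made-up assumptions for illustration only; none of them are taken from the actual autotune code.

```python
def estimated_speed(candidates, kernel_time_s, kernel_calls):
    """Candidates per second when one batch needs `kernel_calls` split-kernel calls."""
    return candidates / (kernel_time_s * kernel_calls)

# Hypothetical values, not measured on any real device or format.
gws = 65536               # candidates processed per crypt_all() batch
loop_kernel_time = 0.002  # seconds per single split-kernel call

# Auto-tune believes the split kernel runs 500 times per batch...
autotune_speed = estimated_speed(gws, loop_kernel_time, 500)
# ...but the real run calls it 1000 times per batch.
real_speed = estimated_speed(gws, loop_kernel_time, 1000)

# Ratio between the figure auto-tune would print and the real benchmark.
inflation = autotune_speed / real_speed
print(f"autotune: {autotune_speed:.0f} c/s, real: {real_speed:.0f} c/s, "
      f"inflation: {inflation:.1f}x")
```

With these assumptions the auto-tune figure comes out at twice the real one, simply because the estimate divides by half as many kernel calls. Comparing the call count auto-tune assumes against the count crypt_all() actually performs is one way to rule this cause in or out.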