Date: Wed, 12 Aug 2015 15:58:57 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on CPU

Hi Agnieszka,

On Wed, Aug 12, 2015 at 01:51:08PM +0200, Agnieszka Bielec wrote:
> 2015-08-06 16:02 GMT+02:00 Solar Designer <solar@...nwall.com>:
> > On Sun, Aug 02, 2015 at 10:46:00PM +0200, Agnieszka Bielec wrote:
> >> OPT
> >>
> >> none@...e ~/Desktop/rr/run $ ./john --test --format=argon2i
> >> Will run 8 OpenMP threads
> >> Benchmarking: argon2i [AVX]... (8xOMP)
> >> memory per hash : 100.00 kB
> >> using different password for benchmarking
> >> DONE
> >> Speed for cost 1 (t) of 3, cost 2 (m) of 100
> >> Raw:    24064 c/s real, 3019 c/s virtual
> >>
> >> none@...e ~/Desktop/rr/run $ ./john --test --format=argon2d
> >> Will run 8 OpenMP threads
> >> Benchmarking: argon2d [AVX]... (8xOMP)
> >> memory per hash : 100.00 kB
> >> using different password for benchmarking
> >> DONE
> >> Speed for cost 1 (t) of 3, cost 2 (m) of 100
> >> Raw:    27008 c/s real, 3418 c/s virtual
> >
> > Nice speeds for presumably SIMD-less code
>
> previously SIMD was 3x faster and this was suspicious for me

I'm not sure I understand what you're trying to say correctly.  Are you
trying to say the 3x difference was suspicious to you, and the much
smaller difference you've since obtained (how?) is not?  If so, I
disagree with you: the small difference is more suspicious, because
these benchmarks are at a small memory size (100 KB), so should fit in
L2 cache, and performance should be dominated by that of the BLAKE2
code.  There's more than sufficient parallelism in Argon2 to fully
exploit SIMD.

> > but please note that both of
> > your benchmarks above (REF and OPT) say AVX.  Are they lying?
>
> I changed only
>
> #ifdef __SSE2__
>     ARGON2i_SSE
> #else
>     ARGON2i
> #endif
>         (out, outlen, in, inlen, salt, saltlen, t_cost, m_cost, lanes,
> memory->aligned);
>
> to
>
> #ifdef __SSE2__
>     ARGON2i
> #else
>     ARGON2i
> #endif
>         (out, outlen, in, inlen, salt, saltlen, t_cost, m_cost, lanes,
> memory->aligned);
>
> and some indef's / undefs in another files

I asked you one thing, you answered another.  I can't make sense of
this.  Once again: in the benchmarks you posted, all comments say "AVX".
Are they wrong, and in what way?

As to your ARGON2i_SSE to ARGON2i change inside #ifdef __SSE2__ above,
is this your answer to my "how?" question above (on how you obtained the
nearly-SIMD performance of presumably non-SIMD code)?  That's weird if
so.

> >> but I was testing these no-sse versions by modyfiyng my code, don't
> >> know if I can just turn-off simd (?), so I can't be sure of these
> >> results although I know that structure of REF is different than
> >> OPT-SSE one(maybe more) function was called a different number of time
> >
> > I'm sorry, but I find your wording above confusing.  So let me try to
> > ask a clarifying question:
> >
> > Are you reviewing the generated assembly code?  It's trivial to see if
> > the code is using SIMD or not.
> >
> > And while we're at it:
> >
> > How are you obtaining the assembly code for review?  Do you replace
> > gcc's "-c" option with "-S"?  Or do you use "objdump -d" on the .o file?
>
> I revieved using objdump and I use only objdump
> files argon2d_sse_plug.o argon2i_sse_plug.o blake2b_plug.o are empty
> argon2d_plug.o argon2i_plug.o blake2b-ref_plug.o doesn't contain simd code
> I hope I checked all necessary files

OK, this does suggest you've checked a non-SIMD build.  Is this the one
you reported as "OPT" above?  If so, should you remove the "AVX" comment
from it?

In general, a common theme in your benchmark postings to john-dev is
comments inconsistent with what's actually benchmarked.
I'd appreciate it if you spend the extra minute each time you make code
changes to set the comments printed by the code to match the actual code.

Thanks,

Alexander
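One way to keep the printed comment from drifting away from the code being benchmarked is to derive both from a single preprocessor condition. The following is only a minimal sketch of that idea, not the actual John the Ripper or Argon2 code: FORCE_SCALAR, IMPL_IS_SIMD, IMPL_LABEL, hash_simd(), hash_scalar() and the other function names are hypothetical and introduced purely for illustration. Only __SSE2__ is a real compiler-defined macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical build switch: compile with -DFORCE_SCALAR to benchmark the
 * non-SIMD path even when the compiler targets SSE2. */
#if defined(__SSE2__) && !defined(FORCE_SCALAR)
#define IMPL_IS_SIMD 1
#define IMPL_LABEL "SSE2"
#else
#define IMPL_IS_SIMD 0
#define IMPL_LABEL "scalar"
#endif

/* Stand-ins for the two code paths (assumptions, not the real Argon2
 * functions).  Each returns a marker so the chosen path is observable. */
int hash_simd(void)   { return 1; }
int hash_scalar(void) { return 0; }

/* The dispatch uses the same macro that sets the label, so changing one
 * without the other is not possible. */
int hash_dispatch(void)
{
    return IMPL_IS_SIMD ? hash_simd() : hash_scalar();
}

/* Label to print in the benchmark banner. */
const char *impl_label(void)
{
    return IMPL_LABEL;
}

/* Print a banner in the style of the benchmark output quoted above. */
void print_banner(const char *format_name)
{
    printf("Benchmarking: %s [%s]...\n", format_name, impl_label());
}

/* Sanity check: the label and the dispatched path agree. */
void check_label_matches_code(void)
{
    assert(hash_dispatch() == IMPL_IS_SIMD);
    assert(strcmp(impl_label(), IMPL_IS_SIMD ? "SSE2" : "scalar") == 0);
}
```

With this arrangement, swapping the SIMD call for the scalar one means flipping the one switch, and the banner follows automatically. Reviewing the "objdump -d" output remains the way to confirm what the compiler actually emitted.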