Date: Wed, 19 Aug 2015 19:39:09 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

Agnieszka,

As has just been mentioned on the PHC list, you need to try exploiting the parallelism inside ComputeBlock. There are two groups of 8 BLAKE2 rounds. In each of the groups, the 8 rounds may be computed in parallel. When your kernel is working on ulong2, I think it won't fully exploit this parallelism, except that the parallelism may allow for better pipelining within those ulong2 lanes (not stalling further instructions, since their input data is separate and thus is readily available).

I think you may try working on ulong16 or ulong8 instead. I expect ulong8 to match current GPU hardware best, but OTOH ulong16 makes more parallelism apparent to the OpenCL compiler and allocates it to one work-item. So please try both and see which works best.

With this, you'd launch groups of 8 or 4 BLAKE2 rounds on those wider vectors, and then between the two groups of 8 in ComputeBlock you'd need to shuffle vector elements (moving them between two vectors of ulong8, if you use that type) instead of shuffling state elements like you do now (and like the original Argon2 code did).

The expectation is that a single kernel invocation will then make use of more SIMD width (2x512- or 512-bit instead of the current 128-bit), yet only the same amount of local and private memory as it does now. So you'd pack as many of these kernels per GPU as you do now, but they would run faster (up to 8x faster), since they'd process 8 or 4 BLAKE2 rounds in parallel rather than sequentially.

Of course, once you've sped this up, other parts of the code may become the new bottlenecks. In particular, the modulo operation may become even more important to optimize as well. You can, and should, quickly test whether or not it is a bottleneck for a given kernel on a given GPU by replacing it with that wrap() function I posted.

Alexander
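[For readers outside the thread: the two-group structure Alexander describes can be sketched in plain C, following the Argon2 reference permutation (a BLAKE2b round with the multiplication-augmented G function). This is only an illustration of why the 8 rounds per group are independent, not the proposed OpenCL kernel — in the vectorized version, round r of a group would live in lane r of sixteen ulong8 vectors, and the row-to-column shuffle between the two loops below becomes the inter-vector element shuffle he mentions.]

```c
/* Sketch of the parallelism inside Argon2's ComputeBlock, in plain C.
 * The 1 KiB block is 128 64-bit words.  Group 1 applies a BLAKE2b-style
 * round to each of 8 rows of 16 words; group 2 applies it to 8 column
 * slices.  Within each group the 8 rounds touch disjoint words, so they
 * can run in parallel (e.g., one round per vector lane). */
#include <stdint.h>
#include <string.h>
#include <assert.h>

static uint64_t rotr64(uint64_t x, unsigned n) {
    return (x >> n) | (x << (64 - n));
}

/* Argon2's modified BLAKE2b addition: x + y + 2 * lo32(x) * lo32(y). */
static uint64_t fBlaMka(uint64_t x, uint64_t y) {
    const uint64_t m = 0xFFFFFFFFULL;
    return x + y + 2 * ((x & m) * (y & m));
}

static void G(uint64_t *a, uint64_t *b, uint64_t *c, uint64_t *d) {
    *a = fBlaMka(*a, *b); *d = rotr64(*d ^ *a, 32);
    *c = fBlaMka(*c, *d); *b = rotr64(*b ^ *c, 24);
    *a = fBlaMka(*a, *b); *d = rotr64(*d ^ *a, 16);
    *c = fBlaMka(*c, *d); *b = rotr64(*b ^ *c, 63);
}

/* One BLAKE2 round on the 16 words of v[] selected by idx[]. */
static void blake2_round(uint64_t *v, const int idx[16]) {
    static const int g[8][4] = {
        {0,4,8,12}, {1,5,9,13}, {2,6,10,14}, {3,7,11,15},
        {0,5,10,15}, {1,6,11,12}, {2,7,8,13}, {3,4,9,14}};
    for (int i = 0; i < 8; i++)
        G(&v[idx[g[i][0]]], &v[idx[g[i][1]]],
          &v[idx[g[i][2]]], &v[idx[g[i][3]]]);
}

/* The permutation at the heart of ComputeBlock: two groups of 8
 * independent rounds, with a row-to-column shuffle between them. */
void compute_block_permutation(uint64_t v[128]) {
    int idx[16];
    for (int r = 0; r < 8; r++) {          /* group 1: 8 rows, independent */
        for (int j = 0; j < 16; j++) idx[j] = 16 * r + j;
        blake2_round(v, idx);
    }
    for (int c = 0; c < 8; c++) {          /* group 2: 8 column slices */
        for (int j = 0; j < 8; j++) {
            idx[2*j]     = 2*c + 16*j;     /* pairs of words per row */
            idx[2*j + 1] = 2*c + 16*j + 1;
        }
        blake2_round(v, idx);
    }
}
```

Because the iterations of each `for` loop share no words of `v`, an OpenCL compiler given the ulong8/ulong16 formulation can schedule all 8 rounds of a group onto one wide SIMD instruction stream instead of running them back to back.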
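[The wrap() function referred to above was posted earlier in the thread and is not reproduced here. A common form of this optimization — assumed here, not Alexander's exact code — replaces `x % n` with a conditional subtraction, which is valid whenever `x` is known to be below `2*n` (as with an index that was just incremented or had a value less than `n` added to it):]

```c
#include <stdint.h>

/* Hypothetical reconstruction of the modulo replacement: valid only
 * for 0 <= x < 2*n.  Integer division/modulo is expensive on GPUs,
 * while this typically compiles to a single compare-and-select. */
static inline uint32_t wrap(uint32_t x, uint32_t n) {
    return x >= n ? x - n : x;
}
```

Swapping this in for the `%` operator (where the precondition holds) and measuring the speed difference is a quick way to tell whether the modulo is a real bottleneck for a given kernel on a given GPU.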