Date: Mon, 31 Aug 2015 11:08:18 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

On Sun, Aug 30, 2015 at 01:44:32AM +0200, Agnieszka Bielec wrote:
> 2015-08-29 8:48 GMT+02:00 Solar Designer <solar@...nwall.com>:
> > As to loop unrolling, there's "#pragma unroll N", and when you specify
> > N=1 so "#pragma unroll 1" I think it prevents unrolling.  As an
> > experiment, I tried adding "#pragma unroll 1" before all loops in
> > argon2d_kernel.cl, and the PTX instruction count reduced - but not a
> > lot.
> 
> Can I get this code?

Attached, although this is just an experiment.  I think at least the
loops in ComputeBlock_pgg() actually need to stay unrolled, rather than
being patched as I do here.

I used this experiment to see how much we can reduce the instruction
count.  The conclusion is that we primarily need to look elsewhere,
since the reduction from ~100k to ~80k is just not good enough anyway.

The change to rotr64() that just happened to get into this patch should
get in, though.

> > We need to figure out why it doesn't get lower.  ~80k is still a lot.
> > Are there many inlined functions and unrolled loops in the .h files?
> 
> there are also blake2 files

Yes.

You need to find out how we can reduce the kernel size more
substantially.  If undesirable function inlining can't be prevented,
this may be a reason to replace multiple references to a function
with a loop containing a single reference to it.  For example:

func(1);
func(2);

may be replaced with:

#pragma unroll 1
for (i = 1; i <= 2; i++)
	func(i);

Of course, in real code things are usually trickier than that, but a
similar approach may often be applied.
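
As a rough sketch of the "trickier" case, where the calls differ in more
than a loop counter, the varying arguments can be moved into a small table
so that the loop body still holds a single reference to the function.
Everything here is hypothetical and for illustration only: mix(), its
constants and rotation counts are not taken from the Argon2 or BLAKE2 code.
This is plain C so it can be checked on the host; in an OpenCL kernel the
table would presumably live in __constant memory and the loop would be
preceded by "#pragma unroll 1".

```c
#include <stdint.h>

/* Hypothetical stand-in for a large helper that the compiler inlines. */
static uint64_t mix(uint64_t state, uint64_t constant, unsigned int rot)
{
	state += constant;
	/* rot is assumed to be in 1..63, so neither shift count reaches 64 */
	return (state >> rot) | (state << (64 - rot));
}

/* Original shape: three call sites, hence up to three inlined copies. */
uint64_t mix_unrolled(uint64_t s)
{
	s = mix(s, 0x0123456789abcdefULL, 7);
	s = mix(s, 0xfedcba9876543210ULL, 13);
	s = mix(s, 0x0f1e2d3c4b5a6978ULL, 29);
	return s;
}

/* Rolled shape: one call site, with the per-call arguments in a table. */
uint64_t mix_rolled(uint64_t s)
{
	static const struct {
		uint64_t constant;
		unsigned int rot;
	} args[] = {
		{ 0x0123456789abcdefULL, 7 },
		{ 0xfedcba9876543210ULL, 13 },
		{ 0x0f1e2d3c4b5a6978ULL, 29 }
	};
	unsigned int i;

	/* In a kernel, "#pragma unroll 1" would go here to keep this rolled. */
	for (i = 0; i < sizeof(args) / sizeof(args[0]); i++)
		s = mix(s, args[i].constant, args[i].rot);
	return s;
}
```

Both functions compute the same result for any input.  The rolled one
trades a table lookup and loop overhead per call for a smaller code
footprint, which is why it only makes sense outside the hot path.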

All of this is for code that is not particularly performance-critical,
so that we'd have more cache available for the critical code.

Alexander

View attachment "john-opencl-argon2d-roll.diff" of type "text/plain" (8052 bytes)