Date: Sun, 6 Sep 2015 16:52:44 +0300
From: Solar Designer
Subject: Re: md5crypt-opencl

On Fri, Sep 04, 2015 at 10:43:54AM +0300, Solar Designer wrote:
> Another guess was that byte-sized accesses were causing the array to be
> placed in global memory, due to possible unavailability of such access
> modes for VGPRs (I don't recall whether this is the case or not).
> However, I've also since ruled this out (at least as the only cause), by
> changing the kernel such that there was no longer a single byte-sized
> access left in the generated ISA code.

Despite the above, now that I am playing with a heavily cut-down (and thus
non-working) kernel, I got it to a point where it uses scratch memory with:

	uint * string;
	for (i = 0; i < 16; i++)
		ctx->buffer[i] = 0;
	for (i = 3; i < 64; i += 4)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;

and implements the last of these loops with stores/loads, but it does not
do so with that loop changed to:

	for (i = 0; i < 64; i++)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;

and implements it with shifts and masks.
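
To make it clear what the two loop variants compute, here is a minimal
host-side sketch in plain C.  The function names and the use of uint32_t
arrays are my assumptions for illustration, and this says nothing about
how a GPU compiler would allocate registers for the same code.  Since the
extracted byte is ORed in without being shifted back, the first variant
leaves byte 3 of each source word in the low byte, while the second leaves
the OR of all four bytes there.

```c
#include <stdint.h>

/* Hypothetical model of the first variant: i = 3, 7, 11, ..., 63 */
static void fill_every_fourth(uint32_t buffer[16], const uint32_t string[16])
{
	int i;

	for (i = 0; i < 16; i++)
		buffer[i] = 0;
	for (i = 3; i < 64; i += 4)
		buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;
}

/* Hypothetical model of the second variant: i = 0, 1, 2, ..., 63 */
static void fill_every_byte(uint32_t buffer[16], const uint32_t string[16])
{
	int i;

	for (i = 0; i < 16; i++)
		buffer[i] = 0;
	for (i = 0; i < 64; i++)
		buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;
}
```

With a source word of 0x11223344, the first function stores 0x11 and the
second stores 0x77 (that is, 0x11 | 0x22 | 0x33 | 0x44).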

Testing on a separate, even more heavily cut-down kernel, I determined
that it is in fact possible to have an array of almost 1 KB in VGPRs,
with no uses of scratch memory.  Problems arise when we do anything that
looks like byte-sized accesses, even when those are written with shifts
and masks in the source.
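
For reference, by byte-sized accesses written with shifts and masks I mean
code roughly like the sketch below.  The helper names get_byte and put_byte
are hypothetical and are not taken from the kernel discussed here; this is
plain C operating on an array of 32-bit words, shown only to illustrate the
access pattern, not to suggest which form a given compiler will keep in
VGPRs.

```c
#include <stdint.h>

/* Hypothetical helper: read byte number 'index' from an array of words */
static uint32_t get_byte(const uint32_t *buf, unsigned int index)
{
	return (buf[index >> 2] >> ((index & 3) << 3)) & 0xff;
}

/* Hypothetical helper: replace byte number 'index' with the low 8 bits of val */
static void put_byte(uint32_t *buf, unsigned int index, uint32_t val)
{
	unsigned int shift = (index & 3) << 3;

	buf[index >> 2] = (buf[index >> 2] & ~((uint32_t)0xff << shift)) |
	    ((val & 0xff) << shift);
}
```

No byte-typed pointer appears anywhere in this code, yet each call still
touches just one byte of one word.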

Perhaps in our full kernel there are multiple reasons why the
compiler prefers to use scratch memory.

