Date: Mon, 25 Jun 2012 18:11:48 -0700
From: Bit Weasil <bitweasil@...il.com>
To: magnum <john.magnum@...hmail.com>
Cc: john-dev@...ts.openwall.com
Subject: Re: Re: OpenCL kernel max running time vs. "ASIC hang"

>> I simply store the intermediate values in the GPU global memory.
>> The access (if done sanely) is coalesced, and is roughly speaking a
>> "best case" memory access pattern for both the load and the store.
>> I'm using a high resolution timer class to dynamically adjust the
>> work done per kernel invocation.  If I'm below 90% or above 110% of
>> my target time, I adjust the steps per invocation for the next call.
>> It seems to work nicely, and also properly handles conditions like an
>> overheating GPU that throttles, or someone gaming in the background.
>>
>
> You make it sound very easy :)
>

I try.  I started my kernels on CUDA, on a GPU with a display attached, so
I had to do this.  Once you design it into the kernels, it's not that bad.
I reuse the same timing code for almost everything.
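
A rough host-side sketch of that timing loop, in case it helps. This is not the Cryptohaze code: `StepTuner` and `timed_call` are hypothetical names, and the proportional rescaling is an assumption (the description above only says the step count is adjusted when the measured time leaves the 90%-110% window).

```python
import time


class StepTuner:
    """Hypothetical helper: keeps each kernel invocation near a target duration."""

    def __init__(self, target_ms, steps=1000, min_steps=1, max_steps=1 << 24):
        self.target_ms = target_ms
        self.steps = steps
        self.min_steps = min_steps
        self.max_steps = max_steps

    def update(self, elapsed_ms):
        """Rescale the step count if the last call missed the 90%-110% window."""
        if elapsed_ms <= 0:
            return self.steps  # unusable measurement, keep the current value
        ratio = elapsed_ms / self.target_ms
        if ratio < 0.9 or ratio > 1.1:
            # Assumption: run time is roughly linear in the step count.
            scaled = int(self.steps / ratio)
            self.steps = max(self.min_steps, min(self.max_steps, scaled))
        return self.steps


def timed_call(launch, steps):
    """Run launch(steps) and return the elapsed wall time in milliseconds.

    launch must block until the device work has finished, otherwise only
    the enqueue time is measured.
    """
    start = time.perf_counter()
    launch(steps)
    return (time.perf_counter() - start) * 1000.0
```

Each pass through the host loop would then do something like `tuner.update(timed_call(launch, tuner.steps))`, so a throttling GPU or a competing workload simply shows up as a longer measured time and a smaller step count on the next call.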


>
>> It shouldn't be difficult to take a single execution kernel and break
>> it into multiple steps.  If you would like a starting point, the
>> Cryptohaze tools have this done for all the GPU kernels - feel free
>> to take a look around.
>>
>
> Thanks, I will do that!
>
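
For illustration, here is a minimal CPU-only sketch of splitting one long computation into several short invocations, with the intermediate values kept in a buffer between calls. The iterated SHA-256 workload, the function names, and the plain Python list standing in for GPU global memory are all assumptions chosen to keep the example runnable; they are not taken from the Cryptohaze kernels.

```python
import hashlib


def iterated_hash(data, iterations):
    """Reference: the whole computation in a single pass."""
    for _ in range(iterations):
        data = hashlib.sha256(data).digest()
    return data


def run_chunk(states, steps):
    """Stand-in for one kernel invocation.

    Each work item loads its intermediate value, advances it by `steps`
    iterations, and stores it back to the same slot.
    """
    for i in range(len(states)):
        value = states[i]                  # load intermediate value
        for _ in range(steps):
            value = hashlib.sha256(value).digest()
        states[i] = value                  # store intermediate value


def iterate_in_chunks(inputs, total_iters, steps_per_call):
    """Host loop: repeat short invocations until all iterations are done."""
    states = list(inputs)                  # stands in for a global memory buffer
    done = 0
    while done < total_iters:
        n = min(steps_per_call, total_iters - done)
        run_chunk(states, n)
        done += n
    return states
```

Because every work item reads and writes only its own slot, the result is the same as the single-pass version regardless of how the iterations are divided, so `steps_per_call` can change from one invocation to the next.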

Feel free!  I'll definitely dig through the OpenCL kernels & see whether
your implementations are faster than mine. :)


>
> magnum