Date: Mon, 25 Jun 2012 18:11:48 -0700
From: Bit Weasil <>
To: magnum <>
Subject: Re: Re: OpenCL kernel max running time vs. "ASIC hang"

>> I simply store the intermediate values in the GPU global memory.
>> The access (if done sanely) is coalesced, and is roughly speaking a
>> "best case" memory access pattern for both the load and the store.
>> I'm using a high resolution timer class to dynamically adjust the
>> work done per kernel invocation.  If I'm below 90% or above 110% of
>> my target time, I adjust the steps per invocation for the next call.
>> It seems to work nicely, and also properly handles conditions like an
>> overheating GPU that throttles, or someone gaming in the background.
> You make it sound very easy :)

I try.  I started writing my kernels on CUDA, on a GPU that was also
driving a display, so I had to do this.  Once you design it into the
kernels, it's not that bad.  I reuse the same timing code for almost
everything.
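
For anyone who wants the idea in code form, here is a minimal host-side
sketch of the adjustment loop described above.  It is not the Cryptohaze
code.  The names adjust_steps, run_in_chunks and run_kernel are
hypothetical, and the target time, initial step count and clamp limits are
placeholder assumptions.  Only the 90%/110% band comes from the description
above.  run_kernel stands in for whatever launches the kernel for a given
number of steps and blocks until it finishes, with the intermediate state
left in GPU global memory between calls.

```python
import time


def adjust_steps(steps, elapsed, target, low=0.90, high=1.10,
                 min_steps=1, max_steps=1 << 20):
    """Return the step count to use for the next kernel invocation.

    The count is left alone while the measured time is within
    [low, high] of the target; otherwise it is rescaled in proportion
    to target / elapsed and clamped to [min_steps, max_steps].
    """
    if elapsed <= 0:
        # Timer resolution too coarse to measure this call; keep the count.
        return steps
    ratio = elapsed / target
    if low <= ratio <= high:
        return steps
    scaled = round(steps * target / elapsed)
    return max(min_steps, min(max_steps, scaled))


def run_in_chunks(total_steps, run_kernel, target_seconds=0.05,
                  initial_steps=64):
    """Run total_steps of work as a series of short kernel invocations.

    run_kernel(offset, n) is a hypothetical callable that performs steps
    offset .. offset + n - 1 and returns only once the GPU is done.
    """
    done = 0
    steps = initial_steps
    while done < total_steps:
        n = min(steps, total_steps - done)
        start = time.perf_counter()
        run_kernel(done, n)
        elapsed = time.perf_counter() - start
        done += n
        if n == steps:
            # Only adapt on full-size chunks; a short final chunk would
            # skew the estimate.
            steps = adjust_steps(steps, elapsed, target_seconds)
    return done
```

Because the step count is re-derived from each measured call, a GPU that
throttles or gets busy with other work simply reports longer times and the
next invocation shrinks to match.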

>> It shouldn't be difficult to take a single execution kernel and break
>> it into multiple steps.  If you would like a starting point, the
>> Cryptohaze tools have this done for all the GPU kernels - feel free
>> to take a look around.
> Thanks, I will do that!

Feel free!  I'll definitely dig through the OpenCL kernels & see if your
implementations are faster than mine. :)

> magnum
