Date: Mon, 25 Jun 2012 18:11:48 -0700
From: Bit Weasil <bitweasil@...il.com>
To: magnum <john.magnum@...hmail.com>
Cc: john-dev@...ts.openwall.com
Subject: Re: Re: OpenCL kernel max running time vs. "ASIC hang"

>> I simply store the intermediate values in the GPU global memory.
>> The access (if done sanely) is coalesced, and is roughly speaking a
>> "best case" memory access pattern for both the load and the store.
>> I'm using a high resolution timer class to dynamically adjust the
>> work done per kernel invocation. If I'm below 90% or above 110% of
>> my target time, I adjust the steps per invocation for the next call.
>> It seems to work nicely, and also properly handles conditions like an
>> overheating GPU that throttles, or someone gaming in the background.
>
> You make it sound very easy :)

I try. I started my kernels on CUDA, with a display - so I had to do this.
Once you design it into the kernels, it's not that bad. I reuse the same
timing code for almost everything.

>> It shouldn't be difficult to take a single execution kernel and break
>> it into multiple steps. If you would like a starting point, the
>> Cryptohaze tools have this done for all the GPU kernels - feel free
>> to take a look around.
>
> Thanks, I will do that!

Feel free! I'll definitely dig through the OpenCL kernels & see if your
algorithms are faster implemented than mine. :)

> magnum
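
For anyone reading along, the adjustment described above (retune the steps per invocation whenever the measured time falls below 90% or above 110% of the target) could be sketched roughly as follows. This is only an illustration in Python, not the Cryptohaze code. The names `next_steps`, `run_chunked` and `launch`, the 50 ms default target, and the step limits are all made up for the example. The proportional rescaling is also an assumption, since the message does not say how the step count gets changed.

```python
import time


def next_steps(steps, elapsed, target, low=0.90, high=1.10, max_steps=1 << 20):
    """Return the step count to use for the next kernel invocation.

    The count is kept while `elapsed` lies within [low, high] * target.
    Outside that band it is rescaled proportionally (an assumed policy).
    """
    if elapsed <= 0:
        return steps  # no usable measurement, keep the current setting
    ratio = elapsed / target
    if low <= ratio <= high:
        return steps
    scaled = int(steps / ratio)
    return max(1, min(max_steps, scaled))


def run_chunked(total_steps, launch, target=0.050, initial_steps=64,
                clock=time.perf_counter):
    """Run `total_steps` of work as a series of timed invocations.

    `launch(start, count)` is a hypothetical callback that runs `count`
    steps beginning at step `start` and blocks until they finish. It
    stands in for a kernel launch plus synchronisation, with the
    intermediate state left in GPU global memory between calls.
    Returns the list of chunk sizes that were used.
    """
    done = 0
    steps = initial_steps
    history = []
    while done < total_steps:
        count = min(steps, total_steps - done)
        t0 = clock()
        launch(done, count)
        elapsed = clock() - t0
        history.append(count)
        done += count
        # Retune only from full-sized chunks; a short final chunk
        # would give a misleading measurement.
        if count == steps:
            steps = next_steps(steps, elapsed, target)
    return history
```

Because every invocation is timed, a GPU that starts throttling or gets shared with another workload simply produces longer measurements, and the next call gets fewer steps. In real host code `launch` would have to wait for the kernel to complete (or use device-side event timing), otherwise the timer only measures the enqueue.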