Date: Tue, 03 Apr 2012 23:45:58 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Pragma unroll [was: fast hashes on GPU]

On 04/02/2012 10:08 PM, magnum wrote:
>> Lukas, or anyone, could you explain how to use pragma unroll in OpenCL?
>> I don't seem to get any impact from it (whereas manual unrolling
>> provides a significant speedup).
>
> I think I got this straight now. For nvidia to honor #pragma unroll
> directives in OpenCL, you need to add this once (near top) too:
>
> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
>
> The crux is that this will make eg. the AMD compiler to not only warn
> (like Intel), but crash and burn. Wat! But here is the solution
> according to nvcc docs:
>
> #ifdef __CUDACC__
> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
> #endif

It turned out this does not work at all. That macro is only defined by
the CUDA compiler, not by the OpenCL compiler. I'm not sure why anyone
would need to test for nvidia when compiling CUDA...

Unfortunately I haven't found any other macro that can be used. We'll
probably need a way to tell AMD from nvidia, at least. We could pass a
custom -D at build time though.

magnum
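
A rough sketch of the custom -D idea, in host-side C. Everything named here is an assumption for illustration: the helper vendor_define(), the macros DEVICE_IS_NVIDIA and DEVICE_IS_AMD, and the vendor substrings being matched (they reflect what CL_DEVICE_VENDOR typically reports, and may differ between drivers). The host would query the vendor string with clGetDeviceInfo(..., CL_DEVICE_VENDOR, ...) and pass the returned flag in the options argument of clBuildProgram(); those calls are left out so the snippet compiles without OpenCL headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: map a CL_DEVICE_VENDOR string to a -D build option.
 * The macro names are made up for this sketch, not existing JtR macros. */
const char *vendor_define(const char *vendor)
{
    assert(vendor != NULL);

    /* Assumed vendor substrings; verify against the actual drivers. */
    if (strstr(vendor, "NVIDIA"))
        return "-DDEVICE_IS_NVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") || strstr(vendor, "AMD"))
        return "-DDEVICE_IS_AMD";

    /* Unknown vendor: define nothing, so the extension stays disabled. */
    return "";
}

/* Kernel-side guard, shown as the text that would sit near the top of
 * the .cl file. Only nvidia builds ever see the extension pragma. */
const char *kernel_prologue =
    "#ifdef DEVICE_IS_NVIDIA\n"
    "#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable\n"
    "#endif\n";
```

With this, vendor_define("NVIDIA Corporation") yields the nvidia flag, while an unrecognized vendor yields an empty option string, so the AMD and Intel compilers never encounter the nvidia-specific pragma.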