Date: Fri, 20 Apr 2012 17:00:50 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Pragma unroll

On 04/03/2012 11:45 PM, magnum wrote:
>> I think I got this straight now. For nvidia to honor #pragma unroll
>> directives in OpenCL, you need to add this once (near top) too:
>>
>> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
>>
>> The crux is that this will make eg. the AMD compiler to not only warn
>> (like Intel), but crash and burn. Wat! But here is the solution
>> according to nvcc docs:
>>
>> #ifdef __CUDACC__
>> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
>> #endif
>
> It turned out this does not work at all. That macro is only defined in
> the CUDA compiler, not OpenCL. I'm not sure why anyone would need to
> test for nvidia when compiling CUDA...
>
> Unfortunately I haven't found any other macro that can be used.

If anyone cares, here is the correct one:

#ifdef cl_nv_pragma_unroll
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#endif

In fact the spec guarantees this:

"Every extension which affects the OpenCL language semantics, syntax or
adds built-in functions to the language must create a preprocessor
#define that matches the extension name string. This #define would be
available in the language if and only if the extension is supported on a
given implementation."

magnum
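For illustration, here is a minimal sketch of how the guard might sit at the top of a source file, followed by a loop that carries the unroll hint. The helper name sum8 and the unroll factor 8 are made up for this example and are not from the JtR tree. The function is deliberately plain (no __kernel or address-space qualifiers) so the same text also builds as ordinary C, where cl_nv_pragma_unroll is undefined and the extension line is skipped. It assumes that a compiler which does not know #pragma unroll ignores it, possibly with a warning.

```c
/* Enable the nvidia extension only where the implementation defines the
 * matching macro; every other compiler never sees this pragma. */
#ifdef cl_nv_pragma_unroll
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#endif

/* Hypothetical helper: sums eight 32-bit words.
 * The unroll hint is left unguarded on purpose; compilers that do not
 * recognize it are assumed to ignore it (possibly with a warning). */
unsigned int sum8(const unsigned int *w)
{
	unsigned int acc = 0;

#pragma unroll 8
	for (int i = 0; i < 8; i++)
		acc += w[i];

	return acc;
}
```

The point of testing the extension name itself, rather than a vendor macro like __CUDACC__, is that the check keeps working on any implementation that advertises the extension and costs nothing on those that do not.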