Date: Fri, 20 Apr 2012 17:00:50 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Pragma unroll

On 04/03/2012 11:45 PM, magnum wrote:
>> I think I got this straight now. For nvidia to honor #pragma unroll
>> directives in OpenCL, you need to add this once (near top) too:
>>
>> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
>>
>> The catch is that this makes e.g. the AMD compiler not only warn
>> (like Intel), but crash and burn. Wat! But here is the solution
>> according to the nvcc docs:
>>
>> #ifdef __CUDACC__
>> #pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
>> #endif
> 
> It turned out this does not work at all. That macro is only defined in
> the CUDA compiler, not OpenCL. I'm not sure why anyone would need to
> test for nvidia when compiling CUDA...
> 
> Unfortunately I haven't found any other macro that can be used.

If anyone cares, here is the correct one:

#ifdef cl_nv_pragma_unroll
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#endif

In fact the spec guarantees this: "Every extension which affects the
OpenCL language semantics, syntax or adds built-in functions to the
language must create a preprocessor #define that matches the extension
name string. This #define would be available in the language if and only
if the extension is supported on a given implementation."
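
A minimal sketch of how that guard could be used, in case it helps anyone. The UNROLL_HINT macro and the sum4() helper are hypothetical names made up for this illustration, not anything from the JtR tree, and the _Pragma() form assumes the compiler accepts the C99 operator (plain "#pragma unroll" right before the loop is what we discussed earlier in this thread). Where the extension macro is not defined, the hint expands to nothing, so the same source also builds as ordinary C:

```c
/* Enable the NVIDIA unroll extension only where the compiler advertises it.
 * The macro name matches the extension name string, as the spec requires. */
#ifdef cl_nv_pragma_unroll
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#define UNROLL_HINT _Pragma("unroll")  /* hypothetical helper macro */
#else
#define UNROLL_HINT                     /* no-op on other compilers */
#endif

/* Hypothetical helper: sums four values; the fixed-length loop is a
 * typical candidate for unrolling. */
int sum4(const int *v)
{
    int i, total = 0;

    UNROLL_HINT
    for (i = 0; i < 4; i++)
        total += v[i];

    return total;
}
```

The point of the design is that the extension pragma and the unroll hint sit behind the same test, so compilers that do not define cl_nv_pragma_unroll never see either of them.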

magnum
