Date: Mon, 02 Apr 2012 22:08:41 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU

On 04/01/2012 08:17 PM, magnum wrote:
> On 04/01/2012 07:46 PM, Solar Designer wrote:
>> On Sun, Apr 01, 2012 at 06:21:14PM +0200, magnum wrote:
>>> On 03/31/2012 02:19 PM, Solar Designer wrote:
>>>> ...Oh, I just got it to:
>>>>
>>>> Many salts: 38062K c/s real, 38062K c/s virtual
>>>> Only one salt: 26270K c/s real, 26270K c/s virtual
>>>>
>>>> by simply adding "#pragma unroll 64" before the last loop in
>>>> sha512_block().
>>>
>>> Did you add just that very pragma line,
>>
>> Yes. There were similar lines for nearby loops, but somehow not for
>> that one yet.
>>
>>> or did you also add something
>>> like "#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable" somewhere
>>> as well?
>>
>> No. This was CUDA code, not OpenCL.
>
> Ah, yes.
>
> Lukas, or anyone, could you explain how to use pragma unroll in OpenCL?
> I don't seem to get any impact from it (whereas manual unrolling
> provides a significant speedup).

I think I got this straight now. For nvidia to honor #pragma unroll
directives in OpenCL, you also need to add this once, near the top of
the kernel source:

#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable

The catch is that this line makes e.g. the AMD compiler not merely warn
(as Intel's does), but crash and burn. Wat! But here is the solution
according to the nvcc docs:

#ifdef __CUDACC__
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#endif

BTW, AMD does support pragma unroll without any special extension line,
and the compiler log even confirms the unroll. Apparently Intel does not
support pragma unroll at all.

In my upcoming RAR format, the manual unrolls are not very extreme, so I
will probably keep them. The most important one will never be unrolled
that well by a pragma anyway - it's 10 unrolls and then an unroll factor
of 2.

magnum
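
For illustration, here is a minimal sketch of the two approaches discussed above (a pragma hint versus a manual unroll by 2), written as plain C so it can be built and checked on the host. Everything named in it (UNROLL_HINT, ROUNDS, mix(), rounds_pragma(), rounds_manual2()) is hypothetical and not taken from any JtR kernel. It also guards the extension line with the cl_nv_pragma_unroll macro instead of __CUDACC__; that rests on the assumption that an OpenCL compiler defines a macro named after each extension it supports, which I have not verified on every vendor. Likewise, it assumes the device compiler accepts C99 _Pragma; if not, write "#pragma unroll" directly before the loop.

```c
/* Assumption: a compiler supporting the extension defines a macro of the
   same name, so other vendors' compilers never see this line. */
#ifdef cl_nv_pragma_unroll
#pragma OPENCL EXTENSION cl_nv_pragma_unroll : enable
#endif

/* Expand to an unroll hint only under OpenCL or CUDA. A host C compiler
   sees an empty macro and so has no unknown pragma to complain about. */
#if defined(__OPENCL_VERSION__) || defined(__CUDACC__)
#define UNROLL_HINT _Pragma("unroll")
#else
#define UNROLL_HINT
#endif

#define ROUNDS 64 /* hypothetical round count, must be even */

/* Hypothetical mixing step standing in for one hash round
   (assumes a 32-bit unsigned int). */
static unsigned int mix(unsigned int acc, unsigned int w, unsigned int i)
{
    return ((acc << 5) | (acc >> 27)) + (w ^ i);
}

/* Rolled loop; any unrolling is left to the compiler via the hint. */
static unsigned int rounds_pragma(const unsigned int *w)
{
    unsigned int acc = 0;
    unsigned int i;

    UNROLL_HINT
    for (i = 0; i < ROUNDS; i++)
        acc = mix(acc, w[i], i);
    return acc;
}

/* The same computation, manually unrolled by a factor of 2. */
static unsigned int rounds_manual2(const unsigned int *w)
{
    unsigned int acc = 0;
    unsigned int i;

    for (i = 0; i < ROUNDS; i += 2) {
        acc = mix(acc, w[i], i);
        acc = mix(acc, w[i + 1], i + 1);
    }
    return acc;
}
```

Both functions should return the same value for the same input; only the generated code differs. Whether the hint or the manual version ends up faster depends on the compiler, so it is worth timing both on each target.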