Date: Sun, 1 Apr 2012 21:07:27 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU

I've never used #pragma unroll in OpenCL. According to this:
http://gpgpu.org/2010/03/20/cuda-3-0-toolkit-released

it should be supported on NVIDIA.
For AMD, from what I have googled, people are having trouble with it,
so possibly something is broken in the AMD compiler:
http://devgurus.amd.com/thread/158877
I would suggest using our own unrolling macros, like:
#define U1() something we want to unroll
#define U2() \
U1() U1()
#define U4() \
U2() U2()
and so on.
For powers of 2 it is straightforward; for other numbers you have to
combine several defines to get the unroll count you want. The problem
is that we cannot parametrize a #define with another #define, but we
can redefine U1 when needed (am I right?).
This is not a nice solution, but it works, and it is more readable
than hundreds of unrolled lines.
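
A minimal sketch of the idea in plain C (the same preprocessor rules
apply to OpenCL C). The function names sum_unrolled4 and shift_xor3 and
the loop bodies are made up for illustration only. U2/U3/U4 only name
U1, and U1 is expanded at the point of use, so redefining U1 with
#undef/#define is enough to reuse the same unroll macros for a
different body:

```c
#include <assert.h>
#include <stddef.h>

/* Unroll helpers. They refer to U1 by name only; U1 is looked up when
 * U2/U3/U4 are actually used, so it can be (re)defined later. */
#define U2() U1() U1()
#define U4() U2() U2()
#define U3() U2() U1()   /* non-power-of-two: combine smaller ones */

/* Hypothetical example 1: sum n ints, four per loop iteration.
 * Assumes n is a multiple of 4. */
int sum_unrolled4(const int *data, size_t n)
{
    int sum = 0;
    size_t i = 0;

    assert(n % 4 == 0);
#define U1() sum += data[i++];
    while (i < n) {
        U4()            /* expands to four copies of the U1 body */
    }
#undef U1
    return sum;
}

/* Hypothetical example 2: a different body, unrolled three times.
 * Only U1 is redefined; U2 and U3 are reused unchanged. */
unsigned shift_xor3(unsigned x)
{
#define U1() x = (x << 1) ^ 0x5u;
    U3()
#undef U1
    return x;
}
```

The #undef after each use keeps one body from leaking into the next
place where the unroll macros are used.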

Lukas
