Date: Sat, 24 Mar 2012 23:24:40 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: OpenCL vector tactics

Could someone brave tell me whether, why, and how to use vector types
like uint4 in OpenCL kernels? I understand SSE2 intrinsics, as used in
sse-intrinsics.c in john, but as far as I can tell that approach does
not translate directly to OpenCL, and I don't quite see how to apply
vectors there.

How would I attack it? Would each kernel invocation (work-item) take
four inputs and produce four outputs? If I have an existing kernel that
takes one input and produces one output, should I just convert it to do
four things at a time at every step, just like SSE2 intrinsics? Just
like that? And if so, what would I set as the local work size? The real
size divided by four?
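
To make the question concrete, here is a minimal sketch of the two layouts being compared. It is plain C rather than real OpenCL, and every name in it is hypothetical: u32x4 is a stand-in for uint4, step_scalar() is a toy round function and not anything from john, and the loop over gid plays the role of the work-items the OpenCL runtime would launch. It only shows that a work-item handling four inputs means a quarter as many work-items for the same data. It does not say what local work size to pick, or whether this is faster on any given device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's uint4: four 32-bit lanes. */
typedef struct { uint32_t s[4]; } u32x4;

/* Toy "round" (an assumption for illustration): rotate left by 5, add a constant. */
static uint32_t step_scalar(uint32_t x)
{
    return ((x << 5) | (x >> 27)) + 0x9e3779b9u;
}

/* The same round applied to all four lanes, as a uint4 expression would be. */
static u32x4 step_vec4(u32x4 v)
{
    u32x4 r;
    for (int i = 0; i < 4; i++)
        r.s[i] = step_scalar(v.s[i]);
    return r;
}

/* Scalar layout: one work-item per input. Returns the number of work-items. */
static size_t run_scalar(const uint32_t *in, uint32_t *out, size_t n)
{
    for (size_t gid = 0; gid < n; gid++)
        out[gid] = step_scalar(in[gid]);
    return n;
}

/* Vector layout: each work-item loads, processes, and stores four inputs,
 * so only n / 4 work-items are needed. Returns the number of work-items. */
static size_t run_vec4(const uint32_t *in, uint32_t *out, size_t n)
{
    assert(n % 4 == 0); /* the sketch assumes the input count is padded to a multiple of 4 */
    size_t work_items = n / 4;
    for (size_t gid = 0; gid < work_items; gid++) {
        u32x4 v;
        for (int i = 0; i < 4; i++)
            v.s[i] = in[4 * gid + i];
        v = step_vec4(v);
        for (int i = 0; i < 4; i++)
            out[4 * gid + i] = v.s[i];
    }
    return work_items;
}
```

With eight inputs, run_scalar() uses eight work-items and run_vec4() uses two, and both write the same eight outputs. In a real kernel the per-lane loops would disappear into uint4 arithmetic and a single vector load and store per work-item.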

And what would this accomplish? Do I understand correctly that this
would benefit CPUs and perhaps AMD GPUs, but likely not NVIDIA? Would it
be detrimental on NVIDIA?

magnum
