john-dev - OpenCL vector tactics

Message-ID: <42590d8718c9be0ded34b13ec257ba88@smtp.hushmail.com>
Date: Sat, 24 Mar 2012 23:24:40 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: OpenCL vector tactics

Could someone brave tell me why, how, and whether to use vector types like
uint4 in OpenCL kernels? I understand SSE2 intrinsics, such as those in
sse-intrinsics.c in john, but as far as I can tell they do not translate
well to OpenCL. I don't quite see how to apply vectors there.

How would I attack it? Would each kernel take four inputs and produce four
outputs? If I have an existing kernel that takes one input and produces
one output, should I just convert it to do four things at a time at
every step, just like SSE2 intrinsics? And if so, what would I set as the
local workgroup size? The real size divided by four?
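
To make the question concrete, here is a rough sketch in plain C of what I
imagine the conversion looks like. Everything in it is an assumption for
illustration: uint4_emul stands in for OpenCL's built-in uint4, step_scalar
is a made-up per-item operation, and the run_* loops stand in for the host
enqueueing a 1-D range. In this sketch it is the number of work-items (the
global size) that drops to a quarter; whether the local workgroup size
should change as well is exactly what I'm unsure about.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for OpenCL's built-in uint4 (assumption: four 32-bit lanes). */
typedef struct { uint32_t s[4]; } uint4_emul;

/* Hypothetical per-item operation: rotate left by 7, then add a constant. */
static uint32_t step_scalar(uint32_t x)
{
    return ((x << 7) | (x >> 25)) + 0x9e3779b9u;
}

/* The same operation applied to all four lanes, as uint4 arithmetic would. */
static uint4_emul step_vec4(uint4_emul v)
{
    uint4_emul r;
    for (int i = 0; i < 4; i++)
        r.s[i] = step_scalar(v.s[i]);
    return r;
}

/* Scalar "kernel": work-item gid handles exactly one input. */
static void kernel_scalar(const uint32_t *in, uint32_t *out, size_t gid)
{
    out[gid] = step_scalar(in[gid]);
}

/* Vector "kernel": work-item gid handles inputs 4*gid .. 4*gid+3. */
static void kernel_vec4(const uint32_t *in, uint32_t *out, size_t gid)
{
    uint4_emul v;
    for (int i = 0; i < 4; i++)      /* roughly vload4(gid, in) */
        v.s[i] = in[4 * gid + i];
    v = step_vec4(v);
    for (int i = 0; i < 4; i++)      /* roughly vstore4(v, gid, out) */
        out[4 * gid + i] = v.s[i];
}

/* Emulated launch with n work-items, one per input. */
void run_scalar(const uint32_t *in, uint32_t *out, size_t n)
{
    for (size_t gid = 0; gid < n; gid++)
        kernel_scalar(in, out, gid);
}

/* Emulated launch with n/4 work-items; assumes n is a multiple of 4. */
void run_vec4(const uint32_t *in, uint32_t *out, size_t n)
{
    for (size_t gid = 0; gid < n / 4; gid++)
        kernel_vec4(in, out, gid);
}
```

Calling run_scalar and run_vec4 on the same n inputs fills the output
buffer identically; the vector version just gets there with n/4 work-items.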

And what would this accomplish? Did I understand correctly that this would
benefit CPUs and perhaps AMD GPUs, but likely not NVIDIA? Would it be
detrimental on NVIDIA?

magnum
