Date: Sun, 25 Mar 2012 00:59:12 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL vector tactics

Hello,

On Sun, Mar 25, 2012 at 12:24 AM, magnum <john.magnum@...hmail.com> wrote:

> Could someone brave tell me why/how/if to use vector types like uint4 in
> OpenCL kernels? I do understand sse2 intrinsics, like sse-intrinsics.c
> in john, but as far as I understand, that does not translate well to
> OpenCL. I don't quite get how to apply vectors to OpenCL.
>

It's pretty easy. Just declare the variables as vector types; all
arithmetic and bitwise operators work on them (that is, they operate
on all their elements at once).
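
For example, a minimal sketch (the kernel name, buffer names, and the
constant are all made up here, just to show the syntax):

```c
/* Hypothetical OpenCL kernel fragment: a uint4 behaves like four uints
 * processed side by side, much like one SSE2 register. */
__kernel void vec_demo(__global const uint4 *in, __global uint4 *out)
{
    uint gid = get_global_id(0);
    uint4 a = in[gid];

    /* Every operator acts on all four elements at once. */
    uint4 b = (a << 3) ^ (a + (uint4)(0x5a827999U));

    /* Built-ins like rotate() accept vector operands too. */
    b = rotate(b, (uint4)(7));

    out[gid] = b;
}
```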


> How would I attack it? Would each kernel do four inputs and produce four
> outputs?


Correct.


> If I have an existing kernel that does one input and ends up
> with one output, should I just convert it to do four things at a time
> for every move it makes, just like sse2 intrinsics? Just like that? And
> if so, what would I set as local workgroup size? The real size divided
> by four?
>

Just like that, yes. Vectors should not have any impact on workgroup size
by themselves.
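
To illustrate the conversion (a sketch with hypothetical names, not
drop-in code; some_round()/some_round4() stand in for whatever the
hash's round function is):

```c
/* Scalar version: one input, one output per work-item. */
__kernel void crypt_scalar(__global const uint *in, __global uint *out)
{
    uint gid = get_global_id(0);
    out[gid] = some_round(in[gid]);   /* some_round() is hypothetical */
}

/* 4-way version: four inputs, four outputs per work-item. The body is
 * unchanged except for the types. On the host you divide the *global*
 * work size by four, since each work-item now handles four candidates;
 * the local workgroup size can stay as it was. */
__kernel void crypt_vec4(__global const uint4 *in, __global uint4 *out)
{
    uint gid = get_global_id(0);
    out[gid] = some_round4(in[gid]);  /* same rounds, uint4 operands */
}
```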



> And what would this accomplish? Did I understand right that this would
> benefit CPUs and perhaps AMD GPUs but likely not nvidia? Would it be
> detrimental for nvidia?
>

It can benefit VLIW-architecture AMD GPUs, though not always the GCN
ones. It may not help, or help only partly, for some algorithms
(especially the ones you refer to as "slow" hashes; you have to be
careful there). NVIDIA generally does not benefit from vectorization
due to the increased GPR usage, although sm_21 GPUs like the GTX 560 Ti
do benefit.

Regards

