Date: Sun, 25 Mar 2012 00:59:12 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL vector tactics

Hello,

On Sun, Mar 25, 2012 at 12:24 AM, magnum <john.magnum@...hmail.com> wrote:

> Could someone brave tell me why/how/if to use vector types like uint4 in
> OpenCL kernels? I do understand sse2 intrinsics, like sse-intrinsics.c
> in john, but as far as I understand, that does not translate well to
> OpenCL. I don't quite get how to apply vectors to OpenCL.
>

It's pretty easy. Just declare the variables as vector types. All arithmetic and bitwise operators work on them (that is, they operate on all of their elements).

> How would I attack it? Would each kernel do four inputs and produce four
> outputs?

Correct.

> If I have an existing kernel that does one input and ends up
> with one output, should I just convert it to do four things at a time
> for every move it makes, just like sse2 intrinsics? Just like that? And
> if so, what would I set as local workgroup size? The real size divided
> by four?
>

Just like that, yes. Vectors should not have any impact on workgroup size by themselves.

> And what would this accomplish? Did I understand right that this would
> benefit CPUs and perhaps AMD GPUs but likely not nvidia? Would it be
> detrimental for nvidia?
>

It might benefit VLIW-architecture AMD GPUs, but not always GCN ones. Some algorithms might not benefit, or only partly (especially those you refer to as "slow" hashes; you have to be careful there). NVIDIA GPUs generally do not benefit from vectorization because of the increased GPR usage; however, sm_21 GPUs like the 560 Ti do benefit.

Regards
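
The "just declare the variables as vector types" advice above can be sketched on the host in plain C. This is only an illustration under stated assumptions, not code from john or from this thread. The typedef uses the GCC/Clang vector extension as a stand-in for OpenCL's built-in uint4. The first MD5 round step was picked here as an arbitrary example of hash arithmetic. The names md5_step_scalar, md5_step_vec and vector_matches_scalar, and the four candidate words, are hypothetical.

```c
#include <stdint.h>

/* Assumption: GCC/Clang vector extension standing in for OpenCL's
   built-in uint4. In an OpenCL kernel this typedef is not needed. */
typedef uint32_t uint4 __attribute__((vector_size(16)));

#define K0 0xd76aa478U  /* first MD5 round constant */
#define S0 7            /* first MD5 rotate amount  */

/* Scalar version: one candidate in, one result out. */
static uint32_t md5_step_scalar(uint32_t a, uint32_t b, uint32_t c,
                                uint32_t d, uint32_t m)
{
    a += ((b & c) | (~b & d)) + m + K0;   /* F(b,c,d) + message word + constant */
    a = (a << S0) | (a >> (32 - S0));     /* rotate left by S0 */
    return a + b;
}

/* Vector version: same operators, each applied to all four lanes,
   so one call processes four candidates. */
static uint4 md5_step_vec(uint4 a, uint4 b, uint4 c, uint4 d, uint4 m)
{
    const uint4 k = { K0, K0, K0, K0 };
    const uint4 l = { S0, S0, S0, S0 };
    const uint4 r = { 32 - S0, 32 - S0, 32 - S0, 32 - S0 };

    a += ((b & c) | (~b & d)) + m + k;
    a = (a << l) | (a >> r);
    return a + b;
}

/* Returns 1 if every lane of the vector result equals the scalar result
   for the same candidate, 0 otherwise. */
int vector_matches_scalar(void)
{
    /* MD5 initial state, broadcast to all four lanes. */
    const uint32_t A = 0x67452301U, B = 0xefcdab89U;
    const uint32_t C = 0x98badcfeU, D = 0x10325476U;
    /* Four hypothetical candidate message words, one per lane. */
    const uint32_t msg[4] = { 0x61616161U, 0x62626262U,
                              0x63636363U, 0x64646464U };

    const uint4 a = { A, A, A, A }, b = { B, B, B, B };
    const uint4 c = { C, C, C, C }, d = { D, D, D, D };
    const uint4 m = { msg[0], msg[1], msg[2], msg[3] };

    const uint4 out = md5_step_vec(a, b, c, d, m);
    for (int i = 0; i < 4; i++)
        if (out[i] != md5_step_scalar(A, B, C, D, msg[i]))
            return 0;
    return 1;
}
```

Calling vector_matches_scalar() from a small main() checks that the four lanes agree with four separate scalar calls. In an actual OpenCL kernel the body of md5_step_vec would look much the same with the built-in uint4 (the rotate could also use the built-in rotate() function). Since each work-item would then handle four candidates, the host would presumably enqueue a quarter as many work-items, while the local workgroup size need not change for that reason alone.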