Date: Wed, 17 Oct 2012 23:19:18 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf-opencl fails self-test on CPU

On Mon, Aug 13, 2012 at 9:41 AM, Solar Designer <solar@...nwall.com> wrote:
>> Build started
>> Kernel <blowfish> was not vectorized
> 
> BTW, is there any way to target future Intel CPUs (those with AVX2)
> with Intel's OpenCL SDK and see if this kernel would be vectorized then?

BTW, that message is very confusing. It really only means that no *auto* vectorization of *scalar* code was performed. Took me a while to figure out. When I add vector types to my formats, that message typically goes from "was successfully" to "was not", yet the speed increases by 1.5x or much more.

On 16 Oct, 2012, at 20:01 , Sayantan Datta <std2048@...il.com> wrote:
> I was looking for OpenCL CPU optimizations targeting SSE but couldn't get a proper answer. So should I try vectorizing the bf kernel for CPU? If I'm targeting SSE, then what should be the vector length?

Just try it. Usually it's incredibly simple to add vectorization. Most of my formats run vectorized on CPU and non-GCN AMD, and scalar on NVIDIA and GCN. It takes just a few #ifdefs.

I always use uint4 or ulong4 (even though those end up being different sizes, 128 and 256 bits respectively). I think once you use e.g. uint4 instead of uint, the auto vectorizer may change that to other vector widths automatically if/when beneficial. That is much less magic than auto vectorization of scalar code.
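
A rough sketch of the "few #ifdefs" pattern, written as plain C so it can be compiled outside an OpenCL runtime. It is not taken from any actual format: the GCC/Clang vector_size extension stands in for OpenCL's built-in uint4, and SCALAR_ONLY, word_t, LANES, mix() and mix_all() are hypothetical names invented for this illustration. In a real kernel the #ifdef would typically key off a macro passed in the build options for the target device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SCALAR_ONLY is a hypothetical switch for targets that prefer scalar code. */
#if defined(__GNUC__) && !defined(SCALAR_ONLY)
/* Four 32-bit lanes; stands in for OpenCL's uint4 (assumption: GCC or Clang). */
typedef uint32_t word_t __attribute__((vector_size(16)));
#else
/* Scalar fallback; stands in for OpenCL's uint. */
typedef uint32_t word_t;
#endif

/* Number of 32-bit values processed per word_t. */
#define LANES (sizeof(word_t) / sizeof(uint32_t))

/* Toy mixing step, written once and compiled for either type. */
static word_t mix(word_t a, word_t b)
{
    return (a + b) ^ b;
}

/* Apply mix() to n values, LANES at a time. n must be a multiple of LANES. */
static void mix_all(uint32_t *out, const uint32_t *a, const uint32_t *b, size_t n)
{
    assert(n % LANES == 0);
    for (size_t i = 0; i < n; i += LANES) {
        word_t va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* load LANES inputs */
        memcpy(&vb, b + i, sizeof vb);
        vr = mix(va, vb);                /* same source line for both builds */
        memcpy(out + i, &vr, sizeof vr); /* store LANES results */
    }
}
```

The point of the pattern is that only the typedef changes between builds; the arithmetic is written once. Building with -DSCALAR_ONLY should give the same results using one lane per step.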

magnum