Date: Tue, 22 Oct 2013 14:10:06 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL vectorizing how-to.

On 2013-10-22 04:08, Lukas Odzioba wrote:
> 2013/10/21 magnum <john.magnum@...hmail.com>:
>> One thing I can't understand is why pre-vectorized code with the correct
>> width is not used "as-is" by these compilers. Apparently the compiler first
>> scalarizes it and then re-vectorizes it - with very poor results, at least
>> on Well. OTOH this isn't a problem now that we can supply the requested
>> [lack of] width.
>
> "(...)We're likely to generate better code(...)" :)
>
> I guess better is not necessarily faster :)
>
> http://www.youtube.com/watch?feature=player_detailpage&v=QsoLyvvhRuc#t=853

Thanks, that was interesting. He did not fully answer my question 
though. If I supply a kernel with the native width of the device, they 
could compile it to e.g. AVX2 or AVX-512 instructions right away, with no 
added execution masks or other overhead. In some cases even the key 
buffer as supplied from host code can be vectorized (and we could do the 
same with the output buffer). But if they ask me for scalar code they 
will obviously get a scalar buffer, so the end result will be 
unnecessarily complicated vectorized code dealing with that. This 
particular case is not currently affecting any inner loop though.
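
To illustrate the buffer point, here is a minimal sketch in plain C (not 
OpenCL, and not code from our tree). It assumes a vector width of 4 and 
four 32-bit words per key; V_WIDTH, KEY_WORDS and all function names are 
made up for the example. With a scalar key buffer, filling one vector 
means a strided gather across four keys. With a buffer the host has 
already interleaved, the same vector is one contiguous load.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define V_WIDTH   4  /* assumed native vector width (hypothetical) */
#define KEY_WORDS 4  /* assumed 32-bit words per key (hypothetical) */

/* Scalar layout: key k occupies words [k*KEY_WORDS, (k+1)*KEY_WORDS).
 * Filling the lanes needs one strided read per lane (a gather). */
static void load_from_scalar_layout(const uint32_t *keys, size_t group,
                                    size_t word, uint32_t lanes[V_WIDTH])
{
    for (size_t lane = 0; lane < V_WIDTH; lane++)
        lanes[lane] = keys[(group * V_WIDTH + lane) * KEY_WORDS + word];
}

/* Vector layout: word w of V_WIDTH consecutive keys is stored
 * contiguously, so the lanes are filled by a single block copy. */
static void load_from_vector_layout(const uint32_t *keys, size_t group,
                                    size_t word, uint32_t lanes[V_WIDTH])
{
    memcpy(lanes, &keys[(group * KEY_WORDS + word) * V_WIDTH],
           V_WIDTH * sizeof(uint32_t));
}

/* Host-side transposition from scalar layout to vector layout.
 * nkeys is assumed to be a multiple of V_WIDTH. */
static void interleave_keys(const uint32_t *scalar, uint32_t *vec,
                            size_t nkeys)
{
    for (size_t k = 0; k < nkeys; k++) {
        size_t group = k / V_WIDTH, lane = k % V_WIDTH;
        for (size_t w = 0; w < KEY_WORDS; w++)
            vec[(group * KEY_WORDS + w) * V_WIDTH + lane] =
                scalar[k * KEY_WORDS + w];
    }
}
```

Both loaders return the same lanes for a given group and word. The only 
difference is the access pattern, which is the extra work a vectorizer 
has to generate when it is handed a scalar buffer.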

I think auto-vectorizing is a really great thing but I can't see why 
they refuse to use pre-vectorized code as supplied. Apparently the 
assumption is that no one vectorizes (or should vectorize) for 
performance, only for the problem domain, as he puts it at ~15:05. Maybe 
in the long run they are right. Or maybe future versions of their 
optimizers will be able to analyze pre-vectorized code and decide 
whether it could be used without "re-vectorizing".

magnum
