Date: Thu, 18 Oct 2012 11:07:43 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf-opencl fails self-test on CPU

On 18 Oct, 2012, at 4:38 , Sayantan Datta <std2048@...il.com> wrote:

> Hi magnum,
> 
> On Thu, Oct 18, 2012 at 2:49 AM, magnum <john.magnum@...hmail.com> wrote:
> Just try it. Usually it's incredibly simple to add vectorization. Most of my formats run vectorized on CPU and non-GCN AMD, and scalar on NVIDIA and GCN. It takes just a few #ifdefs.
> 
> I always use uint4 or ulong4 (even though those end up being different sizes). I think once you use e.g. uint4 instead of uint, the auto-vectorizer may change that to other vector sizes automatically if/when beneficial. That is much less magic than auto-vectorization of scalar code.
> 
> I guess you mean I should vectorize the private arrays that have compile-time constant indexing. Is it worthwhile to vectorize the arrays stored in global or local memory? For Blowfish I don't have many private arrays with compile-time constant indexing, so I made a new kernel that processes two hashes together using uint2 vectors. I will later try processing four of them together using uint4.

I'm not familiar with BF at all. Maybe it's harder than usual to vectorize, and I have absolutely no idea whether it will be beneficial or not.
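
For illustration, the "few #ifdefs" approach could be sketched roughly as follows. This is plain C rather than OpenCL C so that it compiles anywhere, and everything in it is an assumption: USE_VECTOR, LANES, word_t, the w_* helpers and toy_round are hypothetical names, and the round itself is a made-up stand-in, not Blowfish. The per-lane table lookup hints at why code with data-dependent table indexing may gain less from vector types than pure arithmetic does.

```c
#include <stdint.h>

/* Hypothetical build switch: -DUSE_VECTOR processes four candidates
   per work item, otherwise one (scalar). */
#ifdef USE_VECTOR
#define LANES 4
#else
#define LANES 1
#endif

/* Stand-in for uint / uint4: one 32-bit word per lane. */
typedef struct { uint32_t v[LANES]; } word_t;

/* Lane-wise addition, as a vector type would do in one operation. */
static word_t w_add(word_t a, word_t b)
{
    word_t r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}

/* Lane-wise XOR. */
static word_t w_xor(word_t a, word_t b)
{
    word_t r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = a.v[i] ^ b.v[i];
    return r;
}

/* Data-dependent lookup: every lane needs its own index into the
   256-entry table, so this part stays a per-lane gather. */
static word_t w_lookup(const uint32_t *table, word_t idx)
{
    word_t r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = table[idx.v[i] & 0xff];
    return r;
}

/* Toy round (NOT Blowfish): add key, XOR a table entry, rotate left 7. */
static word_t toy_round(const uint32_t *table, word_t x, word_t k)
{
    word_t t = w_add(x, k);
    t = w_xor(t, w_lookup(table, t));
    for (int i = 0; i < LANES; i++)
        t.v[i] = (t.v[i] << 7) | (t.v[i] >> 25);
    return t;
}

/* Scalar reference for a single candidate, used to check the lanes. */
static uint32_t toy_round_scalar(const uint32_t *table, uint32_t x, uint32_t k)
{
    uint32_t t = x + k;
    t ^= table[t & 0xff];
    return (t << 7) | (t >> 25);
}
```

The round body is written once against word_t, and only the LANES definition changes between the scalar and vectorized builds. Whether the wider build actually pays off would still have to be measured per device.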

There is no SSE2 support in the CPU format, right?

magnum

