Date: Wed, 30 May 2012 07:17:37 +0530
From: SAYANTAN DATTA <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Sayantan: Weekly Report #6

On Wed, May 30, 2012 at 1:54 AM, Solar Designer <solar@...nwall.com> wrote:

> Wow.  That's almost one half the CPU speed now.  This is starting to
> make sense.
>

> Are you analyzing the generated code for blowfish-opencl?
>

Yes, I am. According to the AMD profiler, the ratio of ALU instructions to
fetch instructions is just 3.29. The ALUBusy parameter also remains quite
low compared to other kernels, which probably indicates that the
implementation is limited by memory bandwidth.


> What memory type(s) does it use for S-boxes?
>

I'm using global memory because not all GPUs have an LDS large enough to
keep the ALUs busy.  I've also included a control parameter, KPC, that
reduces channel conflicts to some extent when set properly. Because of the
random memory access pattern, it is nearly impossible to eliminate channel
conflicts entirely. There is a trade-off between channel conflicts and
ALU utilization: if KPC is set too low, channel conflicts are reduced but
ALU utilization falls; setting it too high does the opposite. So KPC must
be tuned to an optimal value.


> How many instances of bcrypt are you computing in parallel on 7970?
> If more than 512, then what happens if you reduce it to 512 (the maximum
> that may fit in Local Data Share) or lower?  (Yes, this means that we'd
> be using at most 25% of the processing capacity, yet it might be the best
> option.)
>

I haven't tried using LDS, but it seems like a better option for the 7970.
I'm also wondering whether I could use the 256 KB of GPR storage per CU for
storing the S-boxes.
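
For reference, the capacity arithmetic behind the "512 instances" and "25%"
figures quoted above can be sketched as follows. This is only a
back-of-the-envelope check, not part of blowfish-opencl. The hardware
constants (32 CUs, 64 KB of LDS per CU, 64 lanes per CU) are assumed from
AMD's published HD 7970 specifications, the function names are made up for
illustration, and the sketch counts only the four Blowfish S-boxes, assuming
the small P-array is kept elsewhere.

```python
# Rough capacity check for bcrypt S-boxes on an HD 7970.
# All hardware figures below are assumptions, not measurements.

SBOX_COUNT = 4          # Blowfish uses four S-boxes
SBOX_ENTRIES = 256      # entries per S-box
ENTRY_BYTES = 4         # each entry is a 32-bit word

COMPUTE_UNITS = 32                # assumed: CUs on an HD 7970
LANES_PER_CU = 64                 # assumed: 4 SIMDs x 16 lanes per CU
LDS_BYTES_PER_CU = 64 * 1024      # assumed: Local Data Share per CU
GPR_BYTES_PER_CU = 256 * 1024     # GPR storage per CU, as quoted in this thread


def sbox_bytes():
    """Bytes of S-box state needed by one bcrypt instance."""
    return SBOX_COUNT * SBOX_ENTRIES * ENTRY_BYTES


def instances_that_fit(bytes_per_cu):
    """Instances whose S-boxes fit in the given per-CU storage, GPU-wide."""
    return (bytes_per_cu // sbox_bytes()) * COMPUTE_UNITS


def lane_fraction(instances):
    """Fraction of all lanes occupied if each instance uses one lane."""
    return instances / (LANES_PER_CU * COMPUTE_UNITS)


if __name__ == "__main__":
    in_lds = instances_that_fit(LDS_BYTES_PER_CU)
    in_gpr = instances_that_fit(GPR_BYTES_PER_CU)
    print("S-box bytes per instance:", sbox_bytes())
    print("instances fitting in LDS:", in_lds, "lane fraction:", lane_fraction(in_lds))
    print("instances fitting in GPRs (raw capacity):", in_gpr)
```

The GPR figure is raw capacity only: it ignores the registers needed for
everything else and says nothing about whether S-box lookups with
data-dependent indices can be served efficiently from registers.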
