Date: Wed, 6 Jun 2012 20:07:18 +0530
From: SAYANTAN DATTA <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: HD 7970 LDS Bank Conflicts

Hi,

On Wed, Jun 6, 2012 at 7:06 PM, Alain Espinosa <alainesp@...il.com> wrote:

> Division and modulus are very expensive. I change to use AND and shift
> and i get 25% increase only by that (changing the algorithm).
>

Yes, that's true. But I use modulus and division only two or three times
per kernel, which is about 3/(1024*32) of the total ALU ops per kernel
(roughly 0.01%). So it's not a major issue.
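
For reference, the AND/shift replacement mentioned in the quoted text could
look roughly like the sketch below. It only works when the divisor is a power
of two, and the helper names (is_pow2, div_pow2, mod_pow2) are made up for
illustration; they are not taken from the actual kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if n is a power of two. */
static int is_pow2(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* x / (1 << shift), computed with a right shift instead of a division. */
static uint32_t div_pow2(uint32_t x, uint32_t shift)
{
    assert(shift < 32);          /* shifting by >= 32 is undefined in C */
    return x >> shift;
}

/* x % n for a power-of-two n, computed with a bit mask instead of a modulus. */
static uint32_t mod_pow2(uint32_t x, uint32_t n)
{
    assert(is_pow2(n));          /* the mask trick is only valid for powers of two */
    return x & (n - 1);
}
```

For example, div_pow2(x, 5) gives the same result as x / 32, and
mod_pow2(x, 32) the same as x % 32, for any unsigned 32-bit x.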

> Why, 4KB is still under the limit for local memory if enough big
> worksize is used.
>

Using LDS I'm getting 60% better performance than with global memory only,
even though the ALU is underutilized. I also haven't done anything yet to
minimize LDS bank conflicts, so I'm hoping to get even more performance
after that optimization.
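
To make the bank-conflict point concrete, here is a small host-side C sketch
(not kernel code) that estimates the worst-case conflict degree of a strided
LDS access pattern. It assumes 32 banks of 4 bytes each and looks at 32 lanes
at a time; these numbers and the names bank_of and max_conflict are
assumptions for illustration, so check them against the hardware
documentation before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed LDS layout: 32 banks, each 4 bytes wide. */
enum { NUM_BANKS = 32, BANK_WIDTH = 4 };

/* Bank that a given byte address in local memory maps to. */
static uint32_t bank_of(uint32_t byte_addr)
{
    return (byte_addr / BANK_WIDTH) % NUM_BANKS;
}

/*
 * Worst-case number of lanes that hit the same bank when lane i accesses
 * the 32-bit word at index i * stride_words. 1 means conflict-free.
 */
static uint32_t max_conflict(uint32_t stride_words, uint32_t lanes)
{
    uint32_t hits[NUM_BANKS] = {0};
    uint32_t worst = 0;

    /* stride 0 means every lane reads the same word, which is not a conflict */
    assert(stride_words > 0);

    for (uint32_t lane = 0; lane < lanes; lane++) {
        uint32_t b = bank_of(lane * stride_words * BANK_WIDTH);
        if (++hits[b] > worst)
            worst = hits[b];
    }
    return worst;
}
```

Under these assumptions a stride of 32 words sends every lane to the same
bank, while padding the stride to 33 words spreads the lanes over all banks
again. That padding trick is one common way to reduce conflicts, at the cost
of a little extra local memory.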

Regards,
Sayantan
