Date: Fri, 3 Aug 2012 00:18:34 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Result of hard core password generation on 7970

On Thu, Jul 26, 2012 at 1:05 AM, Bit Weasil <bitweasil@...il.com> wrote:
> I do not know if my previous email made it through.  I got an error from the
> mailing list server regarding a disk space error.
>
> Apologies if it goes through twice.
>
> The GPU is not a CPU - you cannot treat it like one!  You cannot safely
> treat it as "a bunch of CPUs running in parallel" - this leads to memory
> contention.  It must be coded as a very wide vector engine.
>
> The "hard coded" MD5 kernel is written like CPU code, not like GPU code.
>
> I see no use of local memory.  This is very bad.  Global memory is high
> bandwidth, but very high latency.
>
> You're doing a linear search through the main global memory to check
> passwords, as far as I can tell.  From each thread.  On many AMD GPUs, this
> does not broadcast the read (I believe nVidia will, at least on newer GPUs).
>

Thanks for your advice. I think NVIDIA GPUs can broadcast such reads, but
I am not sure about AMD GPUs.

> Further, there's not a lookup bitmap in sight.  They're used for very good
> reasons, and are absolutely critical for good performance on the GPU.  I'm
> using a 3 layer bitmap system (local, global-but-cached,
> global-and-not-cached) to make sure I only do a binary search through the
> sorted hashes if there's a very high probability of finding the hash.  A
> walk through global memory space from one thread is incredibly expensive.
>

Yes, this is only the first version of the GPU password comparison. I am
implementing a bitmap on the GPU now and will try to use local memory.
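
As a rough illustration of the layered lookup described in the quoted
text, here is a minimal sketch in plain C. It is host-side code, not an
OpenCL kernel, and it is not taken from either implementation. The names
(hash_filter, filter_init, filter_lookup), the two bitmap sizes, and the
choice to index on the first 32-bit word of each hash are all assumptions
made for the example. It uses two bitmap layers rather than three: a small
one that would fit in local memory and a larger one that would sit in
global memory. The binary search over the sorted hashes runs only when
both bitmaps report a possible match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define SMALL_BITS  (1u << 12)  /* 4096 bits = 512 bytes: fits in local memory */
#define LARGE_BITS  (1u << 20)  /* 1 Mbit = 128 KiB: would live in global memory */
#define LARGE_SHIFT 12          /* large bitmap is indexed by the top 20 bits */

typedef struct {
    uint32_t small[SMALL_BITS / 32];
    uint32_t large[LARGE_BITS / 32];
    const uint32_t *sorted;     /* first word of each loaded hash, ascending */
    size_t count;
} hash_filter;

static void set_bit(uint32_t *bm, uint32_t idx)
{
    bm[idx >> 5] |= 1u << (idx & 31);
}

static int test_bit(const uint32_t *bm, uint32_t idx)
{
    return (int)((bm[idx >> 5] >> (idx & 31)) & 1u);
}

/* Build both bitmaps from an array that is already sorted ascending. */
static void filter_init(hash_filter *f, const uint32_t *sorted, size_t count)
{
    memset(f, 0, sizeof(*f));
    f->sorted = sorted;
    f->count = count;
    for (size_t i = 0; i < count; i++) {
        assert(i == 0 || sorted[i - 1] <= sorted[i]);
        set_bit(f->small, sorted[i] & (SMALL_BITS - 1)); /* low 12 bits */
        set_bit(f->large, sorted[i] >> LARGE_SHIFT);     /* top 20 bits */
    }
}

/* Return 1 if h is one of the loaded hashes, 0 otherwise. */
static int filter_lookup(const hash_filter *f, uint32_t h)
{
    /* Cheapest check first: most candidates are rejected here. */
    if (!test_bit(f->small, h & (SMALL_BITS - 1)))
        return 0;
    /* Second layer uses different bits of the hash, so it rejects
       most of the false positives that passed the first layer. */
    if (!test_bit(f->large, h >> LARGE_SHIFT))
        return 0;
    /* Only now pay for the binary search through the sorted hashes. */
    size_t lo = 0, hi = f->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (f->sorted[mid] < h)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < f->count && f->sorted[lo] == h;
}
```

A caller would run filter_init once on the sorted hash list and then call
filter_lookup for every candidate hash. A bitmap can only say "definitely
not loaded" or "maybe loaded", so the final binary search is still needed
to confirm a hit. In a real kernel, the first word alone would not be
enough for that confirmation; the full hash would have to be compared.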

Thanks
myrice
