Date: Wed, 12 Dec 2012 22:08:55 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: office2013-opencl

On 12 Dec, 2012, at 20:59 , Solar Designer <solar@...nwall.com> wrote:
> magnum -
> 
> More stuff from Twitter:
> ...
> zo4> <@...ardiz> @gat3way @passware Anyhow, I just benchmarked JtR bleeding-jumbo's office2013-opencl on the two cards in bull ...
> zo5> <@...ardiz> @gat3way @passware ... I got 261 c/s on the GTX 570 o/c (need to ask magnum to optimize more!), but a whopping 911 c/s on 7970
> zo6> <@...ardiz> @gat3way @passware Yeah, per @hashcat's benchmarks 6990 is faster than 7970 per-GPU at iterated SHA-512 - but all are close to GTX 580
> zo7> <@...ardiz> @gat3way @passware Split kernel. Looks like as few as 64 iterations per kernel invocation, until the 100k iterations are reached.

Cool! I just took Claudio's SHA-512 as a base and optimised it a little for iterated use. Actually, I think he has a separate kernel for nvidia that I do not use yet. The format is still just a first PoC. So much to do, so little time...

> Also tried 1225 MHz (which previously worked fine for bf-opencl) - got
> ASIC hang after a few minutes, at 82C (although the card can be stable
> at higher temperatures - previously observed up to 88C with other
> formats).  Perhaps some parts of the chip, which bf-opencl did not use
> actively enough, become unstable at 1225 MHz.

Was that with iterations/kernel increased? Maybe you went over that 200ms "death threshold" for each kernel call.
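
To illustrate the trade-off behind that question, here is a minimal sketch (not JtR's actual code) of picking an iterations-per-kernel-call count for a split kernel so that each call stays under a time budget. The 200 ms budget and the 100k total iterations come from this thread; the function names, the per-iteration timing input, and the choice to round down to a power of two are assumptions made for the example.

```python
import math


def iterations_per_call(seconds_per_iteration, budget_seconds=0.2,
                        total_iterations=100_000):
    """Return the largest power-of-two iteration count per kernel call
    whose estimated run time stays within budget_seconds.

    seconds_per_iteration is a hypothetical measurement: the time one
    hash iteration takes for the whole work size (GWS) on the device.
    """
    if seconds_per_iteration <= 0:
        raise ValueError("seconds_per_iteration must be positive")
    max_iters = int(budget_seconds / seconds_per_iteration)
    if max_iters < 1:
        return 1  # even one iteration exceeds the budget; cannot go lower
    # Round down to a power of two (an assumption, chosen for simplicity).
    per_call = 1 << (max_iters.bit_length() - 1)
    return min(per_call, total_iterations)


def kernel_calls(total_iterations, per_call):
    """Number of kernel invocations needed to cover all iterations."""
    return math.ceil(total_iterations / per_call)
```

With 64 iterations per call, `kernel_calls(100_000, 64)` gives 1563 invocations. Raising the per-call count reduces the number of invocations, but each one runs longer and gets closer to the threshold.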

> Observation: it takes maybe a couple of minutes for --test of this
> format to complete on GTX 570, but about 8 minutes on 7970.  Why?


If you run it on both GPUs with (environment) GWS=0, you will see why. Even at the lowest GWS (i.e. 64), the full crypt_all() call takes 14 seconds on the 7970.

I haven't tweaked the auto-tuning much yet. I know exactly what to do but haven't had the time. All my iterated formats currently auto-tune with the full number of iterations but there is no need for that, so it could be several orders of magnitude faster. Maybe I should give that some prio...
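
A rough sketch of that idea, under stated assumptions and not taken from the JtR source: probe each candidate GWS with only a small number of iterations, then scale the measured time up to the full count to estimate the speed. `time_kernel` is a hypothetical callback that runs the kernel and returns elapsed seconds, and the extrapolation assumes run time grows linearly with the iteration count.

```python
def autotune_gws(time_kernel, candidates, probe_iterations=64,
                 full_iterations=100_000):
    """Pick the candidate GWS with the best estimated candidates/second.

    time_kernel(gws, iterations) is a hypothetical callback returning the
    seconds taken to run `iterations` hash iterations for `gws` work items.
    Only probe_iterations are actually run per candidate; the cost of a
    full crypt is extrapolated, assuming time is linear in iterations.
    """
    best_gws, best_rate = None, 0.0
    for gws in candidates:
        probe_seconds = time_kernel(gws, probe_iterations)
        estimated_full = probe_seconds * full_iterations / probe_iterations
        rate = gws / estimated_full  # estimated candidates per second
        if rate > best_rate:  # strict: on a tie, keep the smaller GWS
            best_gws, best_rate = gws, rate
    return best_gws, best_rate
```

Probing with 64 of 100,000 iterations does roughly 1,560 times less work per candidate GWS, which is where the large speed-up in tuning time would come from.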

magnum
