Date: Sat, 3 Mar 2012 00:50:35 -0500
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Re: [JtR PATCH] Support rar's -p mode by spawning
 external unrar process.

It's hashkill, but it's not public yet (I am still working on it). -hp
archives are cracked at ~7500 c/s. With -p mode the speed depends a lot on
the file size: small files (around 100KB) are slow, about 1000-2000 c/s.

I am still trying to optimize the OpenCL kernel further, though. One thing I
am trying to eliminate is the byte-order reversals that I do for the SHA1
inputs.
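
For context on those reversals: SHA-1 consumes its message as big-endian
32-bit words, while the GPU (like x86) loads words little-endian, so every
input word needs its bytes reversed unless the buffer is laid out big-endian
up front. The kernel is not public, so the following is only a plain C sketch
of the idea; bswap32, load_be32 and bswap_example are hypothetical names, not
taken from hashkill or JtR.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit word (little-endian <-> big-endian).
   This is the per-word cost paid when data was loaded little-endian. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Pack four message bytes straight into the big-endian word SHA-1 expects.
   Building the input this way once means no swap is needed later. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Usage: both routes give the same SHA-1 input word for the bytes "abcd". */
static void bswap_example(void)
{
    const unsigned char msg[4] = { 'a', 'b', 'c', 'd' };
    assert(load_be32(msg) == 0x61626364u);        /* packed big-endian   */
    assert(bswap32(0x64636261u) == 0x61626364u);  /* LE load, then swap  */
}
```

If the input words can be prepared in big-endian order once, outside the hot
loop, the swap inside it goes away; whether that is possible depends on how
the kernel assembles its SHA-1 blocks.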

Overall, using 2-component vectors and keeping part of the data in __local
gives the best performance for me. The biggest problem there is GPR usage.
You basically balance between ALUBusy (the more local memory, the less
ALUBusy), ALUPacking (the more vectorization, the better the ALUPacking, but
also the more GPRs used) and also (surprisingly) the NDRange size (because
larger NDRanges mean more wavefronts and also bring us closer to the point
where the AMD driver just shuts off with "ASIC hang" or something).
BTW my kernel with an NDRange of 64*64 executes for 2-3 seconds on my 6870;
that's the slowest kernel I've ever coded :)
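
To illustrate what "2-component vectors" means here: each work-item carries
two candidates side by side, so every SHA-1 operation is issued once for both
lanes. That helps ALU packing, at the price of roughly doubling the registers
held per work-item. Below is a plain C stand-in for OpenCL's uint2; the type
u32x2 and the helpers rotl2, ch2 and vec2_example are hypothetical names for
this sketch, not code from the actual kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for OpenCL's uint2: lane x and lane y are two independent
   password candidates processed in lockstep by one work-item. */
typedef struct { uint32_t x, y; } u32x2;

/* Rotate both lanes left by n bits (valid for 0 < n < 32). */
static u32x2 rotl2(u32x2 v, unsigned n)
{
    u32x2 r = { (v.x << n) | (v.x >> (32 - n)),
                (v.y << n) | (v.y >> (32 - n)) };
    return r;
}

/* SHA-1 "choose" function (rounds 0..19) applied to both lanes. */
static u32x2 ch2(u32x2 b, u32x2 c, u32x2 d)
{
    u32x2 r = { (b.x & c.x) | (~b.x & d.x),
                (b.y & c.y) | (~b.y & d.y) };
    return r;
}

/* Usage: each lane yields exactly what the scalar operation would. */
static void vec2_example(void)
{
    u32x2 v = { 0x80000001u, 0x00000001u };
    u32x2 r = rotl2(v, 1);
    assert(r.x == 0x00000003u && r.y == 0x00000002u);
}
```

Every live u32x2 value occupies two registers where the scalar version needs
one, which is where the extra GPR pressure comes from.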

On Fri, Mar 2, 2012 at 8:38 PM, magnum <john.magnum@...hmail.com> wrote:

> On 03/02/2012 02:35 AM, Milen Rangelov wrote:
> > I've done that in my project (adapting code from unrar).
>
> Is that hashkill or another project? Can we use parts of this code?
>
> > It's just evil. Not that bad on CPUs because setcryptkeys is the real
> > big bottleneck there, but on GPUs it can get 10 times slower, or even
> > worse, compared to -hp mode. I also know your work on ZIP, borrowed the
> > idea from you if you don't mind :) But in reality, RAR is worse.
> > Statistically, for a ~300KB file in an archive, roughly 60-70% of the
> > candidates do decompress without an error until they get ruled out by
> > the CRC check. It's bad.
>
> I'd be quite happy just to have it running at all without spawning
> unrar, on CPU. Then we go from there.
>
> > Funny thing I found experimentally is that AES decryption is the bigger
> > problem in my code (I use OpenSSL's routines). I am really thinking of
> > some way to do that on GPUs for the first chunk read, but again that's
> > not very useful. I guess a fast, AES-NI accelerated decryption
> > routine would probably make some difference.
>
> If I could get the AES decryption to show any significance in profiling,
> I would be delighted :) But things might look completely different on
> GPU of course.
>
> What figures are you getting for -hp mode on GPU?
>
> magnum
>
>
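
On the CRC check mentioned in the quoted text: for -p archives, a candidate
that decrypts and decompresses without error is only confirmed once the CRC
of the output matches the value stored in the archive, which is why so many
candidates are expensive to reject. A minimal C sketch of that last step
follows. It assumes the ordinary reflected CRC-32 (polynomial 0xEDB88320, as
used by ZIP); crc32_update, crc_matches and crc_example are hypothetical
names, and this is not the unrar or hashkill code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, assumed here).
   Pass crc = 0 for the first chunk and the previous result afterwards. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf,
                             size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Final filter: returns 1 only if the decompressed candidate output
   matches the CRC stored in the archive header. */
static int crc_matches(const unsigned char *plain, size_t len,
                       uint32_t stored_crc)
{
    return crc32_update(0, plain, len) == stored_crc;
}

/* Usage: the standard CRC-32 check value for the ASCII string "123456789". */
static void crc_example(void)
{
    const unsigned char *s = (const unsigned char *)"123456789";
    assert(crc32_update(0, s, 9) == 0xCBF43926u);
}
```

Since the CRC covers the whole file, this test cannot fire before the
candidate has been decompressed completely, so it only helps as the final
arbiter, not as an early reject.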
