Date: Sat, 3 Mar 2012 00:50:35 -0500
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Re: [JtR PATCH] Support rar's -p mode by spawning external unrar process.

It's hashkill, but it's not public yet (and I am still working on it).

-hp archives are cracked at ~7500 c/s. With -p mode it depends a lot on the file size; small files like 100KB are slow, around 1000-2000 c/s. I am still trying to optimize the OpenCL kernel more, though. One thing I am trying to eliminate is the byte order reversals that I do for SHA1 inputs.

Overall, using 2-component vectors and keeping part of the data in __local shows the best performance for me. The biggest problem there is GPR usage. You basically balance between ALUBusy (the more local memory, the less ALUBusy), ALUPacking (the more vectorization, the better the ALUPacking, but also the more GPRs used) and also (surprisingly) the NDRange size (because larger NDRanges mean more wavefronts and also mean we are getting close to the moment where the AMD driver just shuts off with "ASIC hang" or something).

BTW my kernel with an NDRange of 64*64 executes for 2-3 seconds on my 6870; that's the slowest kernel I've ever coded :)

On Fri, Mar 2, 2012 at 8:38 PM, magnum <john.magnum@...hmail.com> wrote:

> On 03/02/2012 02:35 AM, Milen Rangelov wrote:
> > I've done that in my project (accommodating code from unrar).
>
> Is that hashkill or another project? Can we use parts of this code?
>
> > It's just evil. Not that bad on CPUs because setcryptkeys is the real
> > big bottleneck there, but on GPUs speed can get 10 times slower or even
> > worse as compared to -hp mode. I also know your work on ZIP, borrowed the
> > idea from you if you don't mind :) But in reality, RAR is worse.
> > Statistically, for a ~300KB file in an archive, roughly 60-70% of the
> > candidates do decompress without an error until they get ruled out by
> > the CRC check. It's bad.
>
> I'd be quite happy just to have it running at all without spawning
> unrar, on CPU. Then we go from there.
>
> > Funny thing I found experimentally is that AES decryption is the bigger
> > problem in my code (I use OpenSSL's routines). I am really thinking of
> > some way to do that on GPUs for the first chunk read, but again that's
> > not very much useful. I guess a fast, AES-NI accelerated decryption
> > routine would probably make some difference.
>
> If I could get the AES decryption to show any significance in profiling,
> I would be delighted :) But things might look completely different on
> GPU of course.
>
> What figures are you getting for -hp mode on GPU?
>
> magnum
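
For readers wondering where byte order reversals on SHA1 inputs come from: SHA-1 interprets each 64-byte block as sixteen big-endian 32-bit words, whereas little-endian hardware loading the same bytes as 32-bit integers gets every word byte-swapped. The sketch below is only a host-side Python illustration of that mismatch, not the hashkill kernel; the names bswap32, load_block_le and load_block_be are made up for this example.

```python
import struct

def bswap32(x):
    """Reverse the byte order of a 32-bit word (hypothetical helper)."""
    return (((x & 0x000000FF) << 24) |
            ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) |
            ((x & 0xFF000000) >> 24))

def load_block_le(block):
    """Load a 64-byte block as 16 words the way a little-endian device would."""
    return struct.unpack("<16I", block)

def load_block_be(block):
    """Load a 64-byte block as the 16 big-endian words SHA-1 expects."""
    return struct.unpack(">16I", block)

# Any 64-byte buffer works; this one is just 0x00..0x3f.
block = bytes(range(64))

# Swapping each natively loaded word yields the words SHA-1 operates on.
swapped = tuple(bswap32(w) for w in load_block_le(block))
assert swapped == load_block_be(block)
```

If the code that fills the buffer already writes its words in big-endian order, the per-word swap is unnecessary. Whether that is how the reversals mentioned above would be removed is not stated in the message.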