Date: Wed, 21 Mar 2012 00:04:23 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: RAR format tweaks (was: OpenSSL and AES-NI)

Hello magnum,

> I did some tests and research today that showed that the unrar code
> mostly does what we want it to. This is a good thing, except it means
> the gains I hoped for will probably not be there.
>

In fact I wasn't very optimistic about it, but it was like my last hope :)

> So, back to square one. Meanwhile I'm trying to figure out how to deal
> with -p mode best: My current code calculates the AES key and IV in GPU,
> and does the rest in CPU (multi-threaded). I'm not sure how to
> auto-scale OMP vs. GPU for best balance.

That's exactly what I do too: SetCryptKeys on the GPU, and AES decryption and decompression on the host. That's also what (AFAIK) igrargpu does.

> There's no point in calculating
> 5000 keys in one second if it takes 9 seconds to verify them afterwards.
> You might have one CPU core, or 96 of them.
>

True :)

> For -hp mode this problem could be mitigated by decrypting in GPU, but I
> don't except it to be faster, just "easier". Actually I think -hp mode
> will do just fine with the current code (I haven't tested on GPU yet).
>

I don't think so. With -hp mode, decryption is quite fast relative to the key strengthening part, even if the latter is performed on a fast GPU. On the other hand, doing AES in the kernel would increase the GPR pressure even more. It may still be worth investigating whether writing the keys/IVs to global memory and then running a second, AES-only kernel to decrypt the header with them would make a difference. Anyway, for such a small amount of data to decrypt, I think the kernel launch and __global memory latency would outweigh any gain.

> Finally, the more I look at the unrar code, the smaller it gets, and I'm
> starting to think I could migrate all of it to OpenCL. Maybe not a one
> beer job though.

IMO that's a bit optimistic.
While I think it's possible, I see a lot of reasons why porting unrar to OpenCL would end up a freaking massacre :) There's just too much branching, too many lookup tables, and no way to vectorize; besides, a lot of data needs to be pushed through clEnqueueWriteBuffer and then decrypted and decompressed by the kernel. I think that's a challenging thing to do, but it would definitely cost me weeks, more likely months, of effort (I am doing this in my free time and I also have a family to take care of :) ).

Well IMO, rar is just evil and I don't think it would take long before I give up, at least for now. I've just wasted so much time on it and it's getting rather annoying :)

Regards
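
Illustration of the GPU/CPU balance question for -p mode quoted above. This is only a rough sketch, not code from JtR or any other cracker. It assumes GPU key derivation and CPU verification overlap perfectly and that verification scales linearly with core count; both are optimistic. The function names are made up for the example, and the 5000 keys / 1 s / 9 s figures are the ones from the quote, with the 9 s taken to mean a single core.

```python
import math


def pipeline_rate(batch_keys, gpu_batch_s, cpu_batch_s_one_core, cores):
    """Sustained candidates/s for an overlapped GPU-derive / CPU-verify pipeline.

    Assumption: verification time divides evenly across cores, so the
    slower of the two stages sets the pace for every batch.
    """
    cpu_batch_s = cpu_batch_s_one_core / cores
    return batch_keys / max(gpu_batch_s, cpu_batch_s)


def cores_to_saturate(gpu_batch_s, cpu_batch_s_one_core):
    """Smallest core count at which the CPU stops being the bottleneck."""
    return math.ceil(cpu_batch_s_one_core / gpu_batch_s)


if __name__ == "__main__":
    batch = 5000        # keys per GPU batch (figure from the quote)
    gpu_s = 1.0         # seconds for the GPU to derive one batch
    cpu_s = 9.0         # seconds for ONE core to verify one batch (assumed)
    for n in (1, 9, 96):
        print(n, "cores:", round(pipeline_rate(batch, gpu_s, cpu_s, n), 1), "c/s")
    print("cores needed:", cores_to_saturate(gpu_s, cpu_s))
```

Under these assumptions, one core leaves the GPU idle most of the time, while anything beyond nine cores is wasted on the CPU side, which is why a fixed OMP/GPU split cannot suit every machine.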