Date: Tue, 20 Mar 2012 22:20:23 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: RAR format tweaks (was: OpenSSL and AES-NI)

On 03/20/2012 12:31 PM, magnum wrote:
> On 03/19/2012 10:20 PM, Milen Rangelov wrote:
>> Perhaps like more can be achieved by tweaking the RAR decompression routine
>> in the libclamav code. I am not that happy with the result though, I put so
>> much hope on AES-NI...
>
> I was planning on concentrating on the very first data block, trying to
> detect invalid dictionaries. And just step the unpack code and see if
> there are some code paths that are more rigid than we want to. I'd like
> to think there are huge gains possible but I'm not sure we'll find them
> that easy.

I did some tests and research today that showed that the unrar code
mostly does what we want it to. This is a good thing, except it means the
gains I hoped for will probably not be there. Actually, it bails out
as-is after looking at just 15-20% of the data on average. It could be a
lot worse.

> Oh btw, I think I know one thing already, but haven't tested yet. The
> very first bit of the decrypted data tells you if it's LZSS or PPMII.
> But I think I saw somewhere in the code that if it was supposed to be
> PPMII but that engine detects an error, it tries to fall back to LZSS
> instead of aborting. This kind of behaviour is precisely what we're
> looking after!

This too was a red herring (it does abort), as well as the below.

> Also, I think I saw a suspicious function name in Valgrind that I'll get
> back to. It was like "restart" something. That too just might be some
> kind of rigidness we don't want.

So, back to square one.

Meanwhile I'm trying to figure out how to deal with -p mode best: My
current code calculates the AES key and IV in GPU, and does the rest in
CPU (multi-threaded). I'm not sure how to auto-scale OMP vs. GPU for best
balance. There's no point in calculating 5000 keys in one second if it
takes 9 seconds to verify them afterwards. You might have one CPU core,
or 96 of them.

For -hp mode this problem could be mitigated by decrypting in GPU, but I
don't expect it to be faster, just "easier". Actually I think -hp mode
will do just fine with the current code (I haven't tested on GPU yet).

Finally, the more I look at the unrar code, the smaller it gets, and I'm
starting to think I could migrate all of it to OpenCL. Maybe not a one
beer job though.

magnum
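
The GPU/CPU balance question in the -p mode paragraph comes down to simple rate arithmetic, and a rough sketch may make it concrete. The helper names below are hypothetical and nothing here is taken from the John the Ripper source. It also assumes that CPU verification scales linearly with the number of cores, which may not hold in practice.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not part of any real format. */

/* Keys per second when GPU key derivation and CPU verification run one
 * after the other on each batch (no overlap between the two stages). */
double serial_rate(double batch, double derive_secs, double verify_secs)
{
    assert(derive_secs + verify_secs > 0.0);
    return batch / (derive_secs + verify_secs);
}

/* Keys per second when the two stages overlap: the slower stage limits
 * the pipeline. Assumes verification scales linearly with core count. */
double pipelined_rate(double gpu_rate, double core_rate, unsigned cores)
{
    double cpu_rate = core_rate * (double)cores;
    return cpu_rate < gpu_rate ? cpu_rate : gpu_rate;
}

/* Smallest core count whose combined verify rate keeps up with the GPU. */
unsigned cores_to_match(double gpu_rate, double core_rate)
{
    double need;
    unsigned n;

    assert(gpu_rate >= 0.0 && core_rate > 0.0);
    need = gpu_rate / core_rate;
    n = (unsigned)need;      /* truncates toward zero */
    if ((double)n < need)
        n++;                 /* round up without needing libm */
    return n;
}
```

With the figures from the message (5000 keys derived in one second, nine seconds to verify them), serial_rate(5000, 1, 9) gives 500 keys per second overall. Measuring the per-core verify rate at startup and comparing it with the GPU rate would be one way to pick a batch size or thread count, though that is only a suggestion.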