Date: Sat, 29 Aug 2015 02:22:02 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: LWS and GWS auto-tuning

On 2015-08-28 21:53, magnum wrote:
> I'm currently experimenting with sorting keys by length for RAR-opencl
> and these issues made a lot of noise.

I'm starting to wonder if that whole idea is flawed: For SIMD CPU it's a
whole other story, but in this case, we'll have say 128K keys and the
longest of them will dictate the total run time. No matter how we sort
them (and do the shorter ones quicker), the longest one will still take
that one long amount of time. So the keys we did quicker after sorting
just ADD to the total. Maybe we should just completely ignore sorting?

It would be relevant to sort keys if (like in the SIMD case) we
collected, say, 4-10x the global work size and then sorted. But in
OpenCL that is an outrageous number of keys to keep in memory, it'd be
like 30-40 MB.

I'm starting to think the old code is good and the actual optimal thing
to do is to simply run a mode that gives you one length at a time - like
mask mode.

Any ideas appreciated, especially from Solar.

magnum
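
The reasoning above can be illustrated with a rough Python sketch. It is a toy model, not code from John the Ripper: it assumes that the cost of a key grows linearly with its length, that the device is wide enough to process a whole batch concurrently, and that a length-sorted run launches one group per distinct length back to back. The function names and the sample lengths are hypothetical.

```python
def key_cost(length):
    # Hypothetical cost model: work grows linearly with key length.
    return length


def unsorted_batch_time(lengths):
    # Single launch over the whole batch: it finishes only when the
    # slowest (longest) key finishes.
    return max(key_cost(n) for n in lengths)


def sorted_groups_time(lengths):
    # One launch per distinct length, run one after another. Each group
    # costs as much as its (uniform) key length, and the costs add up.
    return sum(key_cost(n) for n in set(lengths))


# Made-up key lengths for one batch.
lengths = [4, 8, 8, 12, 16, 16, 16, 20]

print("unsorted batch:", unsorted_batch_time(lengths))
print("sorted groups: ", sorted_groups_time(lengths))
```

Under these assumptions the group holding the longest key costs as much as the whole unsorted batch, so every shorter group is extra time on top of it. A mode that only ever produces one length per batch avoids the problem without any sorting.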