Date: Sat, 29 Aug 2015 02:22:02 +0200
From: magnum <>
Subject: Re: LWS and GWS auto-tuning

On 2015-08-28 21:53, magnum wrote:
> I'm currently experimenting with sorting keys by length for RAR-opencl
> and these issues made a lot of noise.

I'm starting to wonder if that whole idea is flawed. For SIMD on CPU it's a
whole other story, but in this case we'll have, say, 128K keys in one batch,
and the longest of them will dictate the total run time. No matter how we
sort them (and do the shorter ones quicker), the longest one will still take
that one long amount of time. So the time spent on the keys we did quicker
after sorting just ADDS to the total. Maybe we should just drop sorting
completely?
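
To make that argument concrete, here is a toy model in Python. It is only a sketch under a strong assumption: a kernel launch takes as long as its longest key, regardless of how many shorter keys share the launch. The key lengths and the names launch_cost and total_cost are made up for illustration and are not taken from the RAR-opencl code.

```python
def launch_cost(lengths):
    """Assumed cost model: a launch finishes when its longest key does."""
    return max(lengths)


def total_cost(launches):
    """Launches run one after another, so their costs add up."""
    return sum(launch_cost(batch) for batch in launches)


# Made-up key lengths standing in for one full batch (one global work size).
keys = [3, 12, 5, 8, 4, 12, 6, 7]

# One launch over the whole batch: order is irrelevant, the longest key decides.
one_launch = total_cost([keys])
one_launch_sorted = total_cost([sorted(keys)])

# Sort, then run the short half and the long half as two separate launches.
ordered = sorted(keys)
half = len(ordered) // 2
split_launches = total_cost([ordered[:half], ordered[half:]])

print(one_launch, one_launch_sorted, split_launches)
```

Under this model the split run can never be cheaper than the single launch, because the launch holding the longest key costs just as much as before and the other launch is pure extra time.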

It would be relevant to sort keys if (like in the SIMD case) we 
collected, say, 4-10x the global work size and then sorted. But in 
OpenCL that is an outrageous number of keys to keep in memory, it'd be 
like 30-40 MB. I'm starting to think the old code is good and the actual 
optimal thing to do is to simply run a mode that gives you one length at 
a time - like mask mode.
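
The case where sorting would pay off can be sketched with the same kind of toy model. Again this assumes that a launch costs as much as its longest key. The pool contents, the tiny gws value and the helper names are hypothetical, chosen only so the numbers are easy to follow.

```python
def launch_cost(lengths):
    """Assumed cost model: a launch finishes when its longest key does."""
    return max(lengths)


def run_in_batches(lengths, gws):
    """Cut the pool into consecutive batches of gws keys and add up the costs."""
    batches = [lengths[i:i + gws] for i in range(0, len(lengths), gws)]
    return sum(launch_cost(batch) for batch in batches)


gws = 4  # stand-in for the real global work size (e.g. 128K)

# A pool of 4x gws made-up key lengths, in the order they were generated.
pool = [3, 12, 5, 8, 4, 11, 6, 7, 9, 3, 12, 4, 5, 10, 3, 6]

unsorted_total = run_in_batches(pool, gws)
sorted_total = run_in_batches(sorted(pool), gws)

print(unsorted_total, sorted_total)
```

With a pool several times the batch size, sorting groups the long keys into few launches, so most launches finish early. The price is holding the whole pool in memory before the first launch, which is the 30-40 MB concern above.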

Any ideas appreciated, especially from Solar.

