Date: Sat, 15 Jun 2013 18:45:57 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Mask mode for GPU

On Sat, Jun 15, 2013 at 09:32:32AM +0530, Sayantan Datta wrote:
> On Saturday 15 June 2013 06:36 AM, Solar Designer wrote:
> >As to introducing support for format's set_mask() into this - now that's
> >possibly more difficult than it would be with a specialized implementation.
> >Yet I think we should not give up on this approach. Perhaps we'd have
> >to untie mask mode from rpp, but we may nevertheless start by duplicating
> >much of rpp's structure and initially even code - and only then proceed
> >to customize it for optional use of set_mask().
>
> I looked into the patch. You are using rpp to generate passwords on cpu
> even though rpp was primarily meant to process rules which are very
> similar to password generation. But if I understand correctly, we need
> only the set of characters for each place holder that will be used on
> gpu to generate the required password. I should find a way to do that
> using rpp's format, right?

You may. Actually, there's nothing to "find" - it's obvious. You just
take ctx.ranges[i].chars.

Maybe we need to generalize rpp some further and introduce into it
ability to skip iterating over some of the ranges, leaving that for
set_mask() (presumed to be done in the caller of rpp). To avoid
confusion, we could rename it from rpp into something different - a name
that would be fine for both uses at once (rules preprocessing and mask
mode).

For now, though, I suggest that you keep your changes to rpp or
rpp-derived code to a minimum (to the extent possible) and focus on the
GPU side of things - actual implementations of the functionality needed
for set_mask().
Frankly, I don't expect sufficiently clean host side mask mode code from
you - I'd expect to have to rewrite it anyway - so what you need to
provide is some working throw-away implementation that would show what
functionality I'd need to implement in a clean fashion.

> Also we need to parallelize rpp's algorithm of password generation for
> gpu SIMDs.

Not quite. While we sort of have this issue for --node/--fork, we don't
really have it for fast hashes on GPU, which is where we need on-GPU
set_mask(). Rather, those hashes and those GPUs are so fast that we'd
get acceptable kernel running times when we simply iterate over some
character ranges for some character positions (perhaps two or so) inside
each work-item. That's what myrice did (with hard-coded ranges for two
character positions) with raw-md5, and it worked well.

Parallelization thus comes from the host iterating over the rest of
character positions. For example, in a given kernel invocation the host
might provide strings "aaaaaa.." to "aaaapq..", where ".." are any
placeholder chars, which the GPU will overstrike anyway, and the GPU
(separately in each work-item) will iterate from "aa" to "zz" in the
last two character positions. On the next kernel invocation, the host
will start with "aaaapr..". (By choosing the weird end/start of range
for these two kernel invocations, except for the very first invocation's
range start, I illustrate that they don't have to be at prettier looking
points in the keyspace, and in practice they generally won't be. Just
whatever fits in the optimal GWS.)

Alexander
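
A minimal CPU-only sketch of the host/GPU split described above, for
illustration only. It is not the actual John the Ripper code: struct
range is a hypothetical stand-in for the per-position character sets
(what ctx.ranges[i].chars would supply), work_item() stands in for one
GPU work-item, and run_mask(), host_next(), and the gws parameter are
names assumed here. The host advances the leading positions like an
odometer and hands out up to gws candidates per simulated kernel
invocation; each work-item overstrikes the last two positions itself.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 8
#define GPU_POSITIONS 2	/* trailing positions iterated inside a work-item */

/* Hypothetical stand-in for one mask position's character set. */
struct range {
	const char *chars;
	int count;
};

/*
 * Stand-in for one work-item: overstrike the last two characters of key
 * with every combination from their ranges.  A real kernel would hash
 * each candidate; here we only count them and compare with target.
 */
static unsigned long work_item(char *key, int len, const struct range *r,
    const char *target, int *found)
{
	const struct range *a = &r[len - 2], *b = &r[len - 1];
	unsigned long n = 0;
	int i, j;

	for (i = 0; i < a->count; i++)
	for (j = 0; j < b->count; j++) {
		key[len - 2] = a->chars[i];
		key[len - 1] = b->chars[j];
		n++;
		if (target && !strcmp(key, target))
			*found = 1;
	}
	return n;
}

/* Advance host-controlled positions odometer-style; 0 when exhausted. */
static int host_next(int *idx, int host_len, const struct range *r)
{
	int pos = host_len - 1;

	while (pos >= 0) {
		if (++idx[pos] < r[pos].count)
			return 1;
		idx[pos--] = 0;
	}
	return 0;
}

/*
 * Host loop: each outer iteration models one kernel invocation that
 * covers up to gws host-provided prefixes, wherever in the keyspace
 * that happens to end.  Returns the number of candidates tried.
 */
static unsigned long run_mask(const struct range *r, int len, int gws,
    const char *target, int *found, int *invocations)
{
	int idx[MAX_LEN] = {0};
	int host_len = len - GPU_POSITIONS;
	unsigned long total = 0;
	int more = 1;

	assert(len >= GPU_POSITIONS && len <= MAX_LEN && gws > 0);
	*invocations = 0;
	while (more) {
		int item, p;

		for (item = 0; item < gws && more; item++) {
			char key[MAX_LEN + 1];

			for (p = 0; p < host_len; p++)
				key[p] = r[p].chars[idx[p]];
			key[len - 2] = key[len - 1] = '.'; /* placeholders */
			key[len] = 0;
			total += work_item(key, len, r, target, found);
			more = host_next(idx, host_len, r);
		}
		++*invocations;
	}
	return total;
}
```

With four positions that each range over "abc" and gws set to 4, the
host has 9 prefixes to hand out, so the sketch makes 3 invocations (4,
4, and 1 work-items), and each work-item tries 9 suffixes. The batch
boundaries fall wherever gws puts them, not at round points in the
keyspace.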