Date: Mon, 17 Jun 2013 09:19:14 +0530
From: Sayantan Datta <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Mask mode for GPU

On Saturday 15 June 2013 08:15 PM, Solar Designer wrote:
> For example, in a given kernel invocation the host might provide strings
> "aaaaaa.." to "aaaapq..", where ".." are any placeholder chars, which
> the GPU will overstrike anyway, and the GPU (separately in each
> work-item) will iterate from "aa" to "zz" in the last two character
> positions.  On the next kernel invocation, the host will start with
> "aaaapr..".  (By choosing the weird end/start of range for these two
> kernel invocations, except for the very first invocation's range start,
> I illustrate that they don't have to be at prettier looking points in
> the keyspace, and in practice they generally won't be.  Just whatever
> fits in the optimal GWS.)
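The host/GPU split described in the quote can be sketched in a few lines of Python. This is only a toy simulation of the scheme (the function names and the 4+2 position split are mine, not JtR code): the host enumerates consecutive prefixes, and each simulated work-item overstrikes the last two placeholder positions with every combination from "aa" to "zz".

```python
import itertools
import string

LOWER = string.ascii_lowercase

def gpu_side(prefix, n_fixed=2):
    """Simulate one work-item: iterate every combination in the
    last n_fixed positions, keeping the host-supplied prefix."""
    for tail in itertools.product(LOWER, repeat=n_fixed):
        yield prefix + ''.join(tail)

def host_side(start, count):
    """Simulate the host: yield `count` consecutive prefixes
    starting at `start`, treating the prefix as a base-26 counter."""
    digits = [LOWER.index(c) for c in start]
    for _ in range(count):
        yield ''.join(LOWER[d] for d in digits)
        # increment the prefix like a base-26 number
        for i in reversed(range(len(digits))):
            digits[i] += 1
            if digits[i] < 26:
                break
            digits[i] = 0

# One "kernel invocation": the host supplies the prefixes, and each
# work-item covers 26*26 = 676 candidates on its own.
prefixes = list(host_side('aaaa', 3))
candidates = [pw for p in prefixes for pw in gpu_side(p)]
```

As in the quote, the range boundaries need not fall at pretty points in the keyspace; `host_side` can start from any prefix and yield exactly as many prefixes as fit the optimal GWS.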

Generating a password for each work item on the host and repeating aa to zz in
each work item doesn't work very well with descrypt. There is a lot of
performance loss (30-40%) due to looping inside the kernel. But I think
this works great with md5 and other fast hashes (haven't tested it myself,
though). So for descrypt I think we should generate one key on the host
and generate the remaining ones on the GPU, with each work item generating a
different set of passwords.  Although this doesn't require any
specialized code outside the format, inside the format we need some
tweaking. One key issue with this is fitting the mask to the GWS.
For example, say we have a mask with the following character counts: 26 x 26
x 18 x 10 x 9 x 2 = 2190240. However, the optimal GWS is 2097152, so the mask
has a slightly higher key count. One thing we could do is halve the number
of keys while keeping the GWS intact, which is nearly 50% inefficient in
this case. Otherwise we could increase the GWS slightly to make it almost 100%
efficient. But we might still have extreme cases like 26 x 26 x 26 x 26 x 26 =
456976 x 26. In this case we either need to drop the GWS to 1/4, which would
cause a significant (nearly 20%) performance loss, or increase it by almost
5x, which again isn't very good from the perspective of responsiveness.
So the performance is quite dependent on what mask we are provided with.
Another approach could be dividing the mask into submasks, such as
(4 x 26 x 26 x 26 x 26) x 6 + 2 x 26 x 26 x 26 x 26.
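The submask arithmetic above can be sketched as a small Python helper. This is only an illustration of the idea, not JtR code, and it assumes the simplest case where only the first mask position is split: take as many characters of the first position per invocation as fit under the optimal GWS, and put the remainder in a final, smaller submask.

```python
def split_mask(counts, gws):
    """Split the first mask position into submasks so each kernel
    invocation handles at most `gws` keys. `counts` holds the
    per-position character counts; the product of positions 1..n-1
    is assumed to already fit within gws."""
    rest = 1
    for c in counts[1:]:
        rest *= c
    if rest > gws:
        raise ValueError("remaining positions alone exceed GWS")
    chunk = gws // rest          # first-position chars per invocation
    full, leftover = divmod(counts[0], chunk)
    subs = [chunk * rest] * full
    if leftover:
        subs.append(leftover * rest)
    return subs

# The 26^5 example from above with an optimal GWS of 2097152:
# 4 first-position chars fit per invocation, giving six submasks of
# 4 x 26^4 keys plus one of 2 x 26^4 keys.
subs = split_mask([26, 26, 26, 26, 26], 2097152)
```

For the 26^5 case this reproduces the (4 x 26 x 26 x 26 x 26) x 6 + 2 x 26 x 26 x 26 x 26 split: six invocations of 1827904 keys (about 87% of the optimal GWS) and one of 913952, instead of one oversized or several badly undersized invocations.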
