Date: Mon, 17 Jun 2013 10:10:35 +0530
From: Sayantan Datta <>
Subject: Re: Mask mode for GPU

On Monday 17 June 2013 09:49 AM, Solar Designer wrote:
> Sayantan,
> On Mon, Jun 17, 2013 at 09:19:14AM +0530, Sayantan Datta wrote:
>> >Generating password for each work item on host and repeating aa to zz in
>> >each work item doesn't works very good with descrypt. There is a lot of
>> >performance loss(30-40%) due to looping inside the kernel. But I think
>> >this works great with md5 and other fast hashes(haven't tested myself
>> >though).
> I don't see why it would result in extra overhead for descrypt (and
> compared to what).  Can you explain this in some detail, perhaps with
> examples?

In a small experiment I simply wrapped the kernel body in a 10-iteration
loop.  Nothing else was changed, yet I only got 20M c/s (with *pcount
updated accordingly), whereas I get around 80M c/s with a single
iteration.  I think the problem may be the OpenCL compiler trying to
unroll the 10 iterations, which overruns the instruction cache.

>> >So for descrypt I think we should generate one key on the host
>> >and generate the remaining on GPU with each work item generating a
>> >different set of password.
> I don't see how what you wrote here is different from the approach I had
> proposed.  You're talking about generating a set of passwords (not just
> one password) per work-item anyway.

In my approach I was only going to generate 32 keys per work item and
avoid any loop inside the kernel for descrypt.  In your approach we
would be generating 26*26 passwords in each work item, which requires
(26*26)/32 loop iterations inside the kernel per kernel invocation.
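
To make the arithmetic concrete: 26*26 = 676 candidates at 32 keys per
pass is 676/32 = 21.125, so 22 passes once rounded up, with the last pass
only partly filled.  The following is a minimal host-side sketch in
Python, not the actual OpenCL kernel.  The names CHARSET, KEYS_PER_PASS,
suffix_for, passes_needed and candidates are all hypothetical, and the
"base key plus two-letter suffix" layout is an assumption made purely
for illustration.

```python
import math
import string

CHARSET = string.ascii_lowercase   # assumed mask alphabet: a..z
KEYS_PER_PASS = 32                 # keys per work item per pass, as in the message


def suffix_for(index):
    """Map an index in 0..675 to the two-letter suffix 'aa'..'zz'."""
    hi, lo = divmod(index, len(CHARSET))
    return CHARSET[hi] + CHARSET[lo]


def passes_needed(total=len(CHARSET) ** 2, per_pass=KEYS_PER_PASS):
    """Number of 32-key passes needed to cover all suffixes (rounded up)."""
    return math.ceil(total / per_pass)


def candidates(base_key, pass_no):
    """Candidates one work item would cover in a given pass (hypothetical layout)."""
    total = len(CHARSET) ** 2
    start = pass_no * KEYS_PER_PASS
    end = min(start + KEYS_PER_PASS, total)
    return [base_key + suffix_for(i) for i in range(start, end)]


if __name__ == "__main__":
    print("passes per work item:", passes_needed())
    print("last pass:", candidates("pass", passes_needed() - 1))
```

The trade-off under discussion is whether those 22 passes run as a loop
inside one kernel invocation or as separate loop-free invocations of 32
keys each.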

