Date: Mon, 17 Jun 2013 10:10:35 +0530
From: Sayantan Datta <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Mask mode for GPU

On Monday 17 June 2013 09:49 AM, Solar Designer wrote:
> Sayantan,
>
> On Mon, Jun 17, 2013 at 09:19:14AM +0530, Sayantan Datta wrote:
>> >Generating a password for each work item on the host and repeating aa to zz
>> >in each work item doesn't work very well with descrypt. There is a lot of
>> >performance loss (30-40%) due to looping inside the kernel. But I think
>> >this works great with md5 and other fast hashes (haven't tested myself
>> >though).
> I don't see why it would result in extra overhead for descrypt (and
> compared to what). Can you explain this in some detail, perhaps with
> examples?

In a little experiment I simply put the kernel inside a 10-iteration loop.
Nothing else was changed, but somehow I was only able to get 20M c/s (with
updated *pcount), whereas I can get around 80M c/s with just one iteration.
I think it may be a problem with the OpenCL compiler trying to unroll the 10
iterations, causing i-cache overrun.

>> >So for descrypt I think we should generate one key on the host
>> >and generate the remaining on GPU with each work item generating a
>> >different set of passwords.
> I don't see how what you wrote here is different from the approach I had
> proposed. You're talking about generating a set of passwords (not just
> one password) per work-item anyway.

In my approach I was only going to generate 32 keys per work item and avoid
any loop inside the kernel for descrypt. Whereas in your approach we would
be generating 26*26 passwords per kernel, requiring (26*26)/32 kernel
iterations per kernel invocation.

Regards,
Sayantan
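
To make the arithmetic in the last paragraph concrete, here is a minimal host-side sketch in Python. It is not John the Ripper or OpenCL code. The names `candidates_for_work_item` and `batches_needed` and the base key "pass" are hypothetical, and rounding the batch count up (so that the final batch is only partly filled) is an assumption rather than something stated in the thread.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 letters, matching the aa..zz example
KEYS_PER_BATCH = 32                # keys per work item per pass, as in the message


def candidates_for_work_item(base_key, index_range):
    """Yield base_key plus a two-letter suffix for each index in index_range.

    Index 0 maps to "aa", index 1 to "ab", ..., index 675 to "zz".
    """
    n = len(ALPHABET)
    for i in index_range:
        yield base_key + ALPHABET[i // n] + ALPHABET[i % n]


def batches_needed(total, per_batch=KEYS_PER_BATCH):
    """Ceiling division: batches of per_batch keys needed to cover total keys."""
    return -(-total // per_batch)


# Full aa..zz suffix space for one host-supplied base key (hypothetical key).
total = len(ALPHABET) ** 2
num_batches = batches_needed(total)

# The first batch of 32 candidates one work item would handle without looping.
first_batch = list(candidates_for_work_item("pass", range(KEYS_PER_BATCH)))
```

Here `total` is 676 and `num_batches` is 22, since 676/32 = 21.125 and the last pass is only partly filled. That is the loop count per kernel invocation being contrasted with the single 32-key pass.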