Date: Tue, 10 Jul 2012 20:13:22 +0530
From: Sayantan Datta <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf_kernel.cl

On Tue, Jul 10, 2012 at 11:41 AM, Sayantan Datta <std2048@...il.com> wrote:

> On Tue, Jul 10, 2012 at 11:32 AM, Solar Designer <solar@...nwall.com> wrote:
>
>> This also makes sense. Are you committing this change? I think it
>> makes the code simpler, although it needs 3 extra registers per bcrypt
>> instance. We should have plenty of spare registers since we're
>> under-utilizing the GPU anyway (assuming that this OpenCL code is being
>> run on a GPU).
>
> Okay, I'll do it now. Also, I will start working on the combined global and
> local memory implementation today.
>
> Regards,
> Sayantan

http://www.openwall.com/lists/john-dev/2012/05/14/1

I was looking at the IL generated on the 7970 using LDS only. Each Encrypt
call has approximately 540 instructions at the IL level. However, according to
your previous estimate, each Encrypt call has 16*16+4+5 = 275 instructions,
resulting in an estimated speed of 52K c/s. Since the number of instructions
is roughly doubled, we should expect at least half of your previous estimate,
say roughly 26K c/s. But we are nowhere near that.

I guess your previous estimate assumed that each instruction takes 1 clock
cycle to execute, is that right? But it looks like not all instructions
require the same number of clock cycles on the GPU.

Sayantan
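
The scaling argument above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the message does) that speed is inversely proportional to instruction count, i.e. that every instruction costs the same number of cycles. The constant and function names are made up for illustration. Note that the expression 16*16+4+5 as written evaluates to 265; the sketch uses the 275 stated in the message, and either value leads to a figure in the neighborhood of 26K c/s.

```python
# Figures quoted in the message; names are hypothetical, chosen for this sketch.
ESTIMATED_INSTRUCTIONS = 275   # per Encrypt call, as stated in the message
ESTIMATED_SPEED = 52_000       # c/s, the earlier estimate
OBSERVED_INSTRUCTIONS = 540    # approximate IL instruction count on the 7970


def scaled_speed(base_speed, base_instructions, actual_instructions):
    """Scale a speed estimate, assuming speed is inversely proportional to
    instruction count (every instruction costs the same number of cycles)."""
    return base_speed * base_instructions / actual_instructions


ratio = OBSERVED_INSTRUCTIONS / ESTIMATED_INSTRUCTIONS
expected = scaled_speed(ESTIMATED_SPEED, ESTIMATED_INSTRUCTIONS,
                        OBSERVED_INSTRUCTIONS)

print(f"instruction ratio: {ratio:.2f}x")
print(f"expected speed:    {expected:.0f} c/s")
```

If the measured speed falls well below this figure, the uniform-cost assumption is the first thing to question, which is the point raised at the end of the message.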