Date: Mon, 21 Jul 2014 10:30:32 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: ZedBoard: bcrypt

Katja,

On Sun, Jul 20, 2014 at 04:10:01PM +0200, Katja Malvoni wrote:
> I'll go for 5 BRAMs/instance with storing initial S-box values across
> unused halves of 4 BRAMs holding S-boxes. This way, initialization will
> require 256 clock cycles.

Did you mean 512 clock cycles? When you're using only 4 (out of 5)
BRAMs for initial S-box values, and those BRAMs also contain the actual
S-boxes, you're limited to a total of 8 BRAM accesses per cycle. This
can be 4 reads and 4 writes. If so, with two bcrypt instances to
initialize and doing 4 writes per cycle, you need 512 cycles to write
them all (a total of 2048 32-bit values).

Some other split can help: since the initial S-box contents are the
same for the two instances, using an equal number of reads and writes
per cycle isn't optimal in terms of clock cycles needed (but might be
optimal in terms of simplicity and resource utilization).

> I'm storing 2 S-boxes in higher half of each of 4
> BRAMs. Initialization data is stored twice but I can copy it in parallel
> for both instances. I don't use additional BRAMs and although utilization
> will be higher, it won't impact max core count (wider buses were used in
> 112 instances approach and core count was limited by available BRAM).

I think you can copy in parallel for both instances while reading a
shared copy. No need to store (and read) the initial values twice per
BRAM for that.

Anyway, 512 clock cycles is low enough for our current experiments.
It's only 1.5% of total computation time for cost 5.

Alexander
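
The cycle counts discussed in this message can be checked with a small
back-of-the-envelope model. This is only a sketch, not a description of
the actual design: the function name init_cycles and its parameters are
made up for illustration, and the model assumes that any mix of reads
and writes fitting in the 8-access budget can be scheduled, ignoring
which BRAM port each access actually lands on.

```python
from math import ceil

# bcrypt (Blowfish) uses 4 S-boxes of 256 32-bit entries each.
SBOXES_PER_INSTANCE = 4
ENTRIES_PER_SBOX = 256
WORDS_PER_INSTANCE = SBOXES_PER_INSTANCE * ENTRIES_PER_SBOX  # 1024


def init_cycles(reads_per_cycle, writes_per_cycle, instances=2,
                shared_source=True, port_budget=8):
    """Estimate clock cycles needed to copy the initial S-box values.

    Hypothetical model: 4 dual-port BRAMs give port_budget = 8 accesses
    per cycle, and port placement constraints are ignored.
    """
    if reads_per_cycle < 1 or writes_per_cycle < 1:
        raise ValueError("need at least one read and one write per cycle")
    if reads_per_cycle + writes_per_cycle > port_budget:
        raise ValueError("reads + writes exceed the BRAM access budget")

    # Every instance needs its own full set of S-box words written.
    words_to_write = instances * WORDS_PER_INSTANCE
    # With a shared source copy, each initial value is read only once.
    words_to_read = WORDS_PER_INSTANCE if shared_source else words_to_write

    # Reads and writes proceed in parallel; the slower side dominates.
    return max(ceil(words_to_read / reads_per_cycle),
               ceil(words_to_write / writes_per_cycle))


if __name__ == "__main__":
    print(init_cycles(4, 4))  # equal split: 2048 writes / 4 per cycle
    print(init_cycles(3, 5))  # uneven split, reading a shared copy
```

Under these assumptions the 4 reads + 4 writes split is write-bound at
512 cycles for two instances, while an uneven split such as 3 reads + 5
writes would come in lower. Whether such a split is practical depends
on port placement and on the extra routing it needs.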