Date: Sun, 6 Jul 2014 13:20:20 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: ZedBoard: bcrypt

On 6 July 2014 12:55, Solar Designer <solar@...nwall.com> wrote:

> On Sun, Jul 06, 2014 at 11:22:48AM +0200, Katja Malvoni wrote:
> > On 6 July 2014 10:15, Solar Designer <solar@...nwall.com> wrote:
> > > I guess you're computing 64 bits per hash only, correct? This is
> > > sufficiently unlikely to cause false positives that we can go with it.
> >
> > That's correct. But I transfer 64 32-bit values per hash from FPGA after
> > computation is done because array being transferred contains structure
> > aligned in such way that higher address bits select bcrypt core so
> > everything is done with one call to memcpy(). I tried avoiding unnecessary
> > transfers but performance is a bit lower 3744 c/s, I assume because of
> > overhead of multiple calls to memcpy().
>
> Maybe you can use pairs of 32-bit integer or individual 64-bit integer
> reads in place of multiple memcpy()'s.

I'm not sure I understand. I'm using mmaped memory space to access bcrypt
logic so if I'm not mistaken, the only way I can read data from that space
is by copying it using memcpy(). Or there is another way to perform those
reads?

> In other words, the drop from 960 mV to around 890 mV corresponds to
> unreliable cracking, and you don't know what the voltage is when the
> cracking is reliable (which it is on my ZedBoard only), right?

On Parallella board, it is 960 mV (tried with lower core count which is
reliable).

> Perhaps you can achieve a higher clock rate by introducing an extra
> cycle of latency per data transfer during initialization and maybe
> during transfer of hashes to host as well? Anyway, maybe it's better
> to consider that after you've implemented initialization within each
> core as I suggested. It depends on which of these things you feel
> you'll have ready before your trip to the US.

I'm not sure that would help.
Routing delay is 90.4% of longest path delay and I can't use any frequency,
just ones that can be derived from PS. So the next one after 71.4 MHz is
76.9 MHz. With 90% of delay being routing I don't think it is possible to
improve logic to achieve 76.9 MHz. All these wires are connected to the same
AXI bus and distributed along the entire FPGA since the AXI bus must access
BRAMs and every bcrypt instance must access the same BRAMs. In this case,
the extra registers would need to be on the BRAM inputs and outputs, which
directly impacts bcrypt computation, namely delays when loading data from
S-boxes.

Katja
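
On the question of whether memcpy() is the only way to read from the mapped
space: a pointer returned by mmap() can also be dereferenced directly, so
individual 32-bit loads are possible. The following is only a minimal sketch
of that idea, not the actual host code. The function name read_hashes, the
constant WORDS_PER_CORE, and the assumption that the 64 hash bits sit in the
first two words of each core's 64-word window are all hypothetical; the real
layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each bcrypt core owns a window of 64 32-bit words,
 * so the higher address bits select the core (as described above). */
#define WORDS_PER_CORE 64u

/* Copy only two 32-bit words (64 bits of hash) per core.
 * 'base' is assumed to point at the start of the mapped region.
 * The volatile qualifier makes each access an actual load instead of
 * letting the compiler cache or merge the reads.
 * Assumption: the hash occupies words 0 and 1 of each window. */
void read_hashes(const volatile uint32_t *base, size_t cores,
                 uint32_t (*out)[2])
{
    for (size_t i = 0; i < cores; i++) {
        const volatile uint32_t *win = base + i * WORDS_PER_CORE;
        out[i][0] = win[0];
        out[i][1] = win[1];
    }
}
```

Here base would be the pointer obtained from mmap(), cast to
const volatile uint32_t *. Whether this is faster than one large memcpy()
would need to be measured on the board.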