Date: Mon, 14 Apr 2014 12:28:09 +0200
From: Katja Malvoni <>
Subject: Re: ZedBoard: bcrypt

On 13 April 2014 10:23, Solar Designer <> wrote:

> > Redesigning PS-PL communication resulted in improvement. I have working
> > design with 70 bcrypt cores. Performance is 2162 c/s on 71 MHz
> > frequency. 2 cycles are needed for one Blowfish round. Computation on
> > host is overlapped with computation on FPGA 5/6th of the time.
> Off-list, you had reported 67 cores at 71 MHz doing 1895 c/s when you
> had 4 cycles/round, or so I understood you (possibly incorrectly since
> this data was split across multiple e-mails).

That's correct.

> 70 cores would be doing
> something like 1895*70/67 = 1980 c/s.  At 2 cycles/round, the speed
> should be almost twice that, but you're reporting "only" 2162 c/s.  Why
> is that?

The problem is computation on host. Lines 649 to 681 in
0.016221s to compute while FPGA computation takes 0.011521s. Another
problem is the data transfers to/from the FPGA. Each transfer takes
around 0.006414s (these numbers are for cost 5). Currently, all the data is
prepared and transferred to the FPGA first, and only then is computation
started. When all 70 cores finish, the data is transferred back to the host.
I should change that so that cores start computing as soon as data is ready.
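
As a rough cross-check of the numbers above, here is a minimal sketch of the
per-batch timing. It assumes one hash per core per batch (70 hashes) and two
transfers per batch (one to the FPGA, one back); neither assumption is
confirmed above, and all function names are made up for illustration.

```python
# Measured values quoted above (cost 5, 70 cores, 71 MHz).
HOST_S = 0.016221      # host computation per batch
FPGA_S = 0.011521      # FPGA computation per batch
TRANSFER_S = 0.006414  # one transfer, assumed to apply per direction
CORES = 70             # assumed: one hash per core per batch


def rate(batch_seconds, hashes=CORES):
    """Hashes per second for a given time per batch."""
    return hashes / batch_seconds


def serial_batch_time(host, fpga, transfer):
    """Nothing overlaps: host work, transfer in, FPGA work, transfer out."""
    return host + 2 * transfer + fpga


def overlapped_batch_time(host, fpga, transfer):
    """Host work fully hidden behind FPGA work; transfers still serial."""
    return max(host, fpga) + 2 * transfer


def scaled_estimate(speed, cores_from, cores_to, cycles_from, cycles_to):
    """Ideal speed after changing core count and cycles per round."""
    return speed * cores_to / cores_from * cycles_from / cycles_to


lower = rate(serial_batch_time(HOST_S, FPGA_S, TRANSFER_S))
upper = rate(overlapped_batch_time(HOST_S, FPGA_S, TRANSFER_S))
ideal = scaled_estimate(1895, 67, 70, 4, 2)
print(f"no overlap:   {lower:.0f} c/s")
print(f"full overlap: {upper:.0f} c/s")
print(f"ideal scaling from the 4-cycle design: {ideal:.0f} c/s")
```

Under these assumptions the reported 2162 c/s falls between the no-overlap
and full-overlap figures, and both are well below the ideal scaling estimate,
which is consistent with host computation and transfers being the bottleneck.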

> It makes sense to start by running some more benchmarks, though: what
> speeds are you getting for 1 core (in FPGA), for the 2-cycle and 4-cycle
> versions?  What speeds are you getting for $2a$08 (reduces relative
> cost of host's computation by a factor of 8 compared to $2a$05)?

I can't test speed for 1 core at the moment - I'm not able to ssh to the
Zed system, the error I get is connection refused. The test fails on my
For the 4-cycle version, the $2a$08 speed is 425 c/s and the $2a$12 speed
is 32 c/s (tested on my ZedBoard; the self-test passes but not all
instances return correct results).
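
For comparison, a small sketch of how these two speeds relate if only the
bcrypt cost loop mattered. bcrypt's main loop runs 2^cost iterations, so
going from cost 8 to cost 12 should divide the speed by 16 when fixed
overhead is negligible. The helper names are illustrative only.

```python
def iterations(cost):
    """Number of iterations of bcrypt's main loop for a given cost."""
    return 1 << cost


def scaled_speed(speed, cost_from, cost_to):
    """Predicted c/s at cost_to, assuming time scales with 2**cost only."""
    return speed * iterations(cost_from) / iterations(cost_to)


predicted_12 = scaled_speed(425, 8, 12)
print(f"predicted $2a$12 speed: {predicted_12:.1f} c/s (measured: 32 c/s)")
```

The measured 32 c/s is somewhat above this prediction, which would fit a
fixed per-batch cost (host work, transfers) being amortized better at higher
cost settings. Since not all instances returned correct results, these
figures should be read with caution.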

> > Code: git clone -b master
> Can you also post a summary of what work is done on those two cycles?

cycle 0: compute tmp; initiate 2 S-box lookups
cycle 1: compute new R, L; initiate 2 S-box lookups; initiate P-box lookup
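
One way to read this schedule is sketched below in software. It is an
interpretation, not the actual HDL: it assumes "tmp" is the sum of the first
two S-box outputs and that only two S-box reads can be issued per cycle. The
tables are pseudo-random placeholders, not real Blowfish state, and the
function names are hypothetical. The check is that splitting F into two
lookup phases gives the same result as the textbook round.

```python
import random

MASK = 0xFFFFFFFF


def make_tables(seed=0):
    """Placeholder P-array and S-boxes (real Blowfish derives these from
    the digits of pi and the key schedule)."""
    rng = random.Random(seed)
    p = [rng.getrandbits(32) for _ in range(18)]
    s = [[rng.getrandbits(32) for _ in range(256)] for _ in range(4)]
    return p, s


def round_reference(l, r, p_i, s):
    """Textbook Blowfish round: L ^= P[i]; R ^= F(L); swap halves."""
    l ^= p_i
    f = (s[0][l >> 24] + s[1][(l >> 16) & 0xFF]) & MASK
    f ^= s[2][(l >> 8) & 0xFF]
    f = (f + s[3][l & 0xFF]) & MASK
    return r ^ f, l


def round_two_phase(l, r, p_i, s):
    """Same round with the four S-box reads split into two pairs."""
    # p_i stands in for the P-box read started during the previous round.
    l ^= p_i
    # First pair of S-box reads.
    a = s[0][l >> 24]
    b = s[1][(l >> 16) & 0xFF]
    # Combine the first pair into tmp and issue the second pair of reads.
    tmp = (a + b) & MASK
    c = s[2][(l >> 8) & 0xFF]
    d = s[3][l & 0xFF]
    # Finish F and produce the new halves (swapped, as in the reference).
    f = ((tmp ^ c) + d) & MASK
    return r ^ f, l


p, s = make_tables()
rng = random.Random(1)
state = (rng.getrandbits(32), rng.getrandbits(32))
ref = two = state
for i in range(16):
    ref = round_reference(ref[0], ref[1], p[i], s)
    two = round_two_phase(two[0], two[1], p[i], s)
print("two-phase schedule matches reference:", ref == two)
```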

> Are you still getting correct results on my ZedBoard only, but not on
> yours (needing a lower core count for yours)?  And not on Parallella
> board either?  I suspect the limited power / core voltage drop issue.
> At 1.0 V core voltage, even a (peak) power usage of just 1.0 W means a
> current of 1.0 A, so if e.g. a PCB trace has impedance of 0.1 Ohm (I
> think this is too high, but not unrealistic) we might have a voltage
> drop of 0.1 V right there, and that's 10% of total.  That's not even
> considering limitations of the voltage regulator.  (I am assuming that
> there's no voltage sense going back from the FPGA to the voltage
> regulator.  I think there is not.)

That's correct. I'm not getting correct results on my boards. I've tried
using a 12V/8A PSU instead of the 12V/3A one on the ZedBoard, but that
didn't help. I'm
also having problems with gcc crashes (every time it crashes on different
"/usr/lib/gcc/arm-linux-gnueabihf/4.6/include/arm_neon.h:7348:1: internal
compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.6/README.Bugs> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem."

> As discussed off-list, I think you should also proceed with ztex board.
> You mentioned that the documentation wasn't of sufficient help for you
> to get communication going, right?  If so, suggest that you work
> primarily from working code examples, such as those for Bitcoin and
> Litecoin mining, as well as with the vendor's SDK examples.

Somehow I missed the link to the EZ-USB FX2 Technical Reference Manual. I
found some answers there and I hope to find other answers in the code
examples.


