Date: Mon, 21 Apr 2014 23:46:35 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: ZedBoard: bcrypt

Hi Alexander,

On 14 April 2014 13:53, Solar Designer <solar@...nwall.com> wrote:

> I think it might make sense to interleave multiple instances of bcrypt
>  per core until you're making full use of all BRAM ports for computation.
>
> With 4 bcrypt instances per core, you need 20 reads per round.  With 2
> cycles/round, that's 10 reads per cycle, needing 5 BRAMs.  Maybe you can
> have:
>
> Cycle 0:
> initiate S0, S1 lookups for instances 0, 1 (total: 4 lookups)
> initiate S2, S3 lookups for instances 2, 3 (total: 4 lookups)
> initiate P lookups for instances 0, 1 (total: 2 lookups)
> (total: 10 lookups)
> Cycle 1:
> initiate S2, S3 lookups for instances 0, 1 (total: 4 lookups)
> initiate S0, S1 lookups for instances 2, 3 (total: 4 lookups)
> initiate P lookups for instances 2, 3 (total: 2 lookups)
> (total: 10 lookups)
>
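
The arithmetic in the quoted schedule can be checked with a minimal Python
sketch. The names SCHEDULE, PORTS_PER_BRAM and the helper functions are
illustrative only, and two read ports per BRAM is an assumption implied by
"10 reads per cycle, needing 5 BRAMs"; this is not the actual hardware
description.

```python
# Lookup schedule from the quoted message: per cycle, a list of
# (tables, instances) groups. Names here are hypothetical.
SCHEDULE = {
    0: [(("S0", "S1"), (0, 1)), (("S2", "S3"), (2, 3)), (("P",), (0, 1))],
    1: [(("S2", "S3"), (0, 1)), (("S0", "S1"), (2, 3)), (("P",), (2, 3))],
}
PORTS_PER_BRAM = 2  # assumption: each BRAM serves two reads per cycle


def lookups_per_cycle(schedule):
    """Count the lookups initiated in each cycle."""
    return {
        cycle: sum(len(tables) * len(instances) for tables, instances in groups)
        for cycle, groups in schedule.items()
    }


def brams_needed(schedule, ports_per_bram=PORTS_PER_BRAM):
    """BRAMs required to serve the busiest cycle (ceiling division)."""
    peak = max(lookups_per_cycle(schedule).values())
    return -(-peak // ports_per_bram)


def coverage(schedule):
    """How many times each (instance, table) pair is read per round."""
    seen = {}
    for groups in schedule.values():
        for tables, instances in groups:
            for inst in instances:
                for table in tables:
                    seen[(inst, table)] = seen.get((inst, table), 0) + 1
    return seen


if __name__ == "__main__":
    per_cycle = lookups_per_cycle(SCHEDULE)
    print("lookups per cycle:", per_cycle)
    print("reads per round:", sum(per_cycle.values()))
    print("BRAMs needed:", brams_needed(SCHEDULE))
```

Each of the 4 instances reads S0..S3 and P exactly once per 2-cycle round,
so both cycles keep all ports busy under the stated assumption.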

Last week I had some other obligations, but today I finally found
time to implement this: 28 cores fit (112 instances), limited by BRAMs.
Utilisation is:

Number of Slice Registers:      13,445 out of 106,400   12%
Number of Slice LUTs:           47,953 out of  53,200   90%
Number of occupied Slices:      13,066 out of  13,300   98%
Number of RAMB36E1/FIFO36E1s:      140 out of     140  100%

On my ZedBoard, I get 1855 c/s in the self-test, which is lower than with
70 cores because of data transfers. I'm not able to test this on the Zed
system: I think that system reboots when I try to run the self-test. The
connection is closed, and when I ssh in again there is no /dev/xdevcfg file
and I have to reload the bitstream.
When I try other tests on my ZedBoard, it locks up.

I think it's reasonable to stop optimising at this point. Having 6
instances/7 BRAMs per core would allow fitting 120 instances, but since 112
instances draw too much power, 120 won't work properly either. I'll try
to generate bitstreams for a lower number of cores; I hope it will work for
a number larger than 70.
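
For reference, the instance counts above follow from the 140 available
BRAMs. A minimal sketch of that calculation (the function name capacity is
hypothetical, and it considers only BRAMs, ignoring the LUT and slice limits
and the power draw):

```python
TOTAL_BRAMS = 140  # RAMB36E1 blocks available, from the utilisation report


def capacity(instances_per_core, brams_per_core, total_brams=TOTAL_BRAMS):
    """Return (cores, instances) that fit when BRAMs are the only limit."""
    cores = total_brams // brams_per_core
    return cores, cores * instances_per_core


if __name__ == "__main__":
    # Current design: 4 instances sharing 5 BRAMs per core.
    print("4 instances / 5 BRAMs:", capacity(4, 5))
    # Alternative mentioned above: 6 instances sharing 7 BRAMs per core.
    print("6 instances / 7 BRAMs:", capacity(6, 7))
```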

Katja
