Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 14 Apr 2014 16:38:57 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: ZedBoard: bcrypt

On Mon, Apr 14, 2014 at 03:53:50PM +0400, Solar Designer wrote:
> With 4 bcrypt instances per core, you need 20 reads per round.  With 2
> cycles/round, that's 10 reads per cycle, needing 5 BRAMs.  Maybe you can
> have:
> 
> Cycle 0:
> initiate S0, S1 lookups for instances 0, 1 (total: 4 lookups)
> initiate S2, S3 lookups for instances 2, 3 (total: 4 lookups)
> initiate P lookups for instances 0, 1 (total: 2 lookups)
> (total: 10 lookups)
> Cycle 1:
> initiate S2, S3 lookups for instances 0, 1 (total: 4 lookups)
> initiate S0, S1 lookups for instances 2, 3 (total: 4 lookups)
> initiate P lookups for instances 2, 3 (total: 2 lookups)
> (total: 10 lookups)
> 
> with the computation also spread across the two cycles as appropriate
> (and maybe you can reuse the same 32-bit adders across bcrypt instances,
> although the cost of extra MUXes is likely to kill the advantage).

BTW, in the example above the data fits into different BRAMs in an
intuitive manner: each one of the four bcrypt instances can have its
S-boxes in its own BRAM block, with nothing else in it.  The four
P-boxes all fit in a fifth BRAM block.  So this may be the winning
approach in terms of simplicity and elegance.

Alexander

Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ