Date: Sun, 5 Jun 2011 03:32:27 +0400
From: Solar Designer <solar@...nwall.com>
To: crypt-dev@...ts.openwall.com
Subject: Re: alternative approach

David,

On Mon, May 30, 2011 at 10:52:44AM -0700, David Hulton wrote:
> There is a little bit of overhead but not much, most of the space is
> taken up by the S-Boxes. We currently have 36 fully pipelined DES
> cores on our 22 billion/sec image. Each core runs at 600MHz.

Wow.  This is where the constant S-boxes and ability to do the 8 S-box
lookups in parallel help.  We're not going to achieve clock rates this
high for our Blowfish-like stuff... although I might try to come up with
a Blowfish-like construct having more parallelism in it.
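
As a rough sanity check on the figures quoted above, here is a minimal
sketch of the throughput arithmetic.  It assumes that each fully
pipelined core produces one result per clock cycle; that is an
assumption on my part, and the function name is made up for
illustration.

```python
def aggregate_rate(cores, clock_hz, results_per_clock=1.0):
    """Peak results per second for a set of identical pipelined cores.

    Assumes (hypothetically) that every core delivers results_per_clock
    results on each cycle and that nothing else limits throughput.
    """
    return cores * clock_hz * results_per_clock


# Figures quoted in the thread: 36 cores, 600 MHz each.
des_rate = aggregate_rate(36, 600e6)
print(f"{des_rate / 1e9:.1f} billion/sec")  # prints "21.6 billion/sec"
```

Under that assumption the product is 21.6 billion/sec, which is in line
with the "22 billion/sec" quoted for the image.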

> > Given the numbers Yuri posted, it appears that a XC6VLX240T would
> > outperform Core i7-2600 at bflike by a factor of 200.  Isn't this 5x
> > better than the 40x we had for DES?  This ignores the overhead, though.
> > But on the other hand, there's further room for improvement (add bit
> > permutations, which will slow down software).
> 
> That would be great if we could get a better advantage than DES with
> this bflike algorithm. Was this verified by actually synthesizing
> something close to the proposed algorithm?

Only partially: we looked at the device utilization of one synthesized
bflike core and concluded that we'd be able to fit hundreds of those
cores per chip.  However, this estimate ignores overhead (routing,
combining the cores' outputs).

Yuri posted the synthesis results to this list - I'd appreciate it if
you took a look and provided your comments.

> > That's what we're trying to do, and we'd appreciate your help with it -
> > specifically, more info on your DES cores, and some advice on
> > generating and understanding a circuit diagram or the like.
> 
> Sure thing. Do we currently have a basic implementation to work from?

Yes, we have this bflike thing in C and Verilog.  It is an early draft,
and it is just one core with no surrounding logic, but we can use it for:

> I think the best thing to do at this point is to synthesize and try
> tweaking to see which design is the most space/speed efficient.

That's what we're doing.

We might end up with two kinds of bflike cores: those using BlockRAMs
and those using the remaining logic (after we're out of BlockRAMs).
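
A minimal sketch of how such a two-kind split could be budgeted,
assuming BlockRAM-based cores are placed first and logic-only cores
then fill whatever LUTs remain.  The function name, the parameters, and
all numbers in the example are placeholders, not figures for any real
device or for our actual cores, and routing overhead is ignored.

```python
def plan_cores(total_brams, total_luts,
               brams_per_core, luts_per_bram_core, luts_per_logic_core):
    """Return (bram_cores, logic_cores) for a simple greedy split.

    BlockRAM-based cores are limited by both BlockRAMs and the LUTs
    they still need; logic-only cores use the LUTs left over.
    """
    bram_cores = min(total_brams // brams_per_core,
                     total_luts // luts_per_bram_core)
    remaining_luts = total_luts - bram_cores * luts_per_bram_core
    logic_cores = remaining_luts // luts_per_logic_core
    return bram_cores, logic_cores


# Placeholder numbers only, chosen to make the arithmetic easy to follow.
print(plan_cores(400, 150000, 4, 300, 2000))  # prints "(100, 60)"
```

With these placeholder inputs the BlockRAMs run out at 100 cores, and
the remaining 120000 LUTs hold 60 logic-only cores.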

Thanks,

Alexander
