Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 22 Dec 2011 18:40:38 +0400
From: Solar Designer <>
Subject: Re: Bit slice implementation of DES based hashes

On Thu, Dec 22, 2011 at 05:21:51PM +0530, piyush mittal wrote:
> > Each bit of what?
> Each bit of processor is what I am talking of,like in 64 bit processor each
> 64 bit will behave as different processor.

"each 64 bit"?

> According to Eli Biham's paper,In Des Bitslice implementation he considered
> processor as a SIMD computer,ie as 64 parallel one bit processor computing
> the same instruction.


> > "To further increase the speed"?  Would there be any speed increase at
> > all (compared to a non-bitslice implementation) without this?  And what
> > alternative(s) did we have?
> Yes.Speed has been increased,like in his paper he told:

Sure, but that's not what I asked you above.  You used the word
"further", implying that there would be some speed increase (just a
smaller increase) even without representation of the S-boxes as
Boolean expressions.  In turn, this implies that such bitslice
implementations are possible.  This made me ask (indirectly): are they
possible?  How?  And would they provide any speed increase at all?
If you don't know the answers to these, and you clearly don't, then your
statement about a "further" speed increase was not justified.

Then, it might in fact help you understand this stuff better if you try
to come up with answers to these related questions.  They are actually
pretty difficult.

> In DES, the input of each S box depends only on the output
> of only six S boxes in the previous round. Thus, the code can be optimised
> to start computing the next round while still computing the preceding one.
> This can speed up implementations on pipelined processors, where we can
> compute several instances in parallel.

Yes, but this is a minor and secondary effect, it is by far not the
primary reason for speedup.  Moreover, when there are enough registers
(or when cached memory accesses are fast enough), similar extra
instruction level parallelism can be achieved without bitslicing by
mixing instructions from two non-bitslice implementations of DES (for
two different keys or/and input blocks).  (But no one bothered to do
that, as far as I'm aware, because bitslicing is so much faster.)

Where does the major speed increase from bitslicing DES come from?

> Truly speaking I am not getting how B and K is being initialised in JTR

For a while let's speak of bitslice DES in general, not referring to JtR
specifically.  B[] is a 64-element array, and K[] is a 56-element array.
Each element is an N-bit machine word.  Let's say we have only one DES
block to encrypt with one key.  It is non-optimal to use a bitslice
implementation (unless on a true 1-bit machine) for that purpose, but
let's say we're going to do that anyway.  Where does bit 0 of the block
go?  And where does bit 0 of the key go?  Ditto, say, for bit 11 of each.
Please try to answer these questions.  I tried to make them simpler by
introducing these extra specifics and by moving away from JtR.

Then, supposed we have two DES blocks to encrypt with two different keys.
Where does bit 0 of the second DES block to encrypt go?  Where does bit
0 of the second DES key go?  Ditto, say, for bit 11 of each.

> code according to the architecture of processor specified.And if I will be
> able to get it then i can also  complete my conversions.However I read its
> corresponding header file but then also I am not getting.Even I tried to
> take some idea from DES_bs_get_hash() function where we are fetching hash
> from B but I am not getting what "B[0] DEPTH" represents and why we have
> taken b[0] to b[19].

Let's revisit this later, or maybe it'll become obvious to you after you
answer the non-JtR questions above.


Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ