Date: Tue, 30 Jul 2013 20:34:29 +0400
From: Solar Designer <>
Subject: Re: Parallella: bcrypt


On Tue, Jul 30, 2013 at 06:19:57PM +0400, Solar Designer wrote:
> Here's another idea: replace the AND, not the right shift.  You can
> replace one AND with two IMULs - e.g., to extract the byte at bit offset
> 16, you can IMUL by 0x100, then right shift by 24, then IMUL by 4 (to
> get the 8 data bits into bit offsets 2 to 9 as we need for a load).  Can
> you have both IMULs for free with 2x interleave, or would you have to go
> for 3x?  In the latter case, you wouldn't be able to preload one of
> three P arrays, which would defeat the purpose of this new trick for one
> of two byte extracts - but we'd nevertheless potentially save a cycle on
> the other byte extract.

In terms of register usage for the constants, at first it feels like
you'd need two more, for 0x100 and 0x10000.  However, instead of using
0x100 with IMUL, you may reuse the existing 0xff constant (which you
need for the AND of byte 0) with IMADD: adding x * 0xff to a register
that already holds x gives x * 0x100.  And if you replace all of the
AND 0x3fc's with this new approach, you would no longer need a register
with the 0x3fc constant.  Thus, this approach potentially needs no
extra registers for the constants.
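
To make the arithmetic above concrete, here is a minimal C sketch of the
byte extract at bit offset 16, done three ways.  It is only a model of the
integer math, not Epiphany code: the function names are made up for this
illustration, and it assumes that IMUL keeps the low 32 bits of the product
and that IMADD computes Rd = Rd + Rn * Rm.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: shift, then AND with 0x3fc, leaving the byte in bits 2 to 9. */
static uint32_t index_and(uint32_t x)
{
    return (x >> 14) & 0x3fcu;
}

/* IMUL variant: multiply by 0x100, shift right by 24, multiply by 4. */
static uint32_t index_imul(uint32_t x)
{
    uint32_t t = x * 0x100u;  /* low 32 bits only, so the top byte drops out */
    t >>= 24;                 /* byte from bit offset 16 is now in bits 0 to 7 */
    return t * 4u;            /* scale to bits 2 to 9 for use as a load offset */
}

/* IMADD variant (assumed semantics: acc = acc + a * b), reusing 0xff. */
static uint32_t index_imadd(uint32_t x)
{
    uint32_t t = x;           /* accumulator starts out holding x */
    t += x * 0xffu;           /* x + x * 0xff == x * 0x100 (mod 2^32) */
    t >>= 24;
    return t * 4u;
}
```

All three functions should agree for any 32-bit input.  Whether the two
multiplies actually come for free depends on the interleaving question in
the quoted text, which this sketch does not address.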

