Date: Thu, 1 Aug 2013 17:15:12 +0200
From: Katja Malvoni <>
Subject: Re: Parallella: bcrypt

Hi Alexander,

On Wed, Jul 31, 2013 at 4:06 AM, Solar Designer <> wrote:

> > On Tue, Jul 30, 2013 at 4:19 PM, Solar Designer <> wrote:
> > > Here's another idea: replace the AND, not the right shift.  You can
> > > replace one AND with two IMULs - e.g., to extract the byte at bit offset
> > > 16, you can IMUL by 0x100, then right shift by 24, then IMUL by 4 (to
> > > get the 8 data bits into bit offsets 2 to 9 as we need for a load).  Can
> > > you have both IMULs for free with 2x interleave, or would you have to go
> > > for 3x?  In the latter case, you wouldn't be able to preload one of
> > > three P arrays, which would defeat the purpose of this new trick for one
> > > of two byte extracts - but we'd nevertheless potentially save a cycle on
> > > the other byte extract.
> Perhaps you can try implementing the trick for the byte at bit offsets
> 16 to 23, where you can reuse the existing 0xff constant with IMADD?

Unfortunately, the problem with IMADD appears again. The register used as the
result is also added to the product, so I need to move L to tmp first and then
use the trick. Alternatively, I could use another register for the constant,
but in that case I can't fully preload both P arrays.
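
For reference, here is a rough C model of the byte extract discussed above.
It is only a sketch of the arithmetic, not the actual Epiphany assembly. The
function names and the imadd() helper are made up for illustration, and
imadd() assumes IMADD computes "destination += a * b", as described in this
message. It compares the usual shift/AND extract with the two-multiply
variant and with the IMADD variant that reuses the 0xff constant
(L + L * 0xff == L * 0x100).

```c
#include <assert.h>
#include <stdint.h>

/* Reference: byte at bit offsets 16..23, moved to bit offsets 2..9. */
uint32_t extract_and(uint32_t L)
{
    return ((L >> 16) & 0xffu) << 2;
}

/* Same result with two multiplies and one shift, no AND. */
uint32_t extract_imul(uint32_t L)
{
    uint32_t t = L * 0x100u; /* wraps mod 2^32: wanted byte is now in bits 24..31 */
    t >>= 24;                /* only the wanted byte is left */
    return t * 4u;           /* scale into bit offsets 2..9 */
}

/* Hypothetical model of IMADD: the destination is also an addend. */
uint32_t imadd(uint32_t dest, uint32_t a, uint32_t b)
{
    return dest + a * b;
}

/* IMADD variant: reuses the 0xff constant, but needs a copy of L first. */
uint32_t extract_imadd(uint32_t L)
{
    uint32_t tmp = L;           /* the extra move: L itself must survive */
    tmp = imadd(tmp, L, 0xffu); /* L + L * 0xff == L * 0x100 (mod 2^32) */
    tmp >>= 24;
    return tmp * 4u;
}

/* Check that all three variants agree on a few sample values. */
void self_check(void)
{
    const uint32_t samples[] = { 0u, 0x12345678u, 0x00ff0000u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(extract_imul(samples[i]) == extract_and(samples[i]));
        assert(extract_imadd(samples[i]) == extract_and(samples[i]));
    }
}
```

The copy into tmp in extract_imadd() is the extra move mentioned above:
without it, the multiply-add would overwrite L.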

