Date: Tue, 30 Jul 2013 18:19:57 +0400
From: Solar Designer <>
Subject: Re: Parallella: bcrypt

Katja -

On Thu, Jul 25, 2013 at 08:06:48AM +0400, Solar Designer wrote:
> Maybe you'll come up with another clever/crazy idea on how to do right
> shifts with Epiphany's FPU instructions (like I mentioned, replacing one
> right shift with multiple FPU instructions is OK).

Here's another idea: replace the AND, not the right shift.  You can
replace one AND with two IMULs - e.g., to extract the byte at bit offset
16, you can IMUL by 0x100 (which pushes the higher byte out of the
32-bit result), then right shift by 24, then IMUL by 4 (to get the 8
data bits into bit offsets 2 to 9 as we need for a load).  Can you have
both IMULs for free with 2x interleave, or would you have to go for 3x?
In the latter case, you wouldn't be able to preload one of the three P
arrays, which would defeat the purpose of this new trick for one of the
two byte extracts - but we'd nevertheless potentially save a cycle on
the other byte extract.
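
To make the arithmetic concrete, here is a minimal sketch in plain C
(not Epiphany assembly) of the byte extract described above.  The
function names are hypothetical, and the shift-and-AND baseline is only
my assumption of what the conventional form looks like; the point is
merely that both forms yield the same scaled index.

```c
#include <stdint.h>

/* Assumed conventional form: one shift, then an AND with a mask that is
 * already scaled by 4, leaving the byte in bit offsets 2 to 9. */
static uint32_t extract_and(uint32_t x)
{
    return (x >> 14) & 0x3fcu;
}

/* Form from the text: the AND is replaced with two integer multiplies. */
static uint32_t extract_imul(uint32_t x)
{
    uint32_t t = x * 0x100u; /* bits 16..23 move to 24..31; the byte
                                above them overflows out of 32 bits */
    t >>= 24;                /* logical shift: byte is now in bits 0..7 */
    return t * 4u;           /* scale to bit offsets 2 to 9 */
}
```

For example, with x = 0x12345678 the byte at bit offset 16 is 0x34, and
both functions return 0x34 * 4 = 0xd0.  The right shift has to be a
logical one; an arithmetic shift would sign-extend the top bit.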

I think you can try using this trick with 2x interleave - perhaps it's
usable in some places, but maybe not in all (two IMULs mean needing an
8-cycle gap between where the input becomes available and where you use
the result).

