Date: Thu, 27 Jun 2013 11:35:05 -0400
From: Yaniv Sapir <yaniv@...pteva.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

FWIW, looking at the disassembly, the loop there spans ~50 instructions,
so one iteration should take on the order of 50 cycles (stalls due to
dependencies are probably balanced by instruction multi-issue). If this is
right, then unrolling the loop further will yield only a marginal gain,
since the branch penalty is 4-5 cycles, i.e. roughly 8-10% of an iteration.
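
To make the estimate concrete, here is a minimal sketch of the arithmetic.
The helper name unroll_gain_bound is hypothetical, and it assumes that the
~50-cycle iteration time already includes the branch penalty and that full
unrolling removes nothing but that penalty:

```c
#include <assert.h>

/*
 * Upper bound on the fraction of loop time that unrolling could save.
 * Assumption (not measured): the only cost removed is the branch penalty,
 * and iter_cycles already includes it.
 */
static double unroll_gain_bound(double iter_cycles, double branch_penalty)
{
    assert(iter_cycles > 0.0 && branch_penalty >= 0.0);
    return branch_penalty / iter_cycles;
}
```

With the figures above, unroll_gain_bound(50.0, 4.0) and
unroll_gain_bound(50.0, 5.0) come out to 0.08 and 0.10, so the bound is
about 8-10%. Actual results depend on how well the compiler schedules the
unrolled code.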

On Thu, Jun 27, 2013 at 11:15 AM, Solar Designer <solar@...nwall.com> wrote:

> Katja,
>
> On Thu, Jun 27, 2013 at 04:54:31PM +0200, Katja Malvoni wrote:
> >  26c:    905f 4806     lsl r20,r20,0x2
> >  270:    2456          lsl r1,r1,0x2
>
> We should be able to avoid needing these instructions if you pick the
> version of the BF_ROUND macro that's intended for archs without scaled
> index on loads.  The crypt_blowfish.c file in musl doesn't include it,
> so you'll need to take it from our separate crypt_blowfish distribution.
>
> In fact, here it is:
>
> /* Architectures with no complicated addressing modes supported */
> #define BF_INDEX(S, i) \
>         (*((BF_word *)(((unsigned char *)S) + (i))))
> #define BF_ROUND(L, R, N) \
>         tmp1 = L & 0xFF; \
>         tmp1 <<= 2; \
>         tmp2 = L >> 6; \
>         tmp2 &= 0x3FC; \
>         tmp3 = L >> 14; \
>         tmp3 &= 0x3FC; \
>         tmp4 = L >> 22; \
>         tmp4 &= 0x3FC; \
>         tmp1 = BF_INDEX(data.ctx.S[3], tmp1); \
>         tmp2 = BF_INDEX(data.ctx.S[2], tmp2); \
>         tmp3 = BF_INDEX(data.ctx.S[1], tmp3); \
>         tmp3 += BF_INDEX(data.ctx.S[0], tmp4); \
>         tmp3 ^= tmp2; \
>         R ^= data.ctx.P[N + 1]; \
>         tmp3 += tmp1; \
>         R ^= tmp3;
>
> Another optimization to try is unrolling more rounds.  The loop in
> musl's BF_encrypt() unrolls only two rounds, but it has that "#if 0"
> block with all 16 unrolled - would the code still fit if you change it
> to "#if 1"?  Perhaps it would.
>
> Alexander
>
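
For reference, the pre-scaled byte offsets in the quoted BF_ROUND macro can
be checked against ordinary array indexing, which is where the extra
"lsl ...,0x2" instructions come from. This is only a sketch: f_prescaled,
f_indexed and check_equivalence are hypothetical names, BF_word is assumed
to be a 32-bit unsigned type, the S-boxes are filled with arbitrary test
values rather than the real Blowfish tables, and the P-array XOR is left
out:

```c
#include <stdint.h>

typedef uint32_t BF_word; /* assumption: 32-bit word */

/* Same idea as the quoted macro: index by byte offset, no scaling needed */
#define BF_INDEX(S, i) \
        (*((BF_word *)(((unsigned char *)(S)) + (i))))

/* S-box part of the round using byte offsets that are already multiplied by 4 */
static BF_word f_prescaled(BF_word S[4][256], BF_word L)
{
        BF_word tmp1 = (L & 0xFF) << 2;
        BF_word tmp2 = (L >> 6) & 0x3FC;
        BF_word tmp3 = (L >> 14) & 0x3FC;
        BF_word tmp4 = (L >> 22) & 0x3FC;

        tmp1 = BF_INDEX(S[3], tmp1);
        tmp2 = BF_INDEX(S[2], tmp2);
        tmp3 = BF_INDEX(S[1], tmp3);
        tmp3 += BF_INDEX(S[0], tmp4);
        tmp3 ^= tmp2;
        tmp3 += tmp1;
        return tmp3;
}

/* Same computation with plain element indices; on an arch without scaled
 * index loads, each of these lookups needs its own shift by 2 */
static BF_word f_indexed(BF_word S[4][256], BF_word L)
{
        return ((S[0][L >> 24] + S[1][(L >> 16) & 0xFF])
                ^ S[2][(L >> 8) & 0xFF]) + S[3][L & 0xFF];
}

/* Returns 1 if both variants agree on a handful of inputs, 0 otherwise */
static int check_equivalence(void)
{
        static BF_word S[4][256];
        static const BF_word inputs[] = {
                0x00000000u, 0xFFFFFFFFu, 0x12345678u, 0x80000001u, 0xDEADBEEFu
        };
        unsigned int i, j;

        /* Arbitrary fill, NOT the real Blowfish S-box constants */
        for (i = 0; i < 4; i++)
                for (j = 0; j < 256; j++)
                        S[i][j] = i * 0x01010101u + j * 2654435761u;

        for (i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++)
                if (f_prescaled(S, inputs[i]) != f_indexed(S, inputs[i]))
                        return 0;
        return 1;
}
```

Since (L >> 6) & 0x3FC equals ((L >> 8) & 0xFF) << 2, and likewise for the
other bytes, the shift and the scaling are folded into a single shift plus
mask per lookup.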



-- 
===========================================================
Yaniv Sapir
Adapteva Inc.
1666 Massachusetts Ave, Suite 14
Lexington, MA 02420
Phone: (781)-328-0513 (x104)
Email: yaniv@...pteva.com
Web: www.adapteva.com
============================================================
CONFIDENTIALITY NOTICE: This e-mail may contain information
that is confidential and proprietary to Adapteva, and Adapteva hereby
designates the information in this e-mail as confidential. The information
is intended only for the use of the individual or entity named above. If you
are not the intended recipient, you are hereby notified that any disclosure,
copying, distribution or use of any of the information contained in this
transmission is strictly prohibited and that you should immediately destroy
this e-mail and its contents and notify Adapteva.
==============================================================
