Date: Wed, 31 Jul 2013 06:37:01 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja,

On Wed, Jul 31, 2013 at 06:19:08AM +0400, Solar Designer wrote:
> Another thing I noticed is that you're not yet using LDRD to preload P's
> (instead, you preload the elements one by one).  I think you should.

You may need to adjust the alignment of P for that, to make P[1] rather
than P[0] quadword aligned.
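
One possible way to get that alignment, sketched in C under assumptions: the struct name, the pad field and the variable below are hypothetical and not taken from the actual code, and the aligned attribute is a GCC extension. Presumably the reason is that the Blowfish rounds consume the P elements in pairs starting at P[1], so each 64-bit load has to start at an odd index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout (names are illustrative only).
 * The struct itself is 8-byte aligned and one unused 32-bit word
 * precedes P, so P[0] lands at offset 4 and P[1] at offset 8.
 */
struct bf_p_block {
    uint32_t pad;    /* unused word that shifts P by 4 bytes */
    uint32_t P[18];  /* Blowfish P-array: 18 32-bit subkeys */
} __attribute__((aligned(8)));

struct bf_p_block bf_p;

/* Returns nonzero if P[1] is 8-byte aligned and P[0] is 4 bytes before it. */
static int p1_is_pair_aligned(const struct bf_p_block *b)
{
    return ((uintptr_t)&b->P[1] % 8 == 0) &&
           ((uintptr_t)&b->P[0] % 8 == 4);
}
```

With a layout like this, the pairs (P[1],P[2]), (P[3],P[4]), and so on each start on an 8-byte boundary, at the cost of one wasted word.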

Also, you may use STRD and likely its form with post-increment in these
pieces of code:

                str r22, [r2]
                str r48, [r2, +0x1]

                str r23, [r2, +r52]
                str r12, [r2, +r53]
                iadd r2, r2, r59

                str r22, [r2]
                str r48, [r2, +0x1]

                str r23, [r52]
                str r12, [r52, +0x1]
                add r2, r2, 8
                add r52, r52, 8

You just need to allocate adjacent registers (with the first of them
being even-numbered).  Instead of your current use of r52 and r53,
you'll need just one of them (the instruction will increment it).
The IADD and the ADDs will be gone.

Or am I missing something?

To make maintenance of this code easier, I suggest that you move to .S
and cpp macros, and add plenty of #define's with descriptive names for
the registers.

Alexander
