Date: Thu, 27 Jun 2013 11:07:34 -0400
From: Yaniv Sapir <yaniv@...pteva.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja,

Can you please post the following:

1. C source used to generate this assembly,
2. The compilation command,
3. The caller - what parameters you used in the function call.

The code itself looks OK on the surface: I can't identify any immediate
problems at a glance, but I need to know how it was generated. Using IADD
instead of ADD should not, by itself, make a huge difference, but IADD may
leave room for some optimization.
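
For reference until the real source is posted, the quoted listing below
appears to follow the standard Blowfish round structure, processing two
rounds per loop iteration. The following is only a minimal C sketch of
that reading, not the code that was actually compiled. The names
bf_ctx_sketch, bf_f and bf_encrypt_block_sketch are hypothetical, and the
context layout (P-array followed by four S-boxes) is an assumption inferred
from the word offsets 18, 274, 530 and 786 in the listing.

```c
#include <stdint.h>

/* Assumed layout, inferred from the listing: P at word offsets 0..17,
 * S-boxes starting at word offsets 18, 274, 530 and 786. */
typedef struct {
    uint32_t P[18];
    uint32_t S[4][256];
} bf_ctx_sketch;

/* Blowfish F function: four S-box loads combined with two 32-bit
 * additions and one XOR. */
static uint32_t bf_f(const bf_ctx_sketch *ctx, uint32_t x)
{
    uint32_t t = ctx->S[0][x >> 24] + ctx->S[1][(x >> 16) & 0xff];
    t ^= ctx->S[2][(x >> 8) & 0xff];
    t += ctx->S[3][x & 0xff];
    return t;
}

/* Encrypt one 64-bit block in place: 16 rounds, two per iteration. */
static void bf_encrypt_block_sketch(const bf_ctx_sketch *ctx,
                                    uint32_t *L, uint32_t *R)
{
    uint32_t l = *L ^ ctx->P[0];
    uint32_t r = *R;

    for (int i = 0; i < 16; i += 2) {
        r ^= bf_f(ctx, l) ^ ctx->P[i + 1];
        l ^= bf_f(ctx, r) ^ ctx->P[i + 2];
    }

    /* Final swap and last subkey, matching the stores after the loop. */
    *L = r ^ ctx->P[17];
    *R = l;
}
```

Under this reading, only the two additions in bf_f belong to Blowfish
itself. The remaining add instructions in the listing (the ones with the
constants 18, 274, 530 and 786) are S-box index arithmetic.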

Thanks,
Yaniv.


On Thu, Jun 27, 2013 at 10:54 AM, Katja Malvoni <kmalvoni@...il.com> wrote:

> Hi Alexander,
>
> On Thu, Jun 27, 2013 at 4:40 PM, Solar Designer <solar@...nwall.com> wrote:
>
>> Katja,
>>
>> On Mon, Jun 24, 2013 at 04:54:45PM +0200, Katja Malvoni wrote:
>> > On Tue, May 28, 2013 at 1:58 AM, Solar Designer <solar@...nwall.com> wrote:
>> > > On Sun, May 26, 2013 at 07:37:55PM -0400, Yaniv Sapir wrote:
>> > > > -mfp-mode=int        # this sets the FPU mode to integer. However,
>> > > > please make sure that the generated code does not re-program the
>> > > > CONFIG register before every integer operation
>> > >
>> > > Let's definitely try this.  I was afraid we'd have to resort to
>> > > assembly code to use the FPU in integer mode - it's great news to me
>> > > that we seem not to have to.
>> >
>> > Unfortunately, this doesn't help a lot... Execution time with -O2 is
>> > 45.969000 ms and with -mfp-mode=int is 45.951000 ms. I checked the
>> > generated assembly code, and it seems that the CONFIG register isn't
>> > re-programmed before every integer operation.
>>
>> ... but are there uses of the IADD instruction (the one implemented on
>> the FPU) at all, or only plain ADD (the one implemented on IALU)?
>>
>
> In the whole disassembly, only ADD is used.
>
>
>>
>> Can you show us a piece of disassembly - e.g., for one Blowfish round?
>>
>>
> Here it is:
> 00000234 <_BF_encrypt>:
>  234:    d54c 4400     ldr r22,[sp,+0x2]
>  238:    a01b 4009     add r21,r0,72
>  23c:    1feb 4002     mov r16,0xff
>  240:    20ef 4002     mov r17,r0
>  244:    854c 2a00     ldr r12,[r17],+0x2
>  248:    860f 208a     eor r12,r1,r12
>  24c:    920f 4406     lsr r20,r12,0x10
>  250:    510f 4406     lsr r18,r12,0x8
>  254:    330f 0406     lsr r1,r12,0x18
>  258:    905f 490a     and r20,r20,r16
>  25c:    485f 490a     and r18,r18,r16
>  260:    911b 4822     add r20,r20,274
>  264:    251b 0002     add r1,r1,18
>  268:    705f 450a     and r19,r12,r16
>  26c:    905f 4806     lsl r20,r20,0x2
>  270:    2456          lsl r1,r1,0x2
>  272:    491b 4842     add r18,r18,530
>  276:    485f 4806     lsl r18,r18,0x2
>  27a:    6d1b 4862     add r19,r19,786
>  27e:    8249 4100     ldr r20,[r0,+r20]
>  282:    20c1          ldr r1,[r0,r1]
>  284:    6c5f 4806     lsl r19,r19,0x2
>  288:    4149 4100     ldr r18,[r0,+r18]
>  28c:    309f 080a     add r1,r20,r1
>  290:    61c9 4100     ldr r19,[r0,+r19]
>  294:    250f 010a     eor r1,r1,r18
>  298:    44cc 4900     ldr r18,[r17,-0x1]
>  29c:    259f 010a     add r1,r1,r19
>  2a0:    250f 010a     eor r1,r1,r18
>  2a4:    488a          eor r2,r2,r1
>  2a6:    6a0f 4006     lsr r19,r2,0x10
>  2aa:    490f 4006     lsr r18,r2,0x8
>  2ae:    6c5f 490a     and r19,r19,r16
>  2b2:    2b06          lsr r1,r2,0x18
>  2b4:    485f 490a     and r18,r18,r16
>  2b8:    6d1b 4822     add r19,r19,274
>  2bc:    251b 0002     add r1,r1,18
>  2c0:    6c5f 4806     lsl r19,r19,0x2
>  2c4:    2456          lsl r1,r1,0x2
>  2c6:    491b 4842     add r18,r18,530
>  2ca:    485f 4806     lsl r18,r18,0x2
>  2ce:    61c9 4100     ldr r19,[r0,+r19]
>  2d2:    20c1          ldr r1,[r0,r1]
>  2d4:    4149 4100     ldr r18,[r0,+r18]
>  2d8:    2c9f 080a     add r1,r19,r1
>  2dc:    250f 010a     eor r1,r1,r18
>  2e0:    485f 410a     and r18,r2,r16
>  2e4:    491b 4862     add r18,r18,786
>  2e8:    485f 4806     lsl r18,r18,0x2
>  2ec:    6149 4100     ldr r19,[r0,+r18]
>  2f0:    454c 4a00     ldr r18,[r17],+0x2
>  2f4:    259f 010a     add r1,r1,r19
>  2f8:    250f 010a     eor r1,r1,r18
>  2fc:    908f 240a     eor r12,r12,r1
>  300:    26bf 090a     sub r1,r17,r21
>  304:    a410          bne 24c <_BF_encrypt+0x18>
>  306:    20cc 0002     ldr r1,[r0,+0x11]
>  30a:    8cdc 2000     str r12,[r3,+0x1]
>  30e:    288a          eor r1,r2,r1
>  310:    2c54          str r1,[r3]
>  312:    6c1b 0001     add r3,r3,8
>  316:    59bf 080a     sub r2,r22,r3
>  31a:    50ef 0402     mov r2,r12
>  31e:    9120          bgtu 240 <_BF_encrypt+0xc>
>  320:    04e2          mov r0,r1
>  322:    194f 0402     rts
>  326:    01a2          nop
>
>
> Execution time when using all 16 cores is 294.676000 ms
>
> Katja
>



-- 
===========================================================
Yaniv Sapir
Adapteva Inc.
1666 Massachusetts Ave, Suite 14
Lexington, MA 02420
Phone: (781)-328-0513 (x104)
Email: yaniv@...pteva.com
Web: www.adapteva.com
============================================================
