Date: Tue, 3 Sep 2013 00:57:56 +0100
From: Rafael Waldo Delgado Doblas <lord.rafa@...il.com>
To: john-dev@...ts.openwall.com
Subject: Rafael's weekly report #12

Hello,

Accomplishments:
1. Debugged epiphany-scrypt and driver-epiphany.
2. Replaced memcpy, which improved performance slightly.
3. Started writing inline asm.

Priorities:
1. Continue with the asm implementation.

I started with inline asm, but the results weren’t better than C. This is because the C compiler optimizes the Bout vector accesses by keeping the values in general registers, and then applies the R macro to them:

     7b0:   879f e10a   add r60,r1,r23
     7b4:   b32f fc06   lsr r61,r60,0x19
     7b8:   90ff fc06   lsl r60,r60,0x7
     7bc:   967f ff8a   orr r60,r61,r60
     7c0:   4a0f 6f8a   eor r26,r26,r60

However, when I define the R function with inline asm:

    __asm__("LSR r60, %1, %3\n\tIMADD r60, %1, %4\n\t EOR %0, %1, r60" \
        : "=r"(b)\
        : "r"(a), "r"(b), "n"(c), "r"(d) \
        : "memory", "%r60" \
    );

this optimization is not performed, which adds 2 more instructions:

     788:   4a4c c001   ldr r50,[r2,+0xc]
     78c:   284c c000   ldr r49,[r2,+0x0]
     790:   289f db0a   add r49,r50,r49
     794:   84ef f806   lsr r60,r49,0x7
     798:   863f f887   fmadd r60,r49,r12
     79c:   260f db8a   eor r49,r49,r60
     7a0:   2a5c c000   str r49,[r2,+0x4]

BTW, fmadd/imadd cannot take immediate values, so we will need 4 extra movs to load the multiplication constants into registers.

I will write a new asm file with the salsa20_8 implementation, using the optimized C version as a base.

Regards,
Rafael.
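
For reference, the two ways of forming the rotate discussed above can be sketched in portable C. This is only an illustration under assumptions, not the code from epiphany-scrypt: the names rotl_shift, rotl_madd, r_step and self_check are hypothetical, and r_step assumes that R is the usual Salsa20 step (add, rotate left, xor), as the first disassembly listing suggests. The multiply-add form relies on (x >> (32 - c)) and x * 2^c occupying disjoint bits, so adding them gives the same result as OR-ing them.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left with two shifts and an OR (the lsr/lsl/orr sequence).
 * c must be in 1..31 to avoid an undefined shift by 32. */
static uint32_t rotl_shift(uint32_t x, unsigned c)
{
    return (x << c) | (x >> (32 - c));
}

/* Same rotate with one shift and a multiply-add.
 * mul is assumed to be 1u << c and would have to live in a register,
 * since the multiply-add cannot take an immediate. */
static uint32_t rotl_madd(uint32_t x, unsigned c, uint32_t mul)
{
    return (x >> (32 - c)) + x * mul;
}

/* Hypothetical R step: dst ^= rotl(a + b, c). */
static uint32_t r_step(uint32_t dst, uint32_t a, uint32_t b, unsigned c)
{
    return dst ^ rotl_shift(a + b, c);
}

/* Check that both rotate forms agree for a few inputs and all valid c. */
static void self_check(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x80000001u, 0xdeadbeefu, 0xffffffffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned c = 1; c < 32; c++)
            assert(rotl_shift(samples[i], c) ==
                   rotl_madd(samples[i], c, (uint32_t)1 << c));
}
```

Salsa20 uses only four rotation counts (7, 9, 13 and 18), which is where the four constants 1 << 7, 1 << 9, 1 << 13 and 1 << 18 for the multiply-add form come from.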