Date: Thu, 25 Jul 2013 13:26:19 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hi Alexander,

On Thu, Jul 25, 2013 at 3:36 AM, Solar Designer <solar@...nwall.com> wrote:

> Hi Katja,
>
> On Wed, Jul 24, 2013 at 11:42:53PM +0200, Katja Malvoni wrote:
> > I made use of dual-issue, the speed I'm getting is 976 c/s when compiling
> > Epiphany code with -O2. If I compile with -O3 I get 979 c/s.
>
> This is nice. This is for 1 instance of bcrypt per core per invocation,
> right? I mean that there's no interleaving yet.

That's right, only one instance.

> Can you try interleaving two instances, perhaps with C code initially?

Ok, I will.

> > Code is in https://github.com/kmalvoni/JohnTheRipper/tree/master
>
> I took a look, and surprisingly (besides the pieces of inline asm) I
> noticed something unrelated: you seem to have inconsistent BF_binary
> sizes between Epiphany and host sides. I thought you had addressed that
> already? Maybe you forgot to commit? Also, your host side code only
> checks 32 bits of the computed hash value, whereas you could check 64
> bits just as easily (so you should).

I had problems with my local GitHub repo and wasn't able to commit, so I
edited the files on GitHub online. That was a very bad idea... I forgot to
update the host code and the Makefile. I won't do this again, and I
apologize for the inconvenience.

On Thu, Jul 25, 2013 at 4:28 AM, Solar Designer <solar@...nwall.com> wrote:

> I checked out, built, and tried to test this version of code. The first
> hurdle was the 2 vs. 6 size BF_binary discrepancy. Because of it, the
> program would just get stuck all the time. Once I fixed it in my copy
> of parallella_bf_fmt.c, I am getting:
>
> solar@...aro-ubuntu-desktop:~/
> 2/JohnTheRipper/run$ ./parallella_john.sh -te -form=bcrypt-parallella
> Benchmarking: bcrypt-parallella, OpenBSD Blowfish ("$2a$05", 32
> iterations) [Parallella]... DONE
> Raw: 865 c/s real, 865 c/s virtual
>
> ... which is much less than what you said it would be.
>
> So perhaps you forgot to commit multiple changes?

This is because the Makefile used fast.ldf instead of internal.ldf. Now
everything should work.

On Thu, Jul 25, 2013 at 4:02 AM, Solar Designer <solar@...nwall.com> wrote:

> The code itself mostly looks good to me (including your delayed use of
> results from IMADD and IADD). Shouldn't you re-order these two, though? -
>
> | "eor %0, %0, r27\n" \
> | "eor r23, r22, r23\n" \
>
> because r22 is loaded sooner than r27? Well, maybe this makes no
> difference on the current chip, but it might if load's latency is
> increased in a future revision of Epiphany.

If I reorder them, then there is no 4-cycle separation between
iadd r23, r24, r23 and eor r23, r22, r23, which is required for dual-issue.
In that case, the speed is 924 c/s.

> Now, here's an issue/bug in the above: you rely on registers being
> preserved across multiple pieces of inline asm, but gcc does not
> guarantee you that. Also, you don't declare which registers you
> clobber. To fix this, your BF_ROUND should not be the entire __asm__
> block, but rather just a portion of the string you put inside such
> block. The asm block itself, with proper confession on what registers
> you clobber, should be in the BF_encrypt function.

When I did that, e-gcc unnecessarily used one more register to store L, and
the register being used changed for every BF_ROUND. As a result, there were
16 unnecessary mov instructions, so I removed the clobbered registers list.
I have added it back now; the speed drops from 976 c/s to 970 c/s.

On Thu, Jul 25, 2013 at 7:18 AM, Solar Designer <solar@...nwall.com> wrote:

> On Thu, Jul 25, 2013 at 06:02:52AM +0400, Solar Designer wrote:
> > | "ldr r27, [r45], 0x1\n" \
>
> I guess this is read from the P-box. You should be able to use ldrd
> here, and thus only have this instruction in every other round (a total
> of 9 instructions to read the 18 elements). Don't forget that ldrd
> needs an even-numbered first register.

This instruction ensures the 4-cycle separation between IADD r23, r24, r23
and EOR r23, r22, r23. If I remove it, I'll lose dual-issue in one round.
But I'll try to reorder the instructions so that dual-issue stays.

Katja
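
The idea of interleaving two bcrypt instances in C, raised earlier in the thread, could be prototyped roughly as follows. This is only a sketch under stated assumptions, not the code in the repository: the names bf_ctx, bf_encrypt_1x, bf_encrypt_2x, bf_fill_placeholder and bf_interleave_selfcheck are hypothetical, and the P-box and S-boxes are filled with placeholder values from a simple LCG rather than the real Blowfish constants or bcrypt key setup. It shows only the Blowfish encryption loop, with the operations of two independent instances alternated so that one instance's S-box loads can overlap with the other's arithmetic.

```c
#include <stdint.h>

/* Per-instance Blowfish state. In bcrypt the P-box and S-boxes are
 * key-dependent, so each interleaved instance needs its own copy. */
typedef struct {
    uint32_t P[18];
    uint32_t S[4][256];
} bf_ctx;

/* Standard Blowfish F function. */
static uint32_t bf_f(const bf_ctx *c, uint32_t x)
{
    return ((c->S[0][x >> 24] + c->S[1][(x >> 16) & 0xff])
            ^ c->S[2][(x >> 8) & 0xff]) + c->S[3][x & 0xff];
}

/* Reference: encrypt one 64-bit block (L, R) for a single instance. */
static void bf_encrypt_1x(const bf_ctx *c, uint32_t *L, uint32_t *R)
{
    uint32_t l = *L, r = *R, t;
    for (int i = 0; i < 16; i++) {
        l ^= c->P[i];
        r ^= bf_f(c, l);
        t = l; l = r; r = t;
    }
    t = l; l = r; r = t;            /* undo the final swap */
    r ^= c->P[16];
    l ^= c->P[17];
    *L = l; *R = r;
}

/* Interleaved: the same 16 rounds for two independent instances, with
 * their operations alternated. blk_x[0] is L and blk_x[1] is R. */
static void bf_encrypt_2x(const bf_ctx *a, const bf_ctx *b,
                          uint32_t blk_a[2], uint32_t blk_b[2])
{
    uint32_t la = blk_a[0], ra = blk_a[1];
    uint32_t lb = blk_b[0], rb = blk_b[1], t;
    for (int i = 0; i < 16; i++) {
        la ^= a->P[i];
        lb ^= b->P[i];
        ra ^= bf_f(a, la);
        rb ^= bf_f(b, lb);
        t = la; la = ra; ra = t;
        t = lb; lb = rb; rb = t;
    }
    t = la; la = ra; ra = t;
    t = lb; lb = rb; rb = t;
    ra ^= a->P[16]; rb ^= b->P[16];
    la ^= a->P[17]; lb ^= b->P[17];
    blk_a[0] = la; blk_a[1] = ra;
    blk_b[0] = lb; blk_b[1] = rb;
}

/* ASSUMPTION: placeholder tables from an LCG, NOT real Blowfish init data. */
static void bf_fill_placeholder(bf_ctx *c, uint32_t seed)
{
    uint32_t x = seed;
    for (int i = 0; i < 18; i++) {
        x = x * 1664525u + 1013904223u;
        c->P[i] = x;
    }
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 256; j++) {
            x = x * 1664525u + 1013904223u;
            c->S[i][j] = x;
        }
}

/* Returns 1 if the interleaved version matches two sequential runs. */
static int bf_interleave_selfcheck(void)
{
    static bf_ctx a, b;
    bf_fill_placeholder(&a, 1);
    bf_fill_placeholder(&b, 2);
    uint32_t sa[2] = { 0x01234567u, 0x89abcdefu };
    uint32_t sb[2] = { 0xdeadbeefu, 0x0badf00du };
    uint32_t ia[2] = { sa[0], sa[1] }, ib[2] = { sb[0], sb[1] };
    bf_encrypt_1x(&a, &sa[0], &sa[1]);
    bf_encrypt_1x(&b, &sb[0], &sb[1]);
    bf_encrypt_2x(&a, &b, ia, ib);
    return sa[0] == ia[0] && sa[1] == ia[1] &&
           sb[0] == ib[0] && sb[1] == ib[1];
}
```

Calling bf_interleave_selfcheck() from a small main() checks that the interleaved loop produces the same blocks as two sequential runs. Whether interleaving actually helps on Epiphany depends on register pressure and on how the compiler schedules the two instruction streams, so it would need to be measured.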