Date: Wed, 4 Sep 2013 04:54:55 +0100
From: Rafael Waldo Delgado Doblas <lord.rafa@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: Litecoin mining

Hello Alexander,

2013/9/4 Rafael Waldo Delgado Doblas <lord.rafa@...il.com>
> In addition, as you asked, this is the work that I did today:
>
> I implemented a couple of salsa20_8 asm versions:
> The first one has the loops "Bor[i] = Bout[i] = (B[i] ^ Bx[i]);" and
> "Bout[i] += Bor[i];" rolled and uses the imadd instruction. It saves about
> 250B, but the performance drops by almost 0.5 khash/s.
> The second keeps the loops unrolled and uses the imadd instruction. It
> saves only 50B, but the performance also drops by almost 0.5 khash/s.
>
> At this point it looks like the imadd instruction is too slow to be used,
> but rolling the loops could be nice.
>

BTW, I was debugging with e-gdb and I found that even if I use the imadd
instruction, the binary image uses fmadd. This is not nice at all, because
it uses floating-point math. In this real scenario:

r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0x0 0 0

fmadd r60,r61,r44

gives the following erroneous result:

r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0x0 0 0

instead of the correct one:

r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0xF00 3840 ???????

Is there any way to use integer math? If not, there is no way to get an
improvement with this instruction.

Regards,
Rafael.
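The register dump above can be reproduced on a host machine. What follows is only a rough C sketch, not Epiphany code: the function names are made up for illustration, and it assumes the host uses IEEE 754 single precision for float. It shows why a floating-point multiply-add on the raw bit patterns 0x80 and 0x1e leaves 0 in the destination, while an integer multiply-add gives 0xF00.

```c
#include <stdint.h>
#include <string.h>

/* Integer multiply-add, the result the post expects from imadd: d + n * m.
 * (Hypothetical helper name, not an Epiphany intrinsic.) */
static uint32_t integer_madd(uint32_t d, uint32_t n, uint32_t m)
{
    return d + n * m;
}

/* Reinterpret a 32-bit pattern as a float without any numeric conversion,
 * the way a floating-point unit sees an integer held in a register. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its raw 32-bit pattern. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Floating-point multiply-add applied to raw register bits, modelling what
 * fmadd does to integer operands. Small integers read as floats are
 * denormals (0x80 is about 1.79e-43, 0x1e about 4.2e-44), so their product
 * underflows to zero. (Hypothetical helper name.) */
static uint32_t float_madd_bits(uint32_t d, uint32_t n, uint32_t m)
{
    float r = bits_to_float(d) + bits_to_float(n) * bits_to_float(m);
    return float_to_bits(r);
}
```

With the values from the dump, integer_madd(0, 0x1e, 0x80) returns 0xF00 (3840), while float_madd_bits(0, 0x1e, 0x80) returns 0, which matches what e-gdb shows in r60 after the fmadd.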