Date: Tue, 30 Jul 2013 22:57:09 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hello,

I implemented preloading of the second P array; the code is committed. I freed one register by reusing one of the temporaries, and another by replacing two offsets with one pointer. I'm getting 1192 c/s.

I expected a higher speed, and I think the shortfall is because something is not dual-issued for the second instance's second BF_ROUND in the macro. The load from the P array at the end of the macro was what ensured a 4-cycle separation between the iadd and the corresponding eor. I still haven't figured out what is not dual-issued and why.

Katja
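
For readers following the iadd/eor dependency mentioned above, here is a rough C sketch of a single Blowfish round in its standard textbook form. It is not the committed Epiphany assembly: the names `bf_ctx`, `bf_round` and `bf_f` and the struct layout are hypothetical, chosen only for illustration. It shows the final add feeding the xor into R, with the P-array load placed between other work as an independent operation.

```c
#include <stdint.h>

/* Hypothetical context layout, for illustration only. */
typedef struct {
    uint32_t S[4][256];  /* four S-boxes */
    uint32_t P[18];      /* P array (subkeys) */
} bf_ctx;

/* One Blowfish round, split into steps: R ^= P[n + 1] ^ F(L). */
static inline void bf_round(const bf_ctx *c, uint32_t L, uint32_t *R, int n)
{
    uint32_t t1 = c->S[3][L & 0xff];
    uint32_t t2 = c->S[2][(L >> 8) & 0xff];
    uint32_t t3 = c->S[1][(L >> 16) & 0xff];

    t3 += c->S[0][L >> 24];  /* first add */
    t3 ^= t2;
    *R ^= c->P[n + 1];       /* P load and xor: does not depend on t3 */
    t3 += t1;                /* final add ... */
    *R ^= t3;                /* ... whose result this xor consumes */
}

/* The same F function written as one expression, for comparison. */
static inline uint32_t bf_f(const bf_ctx *c, uint32_t x)
{
    return ((c->S[0][x >> 24] + c->S[1][(x >> 16) & 0xff])
            ^ c->S[2][(x >> 8) & 0xff]) + c->S[3][x & 0xff];
}
```

The last two statements of `bf_round` form the add-then-xor pair. In a hand-scheduled version, independent instructions such as a P-array load would be placed between them to cover the latency.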