Date: Thu, 1 Aug 2013 17:15:12 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hi Alexander,

On Wed, Jul 31, 2013 at 4:06 AM, Solar Designer <solar@...nwall.com> wrote:

> > On Tue, Jul 30, 2013 at 4:19 PM, Solar Designer <solar@...nwall.com> wrote:
> > > Here's another idea: replace the AND, not the right shift.  You can
> > > replace one AND with two IMULs - e.g., to extract the byte at bit offset
> > > 16, you can IMUL by 0x100, then right shift by 24, then IMUL by 4 (to
> > > get the 8 data bits into bit offsets 2 to 9 as we need for a load).  Can
> > > you have both IMULs for free with 2x interleave, or would you have to go
> > > for 3x?  In the latter case, you wouldn't be able to preload one of
> > > three P arrays, which would defeat the purpose of this new trick for one
> > > of two byte extracts - but we'd nevertheless potentially save a cycle on
> > > the other byte extract.
> [...]
> Perhaps you can try implementing the trick for the byte at bit offsets
> 16 to 23, where you can reuse the existing 0xff constant with IMADD?

Unfortunately, the problem with IMADD appears again. The register used as
the result is also added to the product, so I need to move L to tmp first
and then use the trick. Alternatively, I could use another register for the
constant, but in that case I can't fully preload both P arrays.

Katja
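
For reference, the byte extract discussed in the quoted text can be sketched in portable C. This is only an illustration of the arithmetic, not the Epiphany assembly from the actual patch. The function names and the imadd() helper are hypothetical; imadd() models a multiply-add whose destination register is also the addend, which is an assumption based on the description in the reply above. The IMADD variant relies on the identity L + L * 0xff == L * 0x100 (mod 2^32), and it shows why L has to be copied to tmp first: the destination is overwritten.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional extract of the byte at bit offset 16: shift, AND, then
 * scale by 4 so the 8 data bits land at bit offsets 2 to 9. */
static uint32_t extract_and(uint32_t L)
{
    return ((L >> 16) & 0xffu) << 2;
}

/* The trick from the quoted text: multiply by 0x100 (pushes bits 24..31
 * out of the 32-bit word), shift right by 24, multiply by 4. No AND. */
static uint32_t extract_mul(uint32_t L)
{
    return ((L * 0x100u) >> 24) * 4u;
}

/* Hypothetical model of a multiply-add where the destination register
 * is also the addend: rd = rd + rn * rm. */
static uint32_t imadd(uint32_t rd, uint32_t rn, uint32_t rm)
{
    return rd + rn * rm;
}

/* Reusing the 0xff constant: tmp = L + L * 0xff = L * 0x100. L is
 * copied to tmp first because the destination gets overwritten. */
static uint32_t extract_imadd(uint32_t L)
{
    uint32_t tmp = L;
    tmp = imadd(tmp, L, 0xffu);
    return (tmp >> 24) * 4u;
}

/* Sanity check that all three variants agree for a given input. */
static void check_extracts(uint32_t L)
{
    assert(extract_mul(L) == extract_and(L));
    assert(extract_imadd(L) == extract_and(L));
}
```

Calling check_extracts() on a few values (for example 0x12345678, where the byte at bit offset 16 is 0x34) confirms that the three forms produce the same load offset. Whether the multiplies come for free depends on the interleave and register pressure questions raised in the thread, which this sketch does not model.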