john-dev - Re: Parallella: bcrypt


Message-ID: <20130731222244.GA31179@openwall.com>
Date: Thu, 1 Aug 2013 02:22:44 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hi Katja,

On Wed, Jul 31, 2013 at 11:25:21PM +0200, Katja Malvoni wrote:
> Alexander, I'm currently working on everything you mentioned so far. I'm
> using cpp macros, P arrays are preloaded using ldrd, str instructions
> replaced by strd with post increment and rts is added.

Cool.

> I'm using two
> structs but both are still in the same struct called shared_buffer. And I
> have an interesting situation - I have code which isn't reliable
> (sometimes fails self test), but when it works I get a very weird speed:
> 497619 c/s (it's not constant but it's 49xxxx, both real and virtual). I am
> testing bcrypt-parallella format, I only changed how data is transferred
> and how result is read (separated structs for input and output, I still
> haven't implemented savings when salt or keys aren't changed). I don't
> understand this speed. If I measure time with transfers it's around 0.05
> ms. But with unoptimized bcrypt, the time to compute the hash without
> transfers was around 16.5 ms. If I read the whole outputs struct and then use
> memcpy to have the result in parallella_BF_out, the speed is 1204 c/s. Code which
> gives this very high speed is committed.

I guess this line:

		buff.out.core_done[corenum] = 0;

is not executing or does not take effect (as far as the host is aware)
soon enough.  You appear to have a race condition here, and it appears
to be triggered 100% of the time now.  I guess you need to be resetting
core_done[corenum] to 0 from the host, not from Epiphany.
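
To illustrate the suspected race, here is a minimal single-threaded model,
not the committed code.  Everything in it is an assumption made for
illustration: the struct layout, the start flag, the names device_run(),
host_compute() and count_stale(), and the arithmetic standing in for
bcrypt.  The device only "runs" when the host's polling loop calls it, so
a flag left at 1 from the previous round lets the host return without
waiting at all.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared buffer; the real layout differs. */
typedef struct {
    int core_done;      /* set to 1 by the device when result is valid */
    unsigned result;    /* placeholder for the computed hash */
} outputs_t;

typedef struct {
    outputs_t out;
    unsigned input;     /* placeholder for key/salt */
    int start;          /* hypothetical "go" flag written by the host */
} shared_buffer_t;

/* Placeholder for the real computation (NOT bcrypt). */
unsigned fake_hash(unsigned input)
{
    return input * 2u + 1u;
}

/* Models the device getting a chance to run once. */
void device_run(shared_buffer_t *b, int device_resets_flag)
{
    if (!b->start)
        return;
    if (device_resets_flag)
        b->out.core_done = 0;   /* too late: host may already have polled */
    b->out.result = fake_hash(b->input);
    b->out.core_done = 1;
    b->start = 0;
}

/* Models one host-side computation: write input, start, poll, read. */
unsigned host_compute(shared_buffer_t *b, unsigned input, int host_resets_flag)
{
    b->input = input;
    if (host_resets_flag)
        b->out.core_done = 0;   /* clear BEFORE starting the device */
    b->start = 1;
    while (!b->out.core_done)   /* the device only runs while the host waits */
        device_run(b, !host_resets_flag);
    return b->out.result;
}

/* Counts rounds in which the host read a result for the wrong input. */
int count_stale(int host_resets_flag, int rounds)
{
    shared_buffer_t b = { { 0, 0 }, 0, 0 };
    int stale = 0;

    assert(rounds >= 0);
    for (int r = 1; r <= rounds; r++) {
        unsigned got = host_compute(&b, (unsigned)r, host_resets_flag);
        if (got != fake_hash((unsigned)r))
            stale++;
    }
    return stale;
}
```

In this model count_stale(0, 10) reports 9 stale results: only the first
round waits for the device, and every later round sees the old
core_done == 1 and returns immediately, which would look like a very high
c/s rate with wrong hashes.  count_stale(1, 10) reports 0.  On real
hardware the two sides run concurrently, so the host-side clear would also
have to reach the shared memory before the core is started.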

Alexander