Date: Thu, 1 Aug 2013 02:22:44 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hi Katja,

On Wed, Jul 31, 2013 at 11:25:21PM +0200, Katja Malvoni wrote:
> Alexander, I'm currently working on everything you mentioned so far. I'm
> using cpp macros, P arrays are preloaded using ldrd, str instructions
> replaced by strd with post increment and rts is added.

Cool.

> I'm using two
> structs but both are still in the same struct called shared_buffer. And I
> have an interesting situation - I have a code which isn't reliable
> (sometimes fails self test), but when it works I get very weird speed:
> 497619 c/s (it's not constant but it's 49xxxx, both real and virtual). I am
> testing bcrypt-parallella format, I only changed how data is transferred
> and how result is read (separated structs for input and output, I still
> haven't implemented savings when salt or keys aren't changed). I don't
> understand this speed. If I measure time with transfers it's around 0.05
> ms. But with unoptimized bcrypt, speed of computing the hash without
> transfers was around 16.5 ms. If I read whole outputs struct and then use
> memcpy to have result in parallella_BF_out speed is 1204 c/s. Code which
> gives this very high speed is committed.

I guess this line:

		buff.out.core_done[corenum] = 0;

is not executing or does not take effect (as far as the host is aware)
soon enough.  You appear to have a race condition here, and it appears
to be triggered 100% of the time now.  I guess you need to reset
core_done[corenum] to 0 from the host, not from Epiphany.
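
A minimal sketch of what resetting the flag from the host could look like.
It is not the actual bcrypt-parallella code: only core_done[] is a name from
this thread, while struct handshake, start[], result[], NCORES, and the three
functions are hypothetical, and the Epiphany core is replaced by an ordinary
function so the logic can be followed on one machine.  The point is that the
host clears core_done itself before starting a core, so a stale "done" left
over from the previous run cannot be mistaken for a fresh result.

```c
#include <assert.h>
#include <stdint.h>

#define NCORES 16 /* assumption: 16-core Epiphany chip */

/* Hypothetical layout; only core_done[] is a name taken from the thread. */
struct handshake {
	volatile uint32_t start[NCORES];     /* host -> core: begin computing */
	volatile uint32_t core_done[NCORES]; /* core -> host: result is valid */
	uint32_t result[NCORES];             /* stand-in for the hash output */
};

/* Host: clear the done flag itself, and only then signal the core. */
static void host_start(struct handshake *hs, int corenum)
{
	assert(corenum >= 0 && corenum < NCORES);
	hs->core_done[corenum] = 0;
	hs->start[corenum] = 1;
}

/* Host: poll for completion of the run started by host_start(). */
static int host_result_ready(const struct handshake *hs, int corenum)
{
	assert(corenum >= 0 && corenum < NCORES);
	return hs->core_done[corenum] != 0;
}

/* Stand-in for the core: it only ever sets core_done, never clears it. */
static void core_step(struct handshake *hs, int corenum, uint32_t value)
{
	assert(corenum >= 0 && corenum < NCORES);
	if (!hs->start[corenum])
		return; /* nothing requested yet */
	hs->start[corenum] = 0;
	hs->result[corenum] = value;
	hs->core_done[corenum] = 1; /* written last, after the result */
}
```

With this ordering the host would call host_start(), then wait until
host_result_ready() returns non-zero before reading result[].  On the real
hardware the host's writes go through shared memory, so their ordering and
visibility to the core would still need checking; the sketch ignores that.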

Alexander
