john-dev - Re: Parallella: bcrypt

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130731030733.GA23266@openwall.com>
Date: Wed, 31 Jul 2013 07:07:33 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja,

On Wed, Jul 31, 2013 at 06:11:51AM +0400, Solar Designer wrote:
> In fact, since at least one of "salt" or "keys" should change between
> crypt_all() calls almost all of the time, and since we prefer to do just
> one transfer, we will always be transferring the "salt".  Thus, the
> salt_changed flag's only use is to save on reading the salt from
> external memory on the Epiphany side if the salt has not changed.  Other
> than that, you can always transfer the salt from the host.

Note that with current JtR interfaces there's only one "salt" for all
cores.  Right now, your code unnecessarily uses per-core (and even
per-instance!) salts.  You can save on only transferring and loading one
salt from external memory (and having other cores read that one salt
from one core's local memory).

Also, set_key() in parallella_bf_fmt.c is simple enough that you can
have it on Epiphany, thereby reducing the data transfer size and having
the 32 instances of set_key() execute in parallel.  (On host, you will
then use simpler set_key() that will merely buffer the data, without
expansion to the 72-char strings).

You combining both inputs and outputs in one struct is weird/wrong.
I think you should have separate struct's for transfer to Epiphany (one
struct) and for transfer of the results back (another struct).  Right
now, you have:

typedef struct
{
        BF_binary result[MAX_KEYS_PER_CRYPT];
        int core_done[EPIPHANY_CORES];
        BF_key init_key[MAX_KEYS_PER_CRYPT];
        BF_key exp_key[MAX_KEYS_PER_CRYPT];
        BF_salt setting[EPIPHANY_CORES];
        int start[EPIPHANY_CORES];
}data;

Its size is something like:

8*32+4*16+72*2*32+(16+4+4)*16+4*16 = 5376 bytes

You can instead have:

typedef struct {
	BF_word salt[4];
	unsigned char rounds;
	unsigned char flags; /* bit 0 keys_changed, bit 1 salt_changed */
	unsigned char subtype; /* only if you move set_key() to Epiphany */
	int start1[EPIPHANY_CORES];
	BF_key init_key[MAX_KEYS_PER_CRYPT];
	int start2[EPIPHANY_CORES];
} inputs;

2452 bytes, and only 80 bytes are transferred most of the time (when
keys are not changed, and thus you can exclude init_key).  I added
start2[] so that you can have the start flags at the end of the struct
(confirming its full transfer) when you do transfer the keys.

typedef struct {
	BF_binary result[MAX_KEYS_PER_CRYPT];
	int core_done[EPIPHANY_CORES];
} outputs;

320 bytes.

BTW, in parallella_e_bcrypt.c you currently have the unused array
flags_by_subtype[] - just drop it.

Alexander

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.