Date: Tue, 30 Jul 2013 18:33:11 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja,

On Tue, Jul 30, 2013 at 02:18:04PM +0200, Katja Malvoni wrote:
> On Tue, Jul 30, 2013 at 2:47 AM, Solar Designer <solar@...nwall.com> wrote:
>
> > Perhaps you should change your code to transferring just one struct?
> > I wouldn't be surprised if this gives us a few c/s extra.
>
> Done.

Any change in c/s rate?

BTW, you can probably do something smarter: when you have hashes with
multiple salts loaded for cracking, usually the candidate passwords stay
the same across crypt_all() calls until they've been tested for all
salts (so the salt changes across those calls). You can optimize for
this special case, e.g. by maintaining a keys_changed variable in your
format (e.g., DES_bs* files use a variable like this) and only
transferring the candidate passwords to Epiphany when they have changed.

There's another special case: when cracking hashes with just one salt
(which often means that you have just one hash loaded, although this is
not necessarily so), the salt stays the same across crypt_all() calls
(only the candidate passwords change), so you can save time by not
transferring the unchanged salt.

So it may be better to have exactly two structs: one for the salt and
one for the candidate passwords - and transfer only those of them that
have changed since the previous transfer. You set the keys_changed flag
in set_key() and the salt_changed flag in set_salt(), and you reset both
in crypt_all().

I'm sorry it did not occur to me to suggest this to you before.

> When I do test with BF_tst.in speed is 727 c/s. It seems that interleaving
> is not used (but even without interleaving it shouldn't be this slow).

Your code can't magically turn into a non-interleaved version.
Rather, when there are too few inputs to fully use the 32 "slots", fewer
of the slots are put to effective use (and counted for c/s), so a speed
worse than you had without interleaving is to be expected for the last
set of candidate passwords to be tested (as long as the number of
candidate passwords is not a multiple of 32). This does mean that
increasing interleaving hurts performance for very short runs of the
program, while improving performance for long runs.

How many candidate passwords are you testing?

> Self test on same code gives speed of 1175 or 1177 c/s.
> MAX_KEYS_PER_CRYPT is defined as EPIPHANY_CORES*2 so every crypt_all()
> call should compute 32 hashes?

Yes.

BTW, when you #define something to an expression (rather than a literal
constant), it is considered good style to enclose the entire expression
in parentheses. That way, you will avoid potential bugs when you or
someone else later happens to use your #define'd non-literal constant in
an expression.

Alexander
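
The two suggestions in this message (separate structs with
keys_changed/salt_changed flags, and a parenthesized #define) could look
roughly like the sketch below. It is only an illustration under stated
assumptions: the function signatures are simplified rather than the real
John the Ripper format interface, PLAINTEXT_LENGTH and SALT_SIZE are
made-up values, and send_keys()/send_salt() are hypothetical stand-ins
that merely count calls where the actual host-to-Epiphany write would go.
EPIPHANY_CORES is taken to be 16, which follows from EPIPHANY_CORES*2
giving 32 hashes per crypt_all() call.

```c
#include <assert.h>
#include <string.h>

#define EPIPHANY_CORES 16
/* Parenthesized, so the macro is safe to use inside larger expressions. */
#define MAX_KEYS_PER_CRYPT (EPIPHANY_CORES * 2)
#define PLAINTEXT_LENGTH 72 /* assumed value, for illustration only */
#define SALT_SIZE 16        /* assumed value, for illustration only */

/* Exactly two host-side structs: one for keys, one for the salt. */
typedef struct { char key[MAX_KEYS_PER_CRYPT][PLAINTEXT_LENGTH + 1]; } keys_buf_t;
typedef struct { unsigned char bytes[SALT_SIZE]; } salt_buf_t;

keys_buf_t keys;
salt_buf_t cur_salt;
int keys_changed, salt_changed;

/* Hypothetical stand-ins for the real transfers; they only count calls. */
int key_transfers, salt_transfers;
void send_keys(void) { key_transfers++; }
void send_salt(void) { salt_transfers++; }

/* Store a candidate password and note that the keys struct is stale. */
void set_key(const char *key, int index)
{
    assert(index >= 0 && index < MAX_KEYS_PER_CRYPT);
    strncpy(keys.key[index], key, PLAINTEXT_LENGTH);
    keys.key[index][PLAINTEXT_LENGTH] = '\0';
    keys_changed = 1;
}

/* Store the salt and note that the salt struct is stale. */
void set_salt(const void *salt)
{
    memcpy(cur_salt.bytes, salt, SALT_SIZE);
    salt_changed = 1;
}

/* Transfer only what changed since the previous call, then reset flags. */
int crypt_all(int count)
{
    if (keys_changed) {
        send_keys();
        keys_changed = 0;
    }
    if (salt_changed) {
        send_salt();
        salt_changed = 0;
    }
    /* ... start the cores and read back the results here ... */
    return count;
}
```

With this layout, a run over many salts calls set_salt() between
crypt_all() calls without touching the keys, so only the small salt
struct is resent; a single-salt run resends only the keys. As for the
macro, without the parentheses an expression such as
64 / MAX_KEYS_PER_CRYPT would expand to 64 / 16 * 2, which evaluates to
8 rather than 2.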