john-users - Re: Splitting mask keyspace



Message-ID: <20210303114926.GA2157@openwall.com>
Date: Wed, 3 Mar 2021 12:49:26 +0100
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Splitting mask keyspace

On Tue, Mar 02, 2021 at 10:44:06PM +0100, Michał Majchrowicz wrote:
> john often needs even few hours to calculate ETA

That's weird.  You might want to open a GitHub issue for it, and include
reproduction instructions in there (preferably with a reduced testcase).

(I don't know if what you mean here is the same or a different issue
from the GitHub issue on disappearing ETA that you've already opened.)

> I thought this issue was closed however Solar Designer suggested that
> I might get much better results if I use --fork option rather than
> openmp. On one node that has 4 physical cores and 8 virtual ones
> change wasn't really significant. I got around 40MH/s instead of
> 36-37MH/s with openmp. Since --fork changes the way I would have to
> handle john output (parsing multiple lines instead of one simple
> status) it wasn't worth to be changed. BUT, there is always but ;) On
> a 32 core machine change was much more significant. From around
> 125MH/s it got up to around 316MH/s which surprised me a lot.

There are a few reasons why OpenMP could be this much slower than fork:

1. There might be other load on the system.  You mentioned elsewhere
that you'd rather not use all cores of the machine, and I am guessing
that other load might be why not.  OpenMP is extremely sensitive to
other load ("--fork" is not).  When you use OpenMP on a system with
other load, you need to set the OMP_NUM_THREADS environment variable to
a thread count low enough that the system isn't overbooked.

2. Your count of different salts might be low.  With "--fork", each
process generates its own candidate passwords stream.  With OpenMP, just
one thread generates candidate passwords for all threads to use, and
this is synchronous.  The generated candidate passwords are reused for
all salts, so the more salts you have the lower the candidate password
generation "overhead" is.  You might want to see what the speed ratio
would be with a higher count of different salts.  (Some numbers you
posted on GitHub suggest you were running against only 5 descrypt salts.
Try running against a few thousand, up to the maximum of 4096.)

3. I assume you're already using the latest bleeding-jumbo off GitHub
and not our 1.9.0-jumbo-1 release?  I made some enhancements to
descrypt since 1.9.0-jumbo-1, bringing the comparisons of computed vs.
loaded hashes from sequential into OpenMP parallel sections.  So with
bleeding-jumbo you should have higher descrypt OpenMP speeds than with
1.9.0-jumbo-1 (but not as high as what "--fork" can provide, indeed).
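
To illustrate the salt-count effect from point 2, here is a rough toy
model in Python.  It is only a sketch, not a description of how john is
actually implemented: the function name and the per-candidate costs
gen_cost and hash_cost are hypothetical, in arbitrary units, and the
model ignores other load, memory effects, and the hash comparison cost.
It assumes that with OpenMP one thread generates each candidate while
the others wait, and that with "--fork" every process generates its own
candidates in parallel.

```python
def fork_over_openmp_ratio(threads, salts, gen_cost, hash_cost):
    """Estimated "--fork" speed divided by OpenMP speed (toy model).

    gen_cost and hash_cost are hypothetical per-candidate costs, not
    measured values.
    """
    # OpenMP: generation is serial, hashing for all salts is shared
    # across the threads.
    openmp_time = gen_cost + salts * hash_cost / threads
    # fork: each process generates and hashes on its own, and
    # `threads` processes run side by side.
    fork_time = (gen_cost + salts * hash_cost) / threads
    return openmp_time / fork_time


if __name__ == "__main__":
    # Hypothetical costs; only the trend across salt counts matters.
    for salts in (5, 100, 4096):
        ratio = fork_over_openmp_ratio(32, salts, gen_cost=1.0, hash_cost=1.0)
        print(f"{salts:5d} salts: fork/OpenMP speed ratio ~ {ratio:.2f}")
```

Under these assumptions the ratio is well above 1 with only a few salts
and approaches 1 as the salt count grows towards 4096, which is the
trend described in point 2.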

Alexander
