john-users - Splitting mask keyspace

Message-ID: <CAFs9wnW772xxJfMm+7qBvnuvcBpEiFj-wJpVArCLqxw+iKy6Rw@mail.gmail.com>
Date: Tue, 2 Mar 2021 22:44:06 +0100
From: Michał Majchrowicz <sectroyer@...il.com>
To: john-users@...ts.openwall.com
Subject: Splitting mask keyspace

I have been working on distributing computations in my development
environment, which consists of a few heterogeneous nodes. I am
interested in optimising the distribution of tasks in such systems.
More info and the previous discussion can be found here:
https://github.com/openwall/john/issues/4576
I don't have any issues with dictionary-based attacks, as I have a
very crude but efficient approach: I simply split the dictionary by
number of lines in proportion to each node's computational power.
However, I didn't have such granularity when splitting a mask attack
(I was splitting by the first character of the keyspace). Solar
Designer suggested using the --node switch, which required some
changes in my system, but in the end it worked and all nodes finally
finished their computations with ETAs close to each other. Here is a
sample task split:

Total power percentage of john nodes: 41.11%
Total john keyspace: w-9

john1 power: 7.57% node: 1-757/4111
john2 power: 7.15% node: 758-1472/4111
john3 power: 26.39% node: 1473-4111/4111
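
For illustration, the node ranges above can be reproduced by treating
each 0.01% of power as one "virtual node" and handing every machine a
contiguous block of them. This is only a minimal sketch of that
arithmetic, not the actual code of my system; the function name
node_ranges and the dict layout are made up for the example, and it
assumes powers are given as percentages with two decimal places.

```python
from decimal import Decimal

def node_ranges(powers):
    """Map {name: percent string} to ({name: (lo, hi)}, total).

    Hypothetical helper: one virtual node per 0.01% of power, so a
    7.57% machine receives 757 consecutive node numbers.
    """
    # Decimal avoids float rounding (e.g. 7.57 * 100 != 757 in floats).
    units = {name: int(Decimal(p) * 100) for name, p in powers.items()}
    total = sum(units.values())
    ranges = {}
    start = 1
    for name, n in units.items():
        ranges[name] = (start, start + n - 1)
        start += n
    return ranges, total

powers = {"john1": "7.57", "john2": "7.15", "john3": "26.39"}
ranges, total = node_ranges(powers)
for name, (lo, hi) in ranges.items():
    # Same MIN-MAX/TOTAL form as in the sample split above.
    print(f"{name}: --node={lo}-{hi}/{total}")
```

With the three powers from the sample this gives the same 1-757,
758-1472 and 1473-4111 ranges out of 4111.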

I have run this configuration in multiple scenarios. As john often
needs even a few hours to calculate an ETA, I prefer to wait for a
task to finish and re-run it with different settings later. As a
result, even though I am mostly prototyping this on DES or MD5, it
takes a lot of time to test properly :)

I thought this issue was closed; however, Solar Designer suggested
that I might get much better results if I used the --fork option
rather than OpenMP. On one node that has 4 physical cores and 8
virtual ones the change wasn't really significant: I got around
40MH/s instead of 36-37MH/s with OpenMP. Since --fork changes the way
I would have to handle john output (parsing multiple lines instead of
one simple status), it wasn't worth changing. BUT, there is always a
but ;) On a 32-core machine the change was much more significant: from
around 125MH/s it went up to around 316MH/s, which surprised me a lot.
This is really worth investigating; however, --node is "connected" to
--fork, so my task splitting approach would have to change
significantly in order to cover the potential computational power of
the 32-core node: I would have to scale fork/node up to 72. I did a
quick run and got around 309MH/s, but longer tests are required to see
whether such a large number of forks has a negative impact on speed,
and the already hard-to-parse 32-fork output has now become 72 :D
Just to clarify, I understand that those results might be (and I
expect them to be) completely different on other hashes, and my system
is designed with that in mind, but I want to have a reliable and
efficient way of splitting mask attack tasks that will be the
foundation of future improvements.
Please let me know if you have any suggestions on what can be changed
or improved :)