Date: Wed, 18 Apr 2012 23:35:23 +0200
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: Re: automation equipped working place of hash cracker,
 proposal

Hi Aleksey,

thanks for sharing your thoughts.

On 04/18/2012 10:27 PM, Aleksey Cherepanov wrote:
> On Mon, Apr 16, 2012 at 10:52:30AM +0200, Simon Marechal wrote:
>> If I was to design this, I would do it that way :
>> * the server converts high level demands into low level job units
>> * the server has at least a network API, and possibly a web interface
>> * the server handles dispatching
> 
> I think the easiest way to split a cracking task into parts for distribution
> is to split the candidate list, to granulate it: we run our underlying attack
> command with '--stdout', split its output into packs, and distribute those
> packs to nodes that will just use them as wordlists. Pros: it is easy to
> implement, it is flexible and upgradable, it supports modes that we don't
> want to run to the end, like incremental mode, and all attacks could be
> parallelized this way (if I am not wrong). Cons: it seems to be suboptimal,
> and it does not scale well (candidate generation could become a bottleneck,
> though it could be distributed too),

I'm afraid network bandwidth will soon become a bottleneck, especially
for fast saltless hashes.
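
To illustrate, the scheme you describe could look roughly like this (a
sketch only; file names, pack size and hash format are made up):

  # on the server: generate candidates and cut them into packs
  ./john --incremental --stdout | split -l 1000000 - pack.

  # on a node: use the assigned pack as a wordlist
  ./john --wordlist=pack.aa --format=raw-md5 hashes.txt

With a fast saltless hash, a node can run through such a pack in seconds,
so the server would have to generate and ship packs at a very high rate.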

Here's another idea for splitting tasks:

The task will already be split into several parts, because there will
probably be password hashes of different formats.
Some hash formats will be better cracked using GPUs, while others will
probably be better distributed for CPU cracking, to make the best use of
the available hardware.
For fast hashes, the strategy will probably not be the same as for slow
hashes.

If we do have more clients than available hash formats, the tasks must
be split further, either by splitting the files containing hashes into
smaller parts, so that several clients try to crack different hashes of
the same format, or by letting different clients run different cracking
sessions.

Splitting input files with password hashes only makes sense for salted
hashes, and maybe it shouldn't even be done for fast salted hashes.
If we split the files, we have to make sure that different hashes for
the same salt will not be spread across different files.
If some salts appear more frequently than others, we should split the
salts into different files according to the number of hashes per salt.
This way, we can try more rules, or the same set of rules on larger
word lists, for those salts which occur many times.
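
As a rough sketch (assuming traditional descrypt hashes in user:hash
lines, where the salt is the first two characters of the hash field, and
an arbitrary threshold of 10 hashes per salt; adjust for other formats):

  # count how many hashes share each salt
  cut -d: -f2 hashes.txt | cut -c1-2 | sort | uniq -c | sort -rn

  # put frequent salts and rare salts into separate files
  awk -F: '{ s = substr($2, 1, 2); n[s]++; line[NR] = $0; salt[NR] = s }
    END { for (i = 1; i <= NR; i++)
      print line[i] > (n[salt[i]] >= 10 ? "frequent.txt" : "rare.txt") }' hashes.txt

Alternatively, the --salts=N option (jumbo, if I remember correctly)
makes a client load only the salts with at least N hashes, which might
give the same effect without splitting the files at all.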

Distributing tasks to different clients without transferring password
candidates probably requires that the clients use the same john version,
and also use a common set of word lists which can be distributed prior
to the contest.
If later on we realize that we need additional word lists or new .chr
files (or stats files for Markov mode), we could either implement a way
to distribute those new files as well, or restrict the tasks which use
these new files to a smaller set of clients with read access to a
directory on the central server.

Then, you could distribute tasks by generating small config files just
for a particular task, and by transferring the config file and the
command line to be used by the client.
That way, the amount of data that has to be transferred from the server
to the clients should be much smaller compared to generating and
distributing lists of password candidates.
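
For example (a sketch; section, file and format names are invented, and
--config= and --rules=NAME are jumbo options as far as I remember), the
server could send something as small as this:

  # task1234.conf
  # pull in the standard settings, then add the task-specific rules
  .include <john.conf>
  [List.Rules:Task1234]
  c$1$2$3
  c$!

together with the command line to run:

  ./john --config=task1234.conf --wordlist=common.lst --rules=Task1234 \
      --format=nt --session=task1234 hashes_nt.txt

A few hundred bytes of config plus a command line replace what could be
gigabytes of generated candidates.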

If you want to distribute a task which requires running incremental mode
for a certain time, even that should work.
IIRC, magnum implemented an option to specify how long a cracking
session should run before it gets interrupted automatically.
Just make sure the client also returns the resulting .rec file to the
server, so that it can be reused by the next client which continues the
incremental mode session, should we run out of other tasks to try.
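
Something along these lines (option names from memory, so they should be
checked against the build we will actually use):

  # run incremental mode for at most one hour, then stop cleanly
  ./john --incremental --format=raw-sha1 --session=inc-task7 \
      --max-run-time=3600 hashes.txt

  # the client returns inc-task7.rec (plus its pot entries); a later
  # client can then pick up where this one stopped
  ./john --restore=inc-task7

Restoring does require the same john build and the same input files on
the next client, which is one more argument for a pre-agreed common
setup.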

> while it
> would be easy to implement rechecking of results from untrusted nodes
> (contest environment)

I think the check we did during the last contest (verify that the
passwords in john.pot files transferred to the central server really
crack the corresponding hashes) is OK; more is probably not needed.
(If we detect a malicious client which reported wrong results, we can
still schedule the tasks that were executed on this client on another
client.)
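
The check itself is cheap. Something like this per submitted pot file
should do (paths and format invented; --pot= is a jumbo option, if I
remember correctly):

  # feed the reported plaintexts back in as a wordlist against a fresh
  # pot file, then list what they really crack
  cut -d: -f2- submitted.pot > claimed.txt
  ./john --wordlist=claimed.txt --format=nt --pot=verify.pot hashes_nt.txt
  ./john --show --pot=verify.pot --format=nt hashes_nt.txt

Anything the client claimed that does not show up in the last command's
output was a bogus report.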

> it would be hard to hide sensitive data from them (real life),

I think that for a real-life scenario we can assume secure network
connections and secure clients. If you have to worry about how well
secured your clients are, better not distribute any tasks to them.

> it does not respect work that is already done (by other distribution
> projects).

Avoiding duplicate work that has been done by clients which just
cooperate with the server on a lower level (e.g., by just dumping
john.pot files with cracked hashes into a directory) will be very hard,
if not impossible. Better not to waste your energy here.


If I can think of a useful reply to the rest of your mail, it will be in
a separate mail. But probably not today.


Frank
