john-users - Re: External filter question

Message-ID: <20110912172729.GA3681@openwall.com>
Date: Mon, 12 Sep 2011 21:27:29 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: External filter question

On Mon, Sep 12, 2011 at 10:35:09AM +0200, Pablo Fernandez wrote:
> Actually, I have been making myself some numbers, and it turns out to be 
> better than Parallel, given the same conditions: jobs will have a limited 
> duration.
> The reason for being better is because each job only "skips" until it's its 
> turn, and then finishes. On the Parallel all jobs "skip" all the time. For 
> example, with blocks, given you want to test the first 10k passwords with 10 
> jobs:
> - First job doesn't skip. It computes 1k passwords and exits
> - Second job skips 1k, computes 1k, exits
> - Last job skips 9k, computes 1k, exits.
> All in all, time spent skipping: 0+1+2+3+..+9 = 45k skips.
> The same, with Parallel, 10k passwords, 10 jobs:
> - All jobs compute 1k, and skip 9k.
> All in all, time spent skipping: 10*9 = 90k skips.
> 
> Does this make sense?

Yes, but it works this way only for the very first batch of jobs.  Once
your nodes are done with those first 10k passwords, if you want to test
the next 10k, you incur 145k skips for those (10k+11k+...+19k), or 190k
total.  This is greater than the 180k total skips you'd incur for 20k
passwords with Parallel.  And things will only get worse (a lot worse)
for subsequent batches.
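
The skip counts above can be checked with a short Python sketch.  The
function names and the defaults (10 jobs, 1k candidates per job per
batch) are illustrative assumptions taken from the example in this
thread; nothing here is part of John itself.

```python
def block_skips(batches, jobs=10, block=1000):
    """Total candidates skipped when each job handles one contiguous block per batch."""
    total = 0
    for b in range(batches):
        for j in range(jobs):
            # Job j of batch b has to skip every candidate before its own block.
            total += (b * jobs + j) * block
    return total


def parallel_skips(batches, jobs=10, block=1000):
    """Total candidates skipped when each job tests only every jobs-th candidate."""
    # Each job walks the whole range and skips (jobs - 1) out of every jobs candidates.
    return batches * jobs * (jobs - 1) * block


for batches in (1, 2, 3):
    print(batches, block_skips(batches), parallel_skips(batches))
```

Under these assumptions the block approach skips less only for the
first batch; from the second batch on, its total grows quadratically
with the number of batches while Parallel's grows linearly.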

You may try to make your blocks so large that you would not need to run
a second batch.  But then you'd be testing candidate passwords in a
highly non-optimal order.  For example, with 10 blocks of 1 million
candidate passwords each, you'd have a node testing the 9 millionth
password right away, but you wouldn't reach the 999,999th (nine times
closer to the start of the list because it was estimated as being a lot
more probable) until the very end.

> Indeed, yes. I wanted to make it variable-size blocks, depending on the 
> "expected" free time to use in the cluster. And also make it variable number 
> of compute nodes, you never know... so, as flexible as possible. No 
> MPI/OpenMP, by design. And Markov is a bit too strict.

Yes, I agree that there's a need for functionality like that.  I just
question the specific approach.

> Anyway, Parallel or Block are not too bad, if you have slow hashes (with Linux 
> - my target system - you barely see DES) and/or you have many salts.

Right, they may be acceptable in some cases.

> > Worse, it will also not even find some passwords, because there's not
> > only I/O buffering, but also crypto algorithm related buffering in JtR.
> 
> Is there a "safe" limit I can use? I could make the block perform, let's 
> say, 200 more hashes than it should (it would be less than a second) even if 
> it overlaps with the next block. What I can't accept is to perform *fewer* than 
> it should.

Yes, you may get past the non-I/O buffering by testing extra passwords.
The minimum number of extra passwords you'd need to test varies by hash
type and John build.  For non-OpenMP and non-GPU builds, 256 is almost
always sufficient, and for some hash types and builds much smaller
numbers are sufficient (sometimes even none).  (For OpenMP or GPU, it
can be thousands.)
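
As a rough sketch of how a job scheduler could apply such padding, the
following Python computes how many candidates each block job should
skip and how many it should then let through.  The function name and
parameters are hypothetical; the default of 256 is the figure mentioned
above for non-OpenMP, non-GPU builds and would need to be much larger
otherwise.

```python
def block_range(index, block_size, pad=256):
    """Return (skip, count) for the 0-based block number `index`.

    `count` includes `pad` extra candidates, so the job overlaps the
    start of the next block instead of ending short of its own.
    """
    skip = index * block_size
    count = block_size + pad
    return skip, count


# Example: 10 jobs covering 1k candidates each, padded by the default 256.
for job in range(10):
    skip, count = block_range(job, 1000)
    print(f"job {job}: skip {skip}, then test {count}")
```

The overlap costs a few redundant hash computations per job, which is
the trade-off proposed in the quoted question: testing slightly more
than needed rather than less.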

> > We might introduce a way for filter() to ask for process termination in
> > a later version.  In fact, maybe external mode functions should also be
> > able to ask for the status line to be displayed (simulate keypress).
> 
> Those two would be awesome to have! Any idea on how to do it? Maybe I could do 
> my work on a "patched" version,

You may try adding some of the event_* variables (see signals.h) to
ext_globals (turn it into an array) in external.c.

> pretending to work also in a future release.

The official approach might be different.

> > > - Or maybe, is there anything I can do to force John to flush the IO
> > > buffer before doing the word=1/0 operation?
> > No, not from external mode, and it's not only about the I/O anyway.
> 
> I have one question here... what happens if you do a sigkill? Does it exit 
> nicely (IO/crypt buffers)?

SIGKILL specifically kills the process at OS level, without giving it a
chance to do anything.  But if you use SIGTERM instead, then John will
terminate cleanly, flushing the I/O buffers to disk (but not necessarily
processing any buffered candidate passwords).
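
A minimal Python sketch of the SIGTERM-first approach, assuming a POSIX
system and a job wrapper that started the process itself.  The helper
name and the 10-second timeout are hypothetical, and the child process
used here is only a stand-in for a real john invocation.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, timeout=10.0):
    """Send SIGTERM and wait; resort to SIGKILL only if the process does not exit."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # SIGKILL gives the process no chance to flush anything.
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # Stand-in child; a real wrapper would have started john here instead.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit status:", stop_gracefully(child))
```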

> Is there a way to send a message to an external application,

Not from external mode.  (Well, the external mode VM lacks bounds
checking, which a malicious external mode can make use of... but yours
should not.)

Alexander