Date: Sat, 10 Mar 2007 05:15:38 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Splitting the Charset into chunks

Pablo,

On Wed, Mar 07, 2007 at 12:47:27PM +0100, Pablo Fernandez wrote:
> So, if the charset is not to be divided into millions of finite jobs, then 
> I'll go for a discrete number of infinite jobs.

That would be totally unreasonable because it would imply that all jobs
but the first one would be trying very unlikely passwords only.
Luckily, as I had mentioned, there are a few hundred reasonably sized
chunks (not too small and not too large) that you can obtain by hacking
.chr files.  Things will be somewhat suboptimal in terms of the order in
which candidate passwords are tried, but not totally unreasonable.

No, I do not recommend this approach, but you've asked for it and it's
the best you can get by hacking .chr files and not touching the code.

> I've seen an External mode for parallelizing already built into the examples
> (External:Parallel), but it seems to me rather inefficient: it basically
> calculates whether the word to be checked is for it, and if not, skips it; so
> if we have 100 nodes, each node checks one word and skips 99.

You're correct, it won't scale well enough for fast hashes with no salts or
only a small number of different salts.  For slow hashes and/or many
different salts, it might work OK.  Its other problem is that it might
misbehave after an interrupted session is recovered.  This may be addressed
by computing a checksum of word[] instead of using the word number -
Otheus did this in his MPI and non-MPI patches (please search the list
archives).
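
For reference, here is roughly what such a checksum-based filter() could
look like in John's external mode.  This is a sketch only, not Otheus'
actual patch: the section name is made up, and the node and total values
are placeholders to be set differently on each node.  (The stock Parallel
mode instead tests a running word counter with "number++ % total", which
is what gets thrown off by session recovery.)

[List.External:ParallelChecksum]
int node, total;	/* this node's number (1-based) and node count */
int i, check;		/* work variables for filter() */

void init()
{
	node = 1; total = 100;	/* placeholders - set per node */
}

void filter()
{
	/* Derive the node assignment from the word itself rather than from
	 * a word counter, so it stays consistent across session restores. */
	i = 0; check = 0;
	while (word[i])
		check += word[i++];
	if (check % total != node - 1)
		word = 0;	/* not this node's word - skip it */
}

A simple additive checksum only spreads the words across nodes roughly
evenly (anagrams all land on the same node); a better mixing function would
balance the load more evenly, but the principle is the same.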

> I've tested how inefficient it is, and it came out to be quite OK.  For 1000
> nodes there's a 25% loss on every node, for 250 there's an 8% loss.  I've
> tried with more numbers, and arrived at this relation: 0.030% loss on every
> node per node in the cluster, 0.025% above 1000 nodes, and 0.008% above
> 10000 nodes, probably thanks to the cache :)
> 
> Those seem to be good numbers, but they're not for a very big cluster.  If
> you've got 10000 machines, you lose 80% of the power on every machine looking
> for the next word to try.  So I think I'll try to write code for a small
> cluster (let's say 300 nodes with a 10% loss... after all, having access to
> 10k machines is not that easy xD, but still 10% is big) unless anybody knows
> a better way to split the charset.

The above numbers don't mean anything without you also mentioning the
hash type and number of different salts.

> Does anybody know a better way to skip a given number of words in a row from
> an external filter?  I guess it's impossible as far as I understand it.

There's no better way to do it from an external filter().

> Solar, you spoke about a way to split a charset that doesn't involve skipping
> some order[] entries in inc.c.  Can you tell me about it?

Yes - you can simply omit those order[] entries from the .chr files and
move the following entries (that you'd like to leave in, if any) to
lower indices.  Then fill the remainder of order[] with entries that use
valid but out of range (for your john.conf settings) values for length
or count.

To me, this is a dirtier hack than patching inc.c to skip some order[]
entries would be.
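
As a rough illustration of that cleaner approach, the per-node partitioning
could look something like the standalone sketch below.  Everything in it is
hypothetical: the (length, fixed, count) triplet layout of the order[]
entries is assumed from the description above, the data is fake, and
this_node / total_nodes would have to come from options that don't exist
yet - it is not the actual inc.c code.

#include <stdio.h>

int main(void)
{
	/* A few fake order[] entries; real .chr files store many more. */
	unsigned char order[] = {
		8, 0, 95,  8, 1, 95,  8, 2, 95,  8, 3, 95
	};
	int total_nodes = 2, this_node = 0;	/* placeholders */
	int i, entry = 0;

	for (i = 0; i + 2 < (int)sizeof(order); i += 3, entry++) {
		/* Skip entries assigned to other nodes; this is the whole
		 * "skip some order[] entries" idea in one place. */
		if (entry % total_nodes != this_node)
			continue;
		printf("node %d: entry %d (length=%d fixed=%d count=%d)\n",
		    this_node, entry, order[i], order[i + 1], order[i + 2]);
	}

	return 0;
}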

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC 6F6E  FC55 A2FC 027C 5B34 1F15
http://www.openwall.com - bringing security into open computing environments

Was I helpful?  Please give your feedback here: http://rate.affero.net/solar
