john-users - Re: Can incremental mode be started in the middle?

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20200412130619.GA8243@openwall.com>
Date: Sun, 12 Apr 2020 15:06:19 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Can incremental mode be started in the middle?

On Sat, Apr 11, 2020 at 07:50:37AM -0500, Kevin Yochum wrote:
> On Sat, Apr 11, 2020 at 5:23 AM Solar Designer <solar@...nwall.com> wrote:
> > If it's in fact random and with a uniform distribution, then e.g. an
> > all-lowercase password is just as likely as any other.
> 
> Statistically, this is correct, in the same way that the numbers 1, 2,
> 3, 4, 5, 6 are just as likely as any other lottery combination, but
> the first successive numbers have never been drawn in the history of
> any lottery system. It is possible but has never occurred.

Does this mean you're less likely to win if you choose this combination
than any other?  In a fair lottery, no.  (OTOH, if you do win with a
highly non-random-looking combination like this, then you might be more
likely to have to share your prize with other winners who also bet on
the same combination.)

> I know my problem space. These are sample known (previously cracked) passwords:
> 
> H5EWip7gw916385
> UdAjKX4vRS2mt2
> GgDjZEVsOXh0Fb
> Pz7ocjGw1liQ7u
> 6AdI1n9VYwDMOR
> G2eQNRy0t8ETU7Q
> XG79hJEpa1
> HvTn9axLHH
> KHx3khpRdILHvYE

You can train incremental mode specifically on these samples (and
ideally on many more, if you have).  You can create a fake.pot file with
contents like this:

:H5EWip7gw916385
:UdAjKX4vRS2mt2
:GgDjZEVsOXh0Fb
:Pz7ocjGw1liQ7u
:6AdI1n9VYwDMOR
:G2eQNRy0t8ETU7Q
:XG79hJEpa1
:HvTn9axLHH
:KHx3khpRdILHvYE

then generate a custom.chr file with:

./john --pot=fake.pot --make-charset=custom.chr

Then use "--incremental=custom".  If you think the above samples do not
cover the charset exhaustively, yet you want to, you may list the missed
characters (in your desired order) on an "Extra = ..." line in the
"[Incremental:Custom]" section in john.conf.  Like this:

[Incremental:Custom]
File = $JOHN/custom.chr
Extra = ABCXYZorwhateverelse

BTW, I assume these samples were "previously cracked" by some different
means?  Perhaps that's your only real chance - either to also crack your
target password by such means, or to obtain way more samples, which
would hopefully expose much more of a pattern (if there's one).

> There are more uppercase letters (61) than lowercase letters  (45) and
> the digits (27) are in the middle, not just at the end. If I have to
> wait through the all-lowercase options with digits at the end before I
> get to the options that are more likely in my problem space, am I not
> waiting for at least 50% of the possible incremental mode combinations
> that are almost certainly not the solution?

No, not "at least 50%".  The more different character types you include
in a candidate password pattern, the larger the corresponding fraction
of the password space is.  This means that the fraction you'd like to
focus on is the largest one, and consequently the fractions you'd have
searched through first correspond to (way) less than 50%.

The shortest password samples you have above are of 10 characters long.
Here are the numbers of 10-character strings consisting of different
character sets:

26^10 = 141167095653376 (e.g., all lowercase)
36^10 = 3656158440062976 (e.g., lowercase and digits)
62^10 = 839299365868340224 (lowercase, uppercase, and digits)

Expressed as percentages of 62^10, these are:

100*26^10/62^10 = 0.017%
100*36^10/62^10 = 0.44%
100*62^10/62^10 = 100%

So you'd have searched less than 0.5% of your total password space if
you started by searching solely all-lowercase and lowercase with digits,
before starting to include uppercase letters.

Besides, the default Alnum mode, which you were running, actually starts
to occasionally include uppercase letters much earlier.

Anyway, given that you do have those samples, of course you'll improve
your chances by focusing the attack accordingly, as I described above.

> Based on the known passwords, I am fairly certain that the random
> password is not all lower case.

Of course.  I am "fairly" certain of that too, simply because there are
relatively few all-lowercase strings in your target search space.  But
does this mean it's worthwhile to exclude all-lowercase from your search
initially?  Possibly not, and for the same reason - there are relatively
few all-lowercase strings in your target search space, so excluding them
also does not buy you much.

Besides, just how certain are you?  Based on the math above, for length
10 you'd need to be more than 99.98% certain in order for skipping
all-lowercase at first to be beneficial.

I can see how this can feel counter-intuitive, but that's what we have.

Anyway, once again, you can reasonably use your samples.  That's
different from simply skipping all-lowercase, and would be expected to
help... just not enough to get your password cracked:

> It's certainly a slow hash. On my test hardware (which is a known
> problem), I'm only getting 78 p/s. After 17 hours, I'm only up to
> "siesendo..siellity" in incremental mode. (When I have the rules
> configured the way I want them, I'll move the configuration to faster,
> more expensive hardware.)

At that speed, or even 1000 times that speed, you're most likely out of
luck winning that lottery, not even by focusing the search based on the
(few) samples you have.

This reminds me of a similar scenario where having a set of samples like
yours was enough to win the lottery: it was the case with car
manufacturers' generated descrypt passwords on older versions of QNX.
They looked pretty much like yours, their characters were also
non-uniformly distributed (so training a custom incremental mode worked
well), but they were limited to length 8 (a limitation of descrypt) and
could be tested at a few billion per second (on multiple FPGAs).
However, your longer passwords and slower hash type look enough to
prevent this attack from succeeding in your case in practice.

I don't know where your passwords come from and what they're for, but
your better bet might be that your target password does _not_ follow
these patterns - e.g., that the usual process might somehow have been
subverted and the password was chosen by a human.  While very unlikely,
it is not nearly as unlikely as winning your described lottery.  What
you can more reasonably do is search through common password lists such
as RockYou.  You can't do much more than that at those speeds anyway.

Your other reasonable bet is to try and figure out the algorithm used to
generate those passwords (such as by obtaining and reverse-engineering
the code) and look for flaws in it.  For example, are passwords possibly
derived from PRNG seeds?  If so, iterate through the seeds instead of
passwords (and generate passwords from the seeds, then test them).  Are
the seeds (too) large but non-random?  If so, iterate through their
likely values.

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.