john-dev - Re: Using Probabilistic Context Free Grammars (Was precomputed attacks)

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJ9ii1HaerHL03bHGtJ5PnVJ8xQ03zS7p14ibpf7p9h4Lfy-yg@mail.gmail.com>
Date: Mon, 29 Jun 2015 13:17:30 -0400
From: Matt Weir <cweir@...edu>
To: john-dev@...ts.openwall.com
Subject: Re: Using Probabilistic Context Free Grammars (Was
 precomputed attacks)

>> The main downside for me is that PCFG is not easily parallelizable so no
GPU for it.

Well I guess *technically* it is parallelizable, it's just that the memory
requirements currently aren't compatible with implementing the full PCFG in
the GPU, (aka you are not going to load up your dictionaries there). In a
CPU setup you can do a whole lot of parallelization with splitting up the
work on filling out the terminal structures and actually making the
guesses. In that setup I'll admit the core PCFG "next" function that finds
the next terminal structure to use would be a blocking item but I suspect
even that could be parallelized quite a bit if you are willing to make
guesses that aren't exactly in probability order, (aka you are willing to
try a slightly lower probability password guess before the highest possible
one remaining). At the end of the day it really is a wordlist attack, (it's
just creating very fine grained mangling rules), so worst case you can even
pre-generate the mangling rules and save them to disk.

So yeah, not easily parallelizable ;p If that was a priority then there at
least is a way forward on CPU parallization. For GPU cracking it might be
better to handle that like most implementations currently do with Wordlist
attacks by giving a set of mangling rules for a small subset of dictionary
words. It's just in this case those mangling rules would be selected by the
PCFG vs hand crafted.

>> I also find your performance comparison unfair given that you don't take
into account implementation speed

When talking about performance I think it comes down the the law of
diminishing returns. There are a couple of costs associated with a password
cracking session
1) The cost of making a plaintext guess. The overhead of incremental mode,
PCFGs, PRINCE, etc come into play here
2) The cost of checking a guess. This is the cost of hashing/comparing.
3) The number of guesses required to crack a password. Aka incremental will
likely require less guesses than dumbforce mode

A true performance comparison would take into account all three of those
and I tried to address the actual time involved in my dissertation,
(section 5.3.3 Time and Memory Requirements). I'll admit in retrospect I
needed to run longer sessions though, (aka I did a "dumb" comparison that
included the amount of time JtR used starting up vs checking the guesses
per second output that JtR gives out), and since then JtR's incremental
mode has gotten even faster while my code has gotten slower with added
features....

That being said, with the cost of checking guesses going up with stronger
hashing algorithms, the cost of making a plaintext guess in comparison to
the overall cost goes down. Therefore it may be optimal to spend more time
making targeted guesses to reduce the overall number of guesses made.
That's why right now I'd say the overhead is probably too high to make
PCFGs the best choice for fast hashes like unsalted MD5s, but when the
hashing algorithm gets more expensive it starts to be more useful.

That's also why most papers focus on the guessing efficiency of different
algorithms, (the number of guesses required to crack a certain percentage
of passwords). It is easy to make a repeatable test using that and it's
assumed the cost of hashing will go up while future improvements to
the algorithm will make it faster, so the most relevant piece of info to
share is the guessing efficiency. I'm not arguing that performance
statistics aren't worthwhile, (especially when looking to translate them to
the real world). It's just that if you are doing the initial research it's
more appealing to look at which underlying technique might be useful vs
optimizing your code.

Of course on the John-dev board we are worried about real world issues ;p I
don't want to sound like I'm advocating using PCFG techniques for
everything. I do think it could be useful in certain situations though.

Matt

On Mon, Jun 29, 2015 at 12:17 PM, Alain Espinosa <alainesp@...ta.cu> wrote:

>
>
> -------- Original message --------
> From: Matt Weir <cweir@...edu>
> Date:06/29/2015 11:09 AM (GMT-05:00)
> To: john-dev@...ts.openwall.com
> Cc:
> Subject: [john-dev] Using Probabilistic Context Free Grammars (Was
> precomputed attacks)
>
> ...Now for the downsides. The overhead is pretty high so against fast
> hashes the increase in precision isn't worth the slower guessing rate. Also
> the grammar is imprecise enough that against really slow hashes hand
> written mangling rules still are the way to go. So it currently performs
> best against medium speed hashes.
>
> I read your Dissertation, is very interesting. The main downside for me is
> that PCFG is not easily parallelizable so no GPU for it.
>
> I also find your performance comparison unfair given that you don't take
> into account implementation speed (this is a very common problem in papers
> talking about password generation). For example if brute force is 4x faster
> than incremental we need to test it with 4x bigger tries, so we had the
> same time spent in each attack.
>
> One thing that wake my interest is entropy calculation. One way to view
> this is that 10 characters passwords are only 4.8x stronger that 8
> character password. Given my experience I was expecting more, or perhaps I
> was too used to contest hashes.
>
> Regards,
> Alain
>
>
>
>

Content of type "text/html" skipped
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.