john-users - Re: 'PassGAN: A Deep Learning Approach'

Message-ID: <CAJ9ii1GN=7NWhmF2B9ZccS5b0zpbdD0NWVD=wjC+4rLOBjFaSQ@mail.gmail.com>
Date: Tue, 26 Sep 2017 14:17:42 -0400
From: Matt Weir <cweir@...edu>
To: "john-users@...ts.openwall.com" <john-users@...ts.openwall.com>
Subject: Re: 'PassGAN: A Deep Learning Approach'

Tim,
    I'll have to admit I read through your reply several times and have
all sorts of questions! For example, did you use NuPIC for your HTM
implementation and if so what were your impressions of it?

Circling back to passwords, I'd be interested in hearing your thoughts
on what might be a good approach for machine learning + password
security. You obviously seem to have a lot of practical experience in
the subject.

Personally I'd like to see research move beyond looking into the
conditional probability of letters and instead focus on identifying
the underlying thought process behind password selection. That is, as
humans we're pretty good at looking at passwords and going:

1qazse4rfvgy7 = keyboard walk
p@...0rd = l33t sp33k
1qazp@...0rd = keyboard walk + l33t speak
johnusersmailinglist123 = 4 words + digits

Just automating that, even without adding in guess generation in
probability order, would be nice. Right now most of that is done via
simple pattern matching (zxcvbn is a good example
https://github.com/dropbox/zxcvbn). There are certainly areas for
improvement.
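
To illustrate the kind of simple pattern matching described above, here is
a minimal sketch in Python. It is not how zxcvbn works internally; the word
list, substitution table, keyboard layout, function names, and the sample
password "m@1l1ng" are all hypothetical and chosen only for illustration.
It labels a whole password with one structure and does not try to split
combined patterns such as keyboard walk + l33t speak.

```python
# Hypothetical, deliberately tiny stand-ins for a real dictionary,
# substitution table, and keyboard layout.
WORDS = {"john", "users", "mailing", "list"}
LEET = str.maketrans({"@": "a", "4": "a", "3": "e", "1": "i",
                      "0": "o", "$": "s", "5": "s", "7": "t"})
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}


def adjacent(a, b):
    """True if keys a and b touch on a staggered QWERTY layout."""
    (r1, c1), (r2, c2) = POS[a], POS[b]
    if r1 == r2:
        return abs(c1 - c2) == 1
    if r2 == r1 - 1:              # b is one row above a
        return c2 - c1 in (0, 1)
    if r2 == r1 + 1:              # b is one row below a
        return c2 - c1 in (-1, 0)
    return False


def is_keyboard_walk(pw, min_len=4):
    """True if every consecutive pair of characters is on adjacent keys."""
    pw = pw.lower()
    if len(pw) < min_len or any(ch not in POS for ch in pw):
        return False
    return all(adjacent(a, b) for a, b in zip(pw, pw[1:]))


def segment(text):
    """Split text into dictionary words, or return None if impossible."""
    if not text:
        return []
    for end in range(len(text), 0, -1):   # try longer words first
        if text[:end] in WORDS:
            rest = segment(text[end:])
            if rest is not None:
                return [text[:end]] + rest
    return None


def classify(pw):
    """Return a list of structure labels for a single password."""
    if is_keyboard_walk(pw):
        return ["keyboard walk"]
    core = pw.rstrip("0123456789")        # peel off trailing digits
    digits = pw[len(core):]
    plain = core.lower().translate(LEET)  # undo l33t substitutions
    words = segment(plain)
    if words is None:
        return ["unknown"]
    labels = []
    if words:
        labels.append(f"{len(words)} word" + ("s" if len(words) != 1 else ""))
    if plain != core.lower():
        labels.append("l33t speak")
    if digits:
        labels.append("digits")
    return labels or ["unknown"]


for pw in ("1qazse4rfvgy7", "johnusersmailinglist123", "m@1l1ng"):
    print(pw, "=", " + ".join(classify(pw)))
```

With these toy tables, the first two passwords come out as "keyboard walk"
and "4 words + digits". A real classifier would need far larger
dictionaries and a way to break one password into several sub-patterns.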


On Tue, Sep 26, 2017 at 9:55 AM, Tim Yardley <yardley@...il.com> wrote:
> Matt,
>
> I agree with your analysis. Even the CMU work is just so-so in this
> particular domain. For publicly released toolsets though, it's not bad.
>
> To briefly explain... In private work, I applied a slightly adapted
> HTM model and external data sources to build behavioral models based
> on the presumed user profile (built via Google searches as an example
> for related content to the email address or name) and that type of
> approach was, let's say... very successful even in the cases of common
> names. The behavioral profiles I built also had less tightly bound
> criteria that could be presumed in some way from the username
> or other information. If all of that failed, it applied a "general
> model" that was an aggregate of preferences across different profiles.
> Just OPSEC applied in an automated way really.
>
> In reading the PassGAN paper, I applauded the concept of applying GANs
> to this, but my applause was short-lived, sadly.
>
> Tim
>
>
> On Tue, Sep 26, 2017 at 8:25 AM, Matt Weir <cweir@...edu> wrote:
>> Oh, and my apologies for typoing your name Jeroen!!! Just realized
>> that after hitting send.
>>
>> Matt
>>
>> On Tue, Sep 26, 2017 at 9:23 AM, Matt Weir <cweir@...edu> wrote:
>>> Thanks for sending that along Jeoren!
>>>
>>> I've gone through that paper a number of times now. As background for
>>> the people on this mailing list who don't want to read it, the paper
>>> describes using Generative Adversarial Networks (GANs) to train a
>>> neural network to create password guesses. In a way, it is very
>>> similar to the earlier work done by CMU on using neural networks to
>>> crack passwords. CMU's code is here:
>>>
>>> https://github.com/cupslab/neural_network_cracking
>>>
>>> And if you actually want to get that code to run I highly recommend
>>> checking out Maximilian's tutorial here:
>>>
>>> https://www.password-guessing.org/blog/post/cupslab-neural-network-cracking-manual/
>>>
>>> Both the PassGAN and the CMU teams generate guesses much like JtR
>>> --Markov and --Incremental modes by using the conditional
>>> probabilities of letters appearing together. For example, if the first
>>> letter is a 'q' then the next letter will likely be a 'u'. A more
>>> sophisticated example would be, if the first three letters are '123',
>>> then the next letter will likely be a '4'.
>>>
>>> Where PassGAN differs from the CMU approach is mostly in the
>>> training stage as far as I can tell. While I can't directly compare
>>> the two attacks since I'm not aware of the PassGAN code being publicly
>>> released, at least based on reading the papers the CMU approach is
>>> much, much more effective.
>>>
>>> Actually the PassGAN paper is a bit of a mess when it comes to looking
>>> at other password cracking approaches. For example it uses the
>>> SpiderLabs ruleset for JtR vs the default one, or --single. The actual
>>> results of PassGAN were very poor, and while the team said that
>>> combining PassGAN with Hashcat's best64 ruleset + wordlist cracked
>>> more passwords than just running best64, they didn't bother to
>>> contrast that with other attack modes + best64.  Long story short, the
>>> research is interesting but if you are looking to use neural networks
>>> for generating password guesses the current go-to is still the CMU
>>> codebase.
>>>
>>> Matt
>>>
>>> On Tue, Sep 26, 2017 at 6:33 AM, Jeroen <spam@...lab.nl> wrote:
>>>> FYI: [1709.00440] PassGAN: A Deep Learning Approach for Password Guessing
>>>> @<https://arxiv.org/abs/1709.00440>.
>>>>
>>>> Cheers,
>>>>
>>>> Jeroen
>>>>
>>>>