john-users - Re: 'PassGAN: A Deep Learning Approach'

Message-ID: <CAJ9ii1ECOX7gWWXYyqywbNR=jtQXi39d53WJ_sfmVbWnJ-h9XQ@mail.gmail.com>
Date: Mon, 2 Oct 2017 13:57:12 -0400
From: Matt Weir <cweir@...edu>
To: "john-users@...ts.openwall.com" <john-users@...ts.openwall.com>
Subject: Re: 'PassGAN: A Deep Learning Approach'

>> Anyway, hopefully my rant above is somewhat interesting.

Absolutely, and thank you once again! I appreciate you taking the time
to reply to everyone on this list. Also, thanks for the offer to talk
about this in a call, but I'd probably just waste your time. While I'm
trying to keep up with various machine learning concepts, I'm still
squarely in hobbyist mode and am unlikely to do anything serious
for a while. I'm sure this is useful to other people on this list
though... So thanks again!

I am curious how you actually generated your guesses to input into
JtR. For example, did you run a "normal" cracking process but when you
found out their dog's name was 'fido' you would run that word + a lot
of mangling rules? To put it another way, were you using the ML for
information gathering then using more traditional rules to make use of
the collected data, or were you using ML to identify mangling rules
and order them as well?

Also is any of your code public?

Thanks again,
Matt

On Tue, Sep 26, 2017 at 10:27 PM, Tim Yardley <yardley@...il.com> wrote:
> Matt,
>
> Happy to provide some elaboration, but I won't go into a lot of detail here for various reasons. I would be happy to have a call to discuss aspects in more detail as well if you want to discuss in depth and timing allows.
>
> On to your questions...
>
> I played with NuPIC as well as some other toy variants of the standard HTM model. Each of them (at that time) was lacking to a certain degree. In early versions there were also a LOT of bugs. I ended up rolling my own code for the pieces of what I was doing that departed from the theorized direct (1:1) mapping, which allowed for some more flexibility for the particular task at hand. Another way of saying that is that I borrowed ideas and concepts, and even some base code, from that effort and then had to diverge from it. I haven't had a chance to even look at that project or its code in several years (and my original implementation used other ANN-style techniques, so HTM was a port in hopes of improvement).
>
> Some issues arose purely in the mapping of concepts, requiring some adaptation. For example, the information that was gathered was fuzzy, relatively unstructured, and not correlated in any way. As a result, it had to be mapped and grouped (sort of a pre-classification) based on other methods, and then this was repeated in stages. I provided a human-factors or psychology-style mapping in the form of intelligence gathering that helped facilitate the categorization/classification. For example, if someone is determined to have kids, their passwords may have a correlation to that. If someone is a sports fan, their passwords may be correlated to that. Men have certain predispositions toward certain conventions, as do women. In other words, I assisted information gathering by creating a target list of search terms to combine with the user profile input to create a "refined" model of who that clump of data may correlate to. Long story short, refined is in quotes for a reason there.
>
> There were a number of other approaches that came to bear in the end, but I won't go through and iterate all of them here for the sake of my own sanity. I also had some design constraints in implementation based on how I had structured some of my prior efforts (and what I effectively inherited due to laziness and not wanting to re-implement my prior work).
>
> Some of the decisions that were made are architectural ones, comparable to the types of problems that have to be addressed in real-time, streaming-based analysis (online decisions, etc.) when approaching machine learning topics. In other words, predictions had to be made on the fly (in this case fed to JtR, since that was the easiest way for me to then run them against the hashes), the algorithm had to learn constantly and adapt to the information coming in (OPSEC information gathering was continuous because new information could be indexed at any time), and it had to run in a generally autonomous way (unsupervised and as automated as possible -- although I had a Konami code-style feedback mechanism that allowed me to "help" the system when it was failing horribly by providing it an additional level of insight from my own head), etc.
>
> I've toyed with the concept of a semi-trained reverse anomaly detection as a way of providing further exploration of this space, but I have neither the energy nor the spare brain cycles to go about implementing a POC anymore. Too much to do in my day job, and too many personal interests to pursue it in the late hours as well. The basic thought behind that concept is to determine what is "normal" in password selection by training on passwords that people picked, and then to try to reverse the pattern of why they may have picked them by changing the direction of the analysis. That would give the perceived norm, and potential password options could then be weeded out by determining whether they are "anomalies" from that norm (in other words, they don't fit the model of how that profile would have picked a password). A problem with that approach is that it has an inherent assumption that people are always predictable (to a degree) and that passwords therefore follow that same concept.
>
> With the uptick in randomly generated passwords (and leveraging of password managers), bio based authentication, etc etc ... I am probably less interested in password identification than I am in other topic areas. I'm generally a sucker for wanting to understand the why behind decisions people make though, so it will always be a curiosity of mine. I also have a general inclination to avoid IRBs, so purely studying people for that purpose won't happen either.
>
> Anyway, hopefully my rant above is somewhat interesting.
>
> Tim
>
> --
>
> Tim Yardley
> yardley@...il.com
>
>> On Sep 26, 2017, at 1:17 PM, Matt Weir <cweir@...edu> wrote:
>>
>> Tim,
>>    I'll have to admit I read through your reply several times and have
>> all sorts of questions! For example, did you use NuPIC for your HTM
>> implementation and if so what were your impressions of it?
>>
>> Circling back to passwords, I'd be interested in hearing your thoughts
>> on what might be a good approach for machine learning + password
>> security. You obviously seem to have a lot of practical experiences in
>> the subject.
>>
>> Personally I'd like to see research move beyond looking into the
>> conditional probability of letters and instead focus on identifying
>> the underlying thought process behind password selection. That is, as
>> humans we're pretty good at looking at passwords and going:
>>
>> 1qazse4rfvgy7 = keyboard walk
>> p@...0rd = l33t sp33k
>> 1qazp@...0rd = keyboard walk + l33t speak
>> johnusersmailinglist123 = 4 words + digits
>>
>> Just automating that, even without adding in guess generation in
>> probability order, would be nice. Right now most of that is done via
>> simple pattern matching (zxcvbn is a good example
>> https://github.com/dropbox/zxcvbn). There are certainly areas for
>> improvement.
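
The kind of simple pattern matching described above can be sketched roughly as follows. This is a toy illustration only, not zxcvbn's algorithm and not code from anyone in this thread. The names (KEY_POS, LEET, WORDS, is_keyboard_walk, split_words, classify), the tiny word list, the leet substitution table, and the "adjacent keys on a simplified QWERTY grid" rule are all assumptions made up for the example.

```python
import re

# Simplified QWERTY grid: keys count as adjacent if they are at most one
# row and one column apart (an assumption; real layouts are staggered).
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}

# Hypothetical leet substitutions and a toy dictionary.
LEET = str.maketrans({"@": "a", "4": "a", "3": "e", "0": "o",
                      "$": "s", "5": "s", "7": "t"})
WORDS = {"john", "users", "mailing", "list", "hacker", "monkey"}


def is_keyboard_walk(s, min_len=4):
    """True if every consecutive pair of characters sits on adjacent keys."""
    if len(s) < min_len or any(ch not in KEY_POS for ch in s):
        return False
    for a, b in zip(s, s[1:]):
        (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
        if max(abs(r1 - r2), abs(c1 - c2)) != 1:
            return False
    return True


def split_words(s):
    """Segment s into dictionary words; return the list, or None on failure."""
    best = {0: []}  # best[i] = one segmentation of s[:i]
    for end in range(1, len(s) + 1):
        for start in range(end):
            if start in best and s[start:end] in WORDS:
                best[end] = best[start] + [s[start:end]]
                break
    return best.get(len(s))


def classify(pw):
    """Return a list of pattern labels that match the whole password."""
    lower = pw.lower()
    labels = []
    if is_keyboard_walk(lower):
        labels.append("keyboard walk")
    unleet = lower.translate(LEET)
    if unleet != lower and split_words(unleet):
        labels.append("l33t speak")
    m = re.fullmatch(r"([a-z]+)(\d+)", lower)
    if m:
        words = split_words(m.group(1))
        if words:
            labels.append(f"{len(words)} words + digits")
    return labels or ["unknown"]


for candidate in ["1qazse4rfvgy7", "johnusersmailinglist123", "h4ck3r"]:
    print(candidate, "=", " + ".join(classify(candidate)))
```

Note that this sketch only labels whole passwords. Detecting combinations such as a keyboard walk followed by a leet-speak word would require matching patterns on substrings, which is where the real difficulty (and the room for improvement) lies.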
>>
>>
>> On Tue, Sep 26, 2017 at 9:55 AM, Tim Yardley <yardley@...il.com> wrote:
>>> Matt,
>>>
>>> I agree with your analysis. Even the CMU work is just so-so in this
>>> particular domain. For publicly released toolsets though, it's not bad.
>>>
>>> To briefly explain... In private work, I applied a slightly adapted
>>> HTM model and external data sources to build behavioral models based
>>> on the presumed user profile (built, as an example, via Google searches
>>> for content related to the email address or name) and that type of
>>> approach was, let's say... very successful even in the cases of common
>>> names. The behavioral profiles I built also had less tightly bound
>>> criteria that could be presumed in some way from the username
>>> or other information. If all of that failed, it applied a "general
>>> model" that was an aggregate of preferences across different profiles.
>>> Just OPSEC applied in an automated way really.
>>>
>>> In reading the PassGAN paper, I applauded the concept of applying GANs
>>> to this, but my applause was short-lived, sadly.
>>>
>>> Tim
>>>
>>>
>>> On Tue, Sep 26, 2017 at 8:25 AM, Matt Weir <cweir@...edu> wrote:
>>>> Oh, and my apologies for typoing your name Jeroen!!! Just realized
>>>> that after hitting send.
>>>>
>>>> Matt
>>>>
>>>> On Tue, Sep 26, 2017 at 9:23 AM, Matt Weir <cweir@...edu> wrote:
>>>>> Thanks for sending that along Jeoren!
>>>>>
>>>>> I've gone through that paper a number of times now. As background for
>>>>> the people on this mailing list who don't want to read it, the paper
>>>>> describes using Generative Adversarial Networks (GANs) to train a
>>>>> neural network to create password guesses. In a way, it is very
>>>>> similar to the earlier work done by CMU on using neural networks to
>>>>> crack passwords. CMU's code is here:
>>>>>
>>>>> https://github.com/cupslab/neural_network_cracking
>>>>>
>>>>> And if you actually want to get that code to run I highly recommend
>>>>> checking out Maximilian's tutorial here:
>>>>>
>>>>> https://www.password-guessing.org/blog/post/cupslab-neural-network-cracking-manual/
>>>>>
>>>>> Both the PassGAN and the CMU teams generate guesses much like JtR
>>>>> --markov and --incremental modes by using the conditional
>>>>> probabilities of letters appearing together. For example, if the first
>>>>> letter is a 'q' then the next letter will likely be a 'u'. A more
>>>>> sophisticated example would be, if the first three characters are '123',
>>>>> then the next character will likely be a '4'.
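
The conditional-probability idea described above can be illustrated with a small character n-gram counter. This is a minimal sketch, not the PassGAN, CMU, or JtR implementation; the function names (train_ngrams, next_char_probs) and the toy training list are assumptions made up for the example.

```python
from collections import Counter, defaultdict


def train_ngrams(passwords, order=1):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(Counter)
    for pw in passwords:
        for i in range(order, len(pw)):
            context = pw[i - order:i]
            counts[context][pw[i]] += 1
    return counts


def next_char_probs(counts, context):
    """Return P(next char | context) as a dict; empty if context is unseen."""
    seen = counts.get(context)
    if not seen:
        return {}
    total = sum(seen.values())
    return {ch: n / total for ch, n in seen.items()}


# Toy training data, invented for illustration only.
training = ["queen", "quick", "quiet", "qwerty", "123456", "12345", "1234"]

first_order = train_ngrams(training, order=1)
third_order = train_ngrams(training, order=3)

print(next_char_probs(first_order, "q"))    # {'u': 0.75, 'w': 0.25}
print(next_char_probs(third_order, "123"))  # {'4': 1.0}
```

A guess generator built on such a table would extend candidates one character at a time, preferring the most probable continuations. The neural approaches replace the count table with a learned model of the same kind of distribution.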
>>>>>
>>>>> Where PassGAN is different from the CMU approach is mostly from the
>>>>> training stage as far as I can tell. While I can't directly compare
>>>>> the two attacks since I'm not aware of the PassGAN code being publicly
>>>>> released, at least based on reading the papers the CMU approach is
>>>>> much, much more effective.
>>>>>
>>>>> Actually the PassGAN paper is a bit of a mess when it comes to looking
>>>>> at other password cracking approaches. For example it uses the
>>>>> SpiderLabs ruleset for JtR vs the default one, or --single. The actual
>>>>> results of PassGAN were very poor, and while the team said that
>>>>> combining PassGAN with Hashcat's best64 ruleset + wordlist cracked
>>>>> more passwords than just running best64, they didn't bother to
>>>>> contrast that with other attack modes + best64.  Long story short, the
>>>>> research is interesting but if you are looking to use neural networks
>>>>> for generating password guesses the current go-to is still the CMU
>>>>> codebase.
>>>>>
>>>>> Matt
>>>>>
>>>>> On Tue, Sep 26, 2017 at 6:33 AM, Jeroen <spam@...lab.nl> wrote:
>>>>>> FYI: [1709.00440] PassGAN: A Deep Learning Approach for Password Guessing
>>>>>> @<https://arxiv.org/abs/1709.00440>.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Jeroen
>>>>>>
>>>>>>
>