john-users - Re: lessons to learn from the contest (was: Defcon18 "Crack Me If You Can" Complete Pot File)

Message-ID: <AANLkTimJLDVA1EMri_ApOKWyeyR7PbJs6eSAxuVXaxBK@mail.gmail.com>
Date: Thu, 26 Aug 2010 09:51:50 -0400
From: Charles Weir <cweir@...edu>
To: john-users@...ts.openwall.com
Subject: Re: lessons to learn from the contest (was: Defcon18
 "Crack Me If You Can" Complete Pot File)

I think Minga raised a couple of good points. I probably should start
by saying that I have no experience cracking corporate passwords. All
my work has been with publicly disclosed passwords on the web so 99%
of the passwords/hashes I've dealt with have been for various
websites. That's part of the reason why this contest list is very
useful to me. Yes, it was artificially generated, and therefore won't
match real life exactly, (this isn't a slam. No matter how careful you
are in creating it, that's always going to be a problem), but until a
corporate AD server's hashes get posted to the net, this is the best
dataset I have to work with. What's even more interesting, (at least
to me anyway), is how the various teams adapted to it when they
realized that the contest hashes weren't like all of the web passwords
we've all gotten spoiled in cracking. I think that the combination of
different hash types, having to discover and generate new rules in the
middle of a cracking session, and the limited amount of time
available, really exposed a lot of weaknesses, (or at least room for
improvement), in people's cracking strategies.

>> From some recent data, an Active Directory has 40000 accounts in it. The
>> 2nd, 4th and 9th most common password revolve around the 3-letter months
>> prepended to a string. All in all, over 5000 of the 40000 passwords have
>> a month as part of the password. (Approx 12.5 %). To discount the
>> fact that over 12%  of the cracked hashes can be cracked with the
>> following rules, is illogical:

The above statement shows it's a real shame that Minga/KoreLogic's
talk didn't get accepted at Defcon. Those types of statistics are very
useful, and I think the talk would have helped explain the
goals/design of the contest much more clearly as well. Minga, I know
you've given that talk, "Cracking 3.2 Million Passwords, Supercharging
JtR Rules" at several other conferences before, but I've only been
able to find the abstract. Any chance of you posting the slides to
your website? Also I'd really appreciate it if you could post some
other test results to this list. For example, how effective is JtR's
default rule set and its Single ruleset against real password hashes,
which wordlists you've had success with, (or how do you generate your
custom wordlists), etc.

>> I am curious what other statistics and datasets other users have access to.
>> I based my opinions/data/rules off of a john.pot with:
>> 3.2 million privately obtained cracked passwords. (no public-record
>> password hashes) of which 1.2 million are NTLMs. I am currently
>> working on a list of 210,000 {SSHA}
>> hashes.

I'm a huge fan of data driven research. This may stem from the fact
that I've approached this field from a research perspective, (and read
way too many bad papers about password security that are based on pure
theory). This is actually an open question to everyone, and I'll
probably go into it in more detail in Rich's thread on gathering real
world statistics, but I would love it if more people could create
rules or run analysis on real world passwords, and share the results
even if they can't share the datasets, (for perfectly obvious
reasons).

>> All in all, at least on this mailing list,  I feel that when users
>> discuss patterns
>> they have seen - or rules they have written, we get blinded by the "best way"
>> to write this rule.
>>
>>I see this as being a waste of time. If I write a rule that says:
>>
>> $2$0$1$0  instead of AZ"2010"
>>
>> I know we are supposed to use 'AZ' now - but guess what - during the amount
>> of time it takes to argue about which method is "better" - I cracked another
>> 100 accounts - and gained root on another 300 UNIX machines.
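
For readers who don't write rules every day, here is a minimal Python
sketch of why the two spellings quoted above yield the same candidate.
It is not JtR's rule engine. The helper names are hypothetical, only
these two command forms are handled, and the position character in the
A command is simply assumed to mean "end of word".

```python
def append_chars(word: str, rule: str) -> str:
    """Apply a rule made only of $X commands (append the character X)."""
    if len(rule) % 2 != 0:
        raise ValueError(f"malformed rule: {rule!r}")
    for i in range(0, len(rule), 2):
        if rule[i] != "$":
            raise ValueError(f"unsupported command in rule: {rule!r}")
        word += rule[i + 1]
    return word


def append_string(word: str, rule: str) -> str:
    """Apply an A<pos>"STR" rule.

    Assumption: the position character is treated as "end of word",
    so the string is always appended.
    """
    if len(rule) < 4 or rule[0] != "A" or rule[2] != '"' or not rule.endswith('"'):
        raise ValueError(f"unsupported rule: {rule!r}")
    return word + rule[3:-1]


if __name__ == "__main__":
    # Illustrative base words only, not taken from any dataset.
    for base in ("password", "summer"):
        one_by_one = append_chars(base, "$2$0$1$0")
        as_string = append_string(base, 'AZ"2010"')
        print(base, one_by_one, as_string, one_by_one == as_string)
```

Both forms generate identical guesses, so the choice between them is
about readability and speed rather than about which passwords get
cracked.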

I'll freely admit that this list can be intimidating. It took me a
long while to realize that if you post something here, a couple of
things will happen:
A) It will get torn up, as people will find and suggest solutions to
every possible flaw or improvement
B) You will have to be willing to argue your point if you suggest something

and more importantly
C) That's a good thing, and people really appreciate your work even as
they are suggesting improvements.

Ok, the first two points are pretty obvious, but it took me a while to
figure out the third one ;)  Take my original message about the
contest .pot file I created, which also spawned this thread. It's
really easy to miss the "Thanks" in Solar's reply when the next line
starts with "I think this was a mistake" ;) That being said, he was
right, and I've rebuilt the .pot file to take that into account, (Now
I just need to get off my lazy butt and repost it). The same goes for
the other criticisms/suggestions I've received over the years. I've
posted several rule sets before and I have really appreciated the
feedback on how to make them better, (even if I didn't follow the
suggestions since I rarely use JtR's built-in rule generator anymore).
I want to get better at what I do, and if I'm wrong I want someone to
call BS on me. That being said, we as a group probably need to be more
diplomatic with new users until they realize our comments stem from
our respect for what they are doing.

Here's an example of that. I would really like to get a 'weaponized'
copy of your ruleset into JtR's jumbo patch, (and eventually into the
release version).

It's easy to focus on the fact that I put 'weaponized' in quotes, but
all I mean is that you broke up your different rule types into
categories that would need to be run separately, e.g.
'[List.Rules:KoreLogicRulesAppendSeason]', instead of one ruleset for
everything like [Single] does. I know from our conversations at Defcon
you were also working on sorting your rules based on the number of
guesses they generated vs. the number of passwords cracked, so that
would help as well. Also, I think it would be useful for the people on
this list to work on the rules to get them running as fast as possible
before we include them in a popular JtR build.
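
To make the sorting idea concrete, here is a rough Python sketch of
ranking rulesets by passwords cracked per guess. It is one possible
reading of "guesses generated vs. passwords cracked", not KoreLogic's
actual method. The RuleStats type, the second and third ruleset names,
and every number below are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Per-ruleset counters from a hypothetical cracking run."""
    name: str
    guesses: int   # candidate passwords the ruleset generated
    cracked: int   # hashes cracked by those candidates

    @property
    def efficiency(self) -> float:
        """Passwords cracked per guess; 0.0 if the ruleset made no guesses."""
        return self.cracked / self.guesses if self.guesses else 0.0


def rank_rulesets(stats):
    """Return rulesets ordered from most to least efficient."""
    return sorted(stats, key=lambda s: s.efficiency, reverse=True)


if __name__ == "__main__":
    # Invented numbers, not measurements from any real dataset.
    sample = [
        RuleStats("KoreLogicRulesAppendSeason", guesses=4_000_000, cracked=200),
        RuleStats("ExampleAppendYear", guesses=1_000_000, cracked=500),
        RuleStats("ExamplePrependMonth", guesses=2_000_000, cracked=400),
    ]
    for s in rank_rulesets(sample):
        print(f"{s.name}: {s.efficiency:.6f} cracked per guess")
```

Running the most efficient rulesets first matters most when time is
limited, as it was in the contest.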

So what the above says is, "I really like your work, and here are a
couple of suggestions to make it better so we as a group can really
start kicking butt with it". I need to get better at phrasing that
though, and avoiding things like putting 'weaponized' in quotes since
it's so easy to misinterpret that. Also, one thing I'm really guilty
of is asking people to do more stuff once they help out a bit, (heck,
as the entire above e-mail shows). Everyone here contributes on a
volunteer basis, so please take our suggestions as us saying, "That's
a great idea, please keep working on it so I can use it myself", and
not a "You NEED to do this". Heck, I've ignored enough suggestions
over the years and they still let me continue posting ;)


Matt