john-users - Re: our own training pseudo contest before CMIYC 2012

Message-ID: <20120711135239.GB11965@openwall.com>
Date: Wed, 11 Jul 2012 17:52:39 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: our own training pseudo contest before CMIYC 2012

On Wed, Jul 11, 2012 at 01:33:06PM +0200, Frank Dittrich wrote:
> On 07/11/2012 10:35 AM, Solar Designer wrote:
> > Our priority in the
> > real contest should be getting the hashes cracked, not learning a new
> > tool.  For example, in the real contest, unless we choose to focus
> > solely on our own stuff rather than on achieving a good score, I would
> > not even look at MJohn, unless I already tried it out in a similar
> > setting by that time and liked it (or at least did not hate it much).
> > I think it will/should be similar for others.
> 
> Right. Will Aleksey have something ready which can be tested about a
> week before the contest starts?

Maybe.

> Even if we test it in a pseudo contest, we have to be able to integrate
> results of users who don't use MJohn during the real contest.
> At least, we need to support users who just want to drop their john.pot
> files (or deltas compared to previous submissions or deltas compared to
> the current "central/global" john.pot on the server).

Yes, I think we should reuse/update last year's scripts for that (unless
this year's submission requirements are very different, which we won't
know in advance), and keep them separate from MJohn.
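
As a rough illustration of the "drop your john.pot or a delta" workflow
discussed above, here is a minimal Python sketch. It is not last year's
scripts. It assumes each john.pot line has the form hash:plaintext and
that the hash part ends at the first ':' (so plaintexts may contain
colons). The function names parse_pot, pot_delta, merge_into, and
format_pot are made up for this example.

```python
def parse_pot(lines):
    """Map each hash to its plaintext, splitting on the first ':' only.

    Assumption: the hash field itself contains no ':' character.
    Lines without a ':' and empty lines are skipped.
    """
    cracked = {}
    for line in lines:
        line = line.rstrip("\r\n")  # keep trailing spaces in plaintexts
        if not line or ":" not in line:
            continue
        hash_, plain = line.split(":", 1)
        cracked.setdefault(hash_, plain)  # first occurrence wins
    return cracked


def pot_delta(central, submission):
    """Return entries of `submission` whose hash is not yet in `central`."""
    return {h: p for h, p in submission.items() if h not in central}


def merge_into(central, submission):
    """Add new entries to `central` in place and return just the delta."""
    delta = pot_delta(central, submission)
    central.update(delta)
    return delta


def format_pot(cracked):
    """Render a mapping back into john.pot-style lines (no newlines)."""
    return [f"{h}:{p}" for h, p in cracked.items()]
```

A server-side script could call merge_into() once per uploaded file and
append format_pot(delta) to the central pot file. It works the same
whether a team member uploads a full john.pot or only a delta, since
already-known hashes are simply ignored.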

> > FYI, scripts I used on our contest server during CMIYC 2010 and 2011 did
> > not use --show=LEFT, but rather processed john.pot files directly.
> 
> IIRC, we were lucky that the input files already used the preferred
> canonical hash representation.

IIRC, there was some magic to unify/convert some hash encodings.  This
magic will have to be updated for the subset of hashes available in the
new contest, if different from last year's and/or if we're given hashes
in a different form.  I'm afraid that it's a task to work on _during_
the contest.  There are so many possible hash types and so many possible
encodings for some of them that we can't fully prepare in advance.
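
To make the "unify/convert" idea concrete, here is a small Python
sketch of a rule table that can be extended during a contest. It is not
the actual magic from the earlier scripts. The single rule shown
assumes, purely for illustration, that a bare 32- or 40-digit hex
digest should be compared in lower case. The real canonical form
depends on the hash type and on the JtR version, and the names
HEX_DIGEST, RULES, and canonicalize are hypothetical.

```python
import re

# Assumption for illustration: bare hex digests of 32 or 40 digits
# (e.g. MD5- or SHA-1-sized) are treated as case-insensitive.
HEX_DIGEST = re.compile(r"[0-9A-Fa-f]{32}|[0-9A-Fa-f]{40}")


def canonical_hex(h):
    """Hypothetical canonical form for bare hex digests: lower case."""
    return h.lower()


# Each rule is (predicate, converter); the first matching rule is used.
# New encodings seen during the contest would be appended here.
RULES = [
    (HEX_DIGEST.fullmatch, canonical_hex),
]


def canonicalize(h):
    """Return the canonical form of `h`, or `h` unchanged if no rule fits."""
    h = h.strip()
    for matches, convert in RULES:
        if matches(h):
            return convert(h)
    return h
```

Applying canonicalize() to both the contest's input hashes and the hash
field of submitted john.pot lines before comparing them would let two
encodings of the same hash match, as long as a rule covers them.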

> I imagine doing this with scripts will be harder if there are multiple
> valid representations for the same hash, and just one of those
> representations is used for john.pot lines (not necessarily the one that
> is found in the input files).

Yes.  However, I am unsure whether --show=LEFT solves this problem or
not, regardless of how it is implemented.  This would take some
thinking that I am too lazy/busy for right now.

> How easy is creating a GPU build for an average user?

I think the trickiest part is setting up a system to support such
builds - a working driver and SDK.  Building JtR on a properly set up
system is relatively easy, although the person may need to know and
specify the path to their SDK, etc.  Also, for CUDA some compile-time
tuning for the specific GPUs is desirable.  (For OpenCL, this is
automated and happens on John invocation.)

> (I didn't look at the how-tos (wiki, doc/*) for quite some time, so I
> have no idea.)

We have some info here:

http://openwall.info/wiki/john/GPU-setup - general system setup
http://openwall.info/wiki/john/GPU - JtR specifics

but we need more and better instructions.

> Could we prepare a Linux image with a pre-built john for GPU for 64-bit
> Ubuntu, which can be copied onto a 16 GB USB stick (or even 8 GB, if
> that's considered big enough)?

We could, but I doubt that it's a timely thing to do.  A person who
can't set this up on their own will surely also use the GPU so
suboptimally that it's probably no better than a CPU.  Our current GPU
support is, with few exceptions, only for people who know what they're
doing pretty well and have tried it out before the contest.  For
example, all of our "fast" hash implementations on GPU are terribly
inefficient because of the unresolved CPU/GPU communication
bottleneck (they're present in the tree as a development milestone only).
Yet a person who uses a pre-built USB image would likely miss this
detail and go ahead cracking e.g. raw MD5 on GPU, which wouldn't be a
useful contribution in the contest.

It would make more sense to provide a system image for a "slave" system
(not necessarily GPU-enabled) that would be controlled (ideally through
a protocol restricting what can be done) by one of our more experienced
team members, using a "master" tool that would make controlling
multiple systems practical.  But we're unlikely to do it this year.

> > For processing 2+ separate inputs, I'd currently use my
> > Perl scripts to preprocess wordlists, although it does make sense to get
> > similar functionality into John proper eventually (not for this contest).
> 
> Are those scripts located on the contest server?

I was referring to the double.pl, triple.pl, quad.pl, and mix.pl scripts
that I shared on this list before.
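
The behavior of those Perl scripts is not described in this message, so
the following is not a port of any of them. As a generic illustration
of preprocessing two separate inputs into one candidate list, this
Python sketch joins every word from one list with every word from
another. The function name combine and the sep parameter are made up
for the example.

```python
from itertools import product


def combine(first, second, sep=""):
    """Yield every word of `first` joined to every word of `second`.

    Note: itertools.product reads both inputs fully into memory before
    yielding, so this is only suitable for modestly sized wordlists.
    """
    for a, b in product(first, second):
        yield a + sep + b
```

Writing the generated candidates to a file, one per line, would give a
wordlist that can then be fed to John in the usual way.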

Alexander