john-dev - Re: [GSoC] John the Ripper support for PHC finalists

Message-ID: <20150417011228.GA22693@openwall.com>
Date: Fri, 17 Apr 2015 04:12:28 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] John the Ripper support for PHC finalists

Agnieszka,

I've just suggested to you (off-list) that you work on the PHC finalists
in a specific order.  (We may revise it later, though.)  I'd like to
provide the rationale and additional advice:

POMELO - you're already working on it.

Parallel - it's the simplest, and also one where good performance on GPU
is expected.  Although Parallel can efficiently be run on GPUs defensively,
with a high parallelism setting, I suggest that in your implementations
you take advantage of the higher-level parallelism coming from having
multiple candidate passwords.  That way, you'll be able to exploit SIMD
and GPUs even when attacking Parallel hashes that use a low parallelism
setting.
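
To illustrate what I mean by candidate-level parallelism, here is a
minimal sketch in Python.  It does not implement Parallel: the iterated
SHA-512 below is only a stand-in for a sequential hashing loop, and the
names toy_hash_one and toy_hash_batch are made up for this example.  The
point is the loop order: the batch version advances all candidates one
round at a time, so each candidate maps naturally onto a SIMD lane or a
GPU work-item regardless of the hash's own parallelism setting.

```python
import hashlib


def toy_hash_one(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Stand-in sequential hash (NOT Parallel): iterated SHA-512."""
    state = hashlib.sha512(salt + password).digest()
    for _ in range(rounds):
        state = hashlib.sha512(state).digest()
    return state


def toy_hash_batch(candidates, salt: bytes, rounds: int):
    """Same computation, but rounds are the outer loop and candidates the
    inner one, mirroring how SIMD lanes or GPU work-items run in lock-step."""
    lanes = [hashlib.sha512(salt + pw).digest() for pw in candidates]
    for _ in range(rounds):
        # One "vector" step: every lane (candidate) advances by one round.
        lanes = [hashlib.sha512(state).digest() for state in lanes]
    return lanes


if __name__ == "__main__":
    words = [b"password", b"letmein", b"123456", b"qwerty"]
    for digest in toy_hash_batch(words, b"salt", 100):
        print(digest.hex()[:16])
```

In a real SIMD or OpenCL build, the inner list comprehension would
become a single vector operation or one kernel step across work-items.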

Lyra2 - it already includes some CUDA code.  You may look into
integrating that into JtR, as well as translating it to OpenCL (so we'd
have lyra2-cuda and lyra2-opencl).

yescrypt - it's relatively complex, so may be tricky to implement in
OpenCL.  OTOH, there's significant overlap with scrypt, which isn't a
PHC finalist per se, but is a (legacy) mode of operation of yescrypt,
and which we need supported on GPU anyway.  We already have scrypt
supported in JtR on CPU.  There are also performance numbers for scrypt
on GPU from other projects, such as from the many Litecoin miners for
r=1 and from the Hashcat project as well as from another team for other
settings.  So we'll be able to verify how close to optimal your
implementation got by checking it against those other implementations
and their published speed figures.  (For yescrypt's native modes,
though, your results would be brand new.)  Another aspect is that scrypt
(but not yescrypt's native modes) is deliberately time-memory tradeoff
(TMTO) friendly.  So you'd practice with exploiting TMTO while you work
on this.  Do not be afraid, in the case of scrypt it's easy and we're
readily familiar with it, so will provide guidance.
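
As a rough illustration of the TMTO, here is a toy sketch in Python with
the same structure as scrypt's ROMix: fill an array V sequentially, then
do data-dependent lookups into it.  The assumptions are mine: SHA-256
stands in for scrypt's BlockMix, blocks are 32 bytes, and the names
romix_full, romix_tmto and stride are invented for this example.  The
tradeoff keeps only every stride-th element of V and recomputes the
missing ones on demand, cutting memory by about a factor of stride at the
cost of extra hash computations.

```python
import hashlib


def mix(block: bytes) -> bytes:
    """Stand-in for scrypt's BlockMix (assumption: plain SHA-256)."""
    return hashlib.sha256(block).digest()


def integerify(block: bytes, n: int) -> int:
    """Derive a lookup index from the current block."""
    return int.from_bytes(block[:8], "little") % n


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def romix_full(x: bytes, n: int) -> bytes:
    """Store all n intermediate blocks, as a defender normally would."""
    v = []
    for _ in range(n):
        v.append(x)
        x = mix(x)
    for _ in range(n):
        j = integerify(x, n)
        x = mix(xor(x, v[j]))
    return x


def romix_tmto(x: bytes, n: int, stride: int) -> bytes:
    """Store only every stride-th block; recompute the rest when needed."""
    v = []
    for i in range(n):
        if i % stride == 0:
            v.append(x)
        x = mix(x)
    for _ in range(n):
        j = integerify(x, n)
        t = v[j // stride]           # nearest stored block at or before j
        for _ in range(j % stride):  # walk forward to reconstruct V[j]
            t = mix(t)
        x = mix(xor(x, t))
    return x


if __name__ == "__main__":
    seed = hashlib.sha256(b"example input").digest()
    assert romix_full(seed, 1024) == romix_tmto(seed, 1024, 4)
    print("TMTO result matches full-memory result")
```

With stride 4, each lookup costs about 1.5 extra mix calls on average
(assuming roughly uniform indices), while only a quarter of V is kept.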

Makwa - its author has recently made the bold claim on the public PHC
discussions list that Makwa is as GPU-unfriendly as bcrypt (even though
for different reasons).  We should try to confirm or disprove it.

battcrypt - it should be similar to bcrypt, albeit more complex to
implement in OpenCL (needs not only Blowfish, but also SHA-512).
We should verify that it also behaves bcrypt-like or worse in terms of
GPU attacks.

Catena - I listed this one closer to the end in part because there are
as many as four default instantiations of it.  I think we'll need to
only implement two: Catena-Dragonfly and Catena-Butterfly.  I think we
would not need to implement the -Full instantiations of these.  By the
time you get to Catena, it may be clearer which of the four PHC focuses on.

Pufferfish - it's fairly clear how this one will behave, which on one
hand provides a way to test your optimizations against the expected
outcome, but on the other hand makes this relatively uninteresting.
Also, Pufferfish as currently specified has recently been found to be
buggy for above 2 MB m_cost, and also to be arguably too slow to be
chosen as a PHC winner even for its primary intended range of m_cost.
So it isn't a likely winner, at least in its present form, and thus is
of relatively little interest.

Argon or Argon2 - the PHC panel has not decided yet whether to let
Argon2 into the competition or not.  So it is not clear which one of
these we'd want implemented and benchmarked yet.  Once things clear up,
maybe the right Argon should be moved up the list.

I suggest that you start new john-dev threads for each PHC finalist when
you approach starting work on it, or even separately for each finalist's
C and OpenCL implementations.  This current thread is covering too many
related yet distinct topics now.

Thanks,

Alexander