john-users - Re: GPU cracking Argon2

Message-ID: <20191216213022.GA11203@openwall.com>
Date: Mon, 16 Dec 2019 22:30:22 +0100
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: GPU cracking Argon2

On Mon, Dec 16, 2019 at 08:47:59PM +0000, David Taylor wrote:
> Am I correct in my understanding that john does not have the capability to use GPUs (OpenCL) to crack Argon2?

Yes, that's correct.

> Does anyone know of any cracking software that can?

As far as I'm aware, there's currently no password cracker with Argon2
support on GPUs.

Moreover, as I understand it, you need more than mere support - you need
optimized code that you could reasonably use to estimate attack
performance.

For such estimates, you can take JtR Argon2 speeds on CPU and then scale
them up by the GPU/CPU memory bandwidth ratio.  Optimal implementations
of Argon2 on both kinds of devices are supposed to be limited by memory
bandwidth, as long as a sufficient number of hashes are being computed
concurrently - which is generally the case in offline password cracking.

For example, the latest Intel server CPUs support up to 6 channels of
DDR4-2666 memory per CPU chip.  That's about 128 GB/s of theoretical peak
bandwidth (6 channels x 8 bytes x 2666 MT/s) if you configure the
hardware optimally (install 6 DDR4-2666 DIMMs into the appropriate
sockets).  On older CPUs, the bandwidth will be a lot lower - you'll need
to look it up, if relevant.
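
As a rough check of that figure, theoretical peak DRAM bandwidth is
channels x transfer rate x bus width.  A minimal Python sketch follows; the
function name and its parameters are made up for illustration, and it
assumes the standard 64-bit DDR4 data bus per channel:

```python
def ddr_peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    Hypothetical helper: ignores refresh, bank conflicts and other
    real-world losses, so sustained bandwidth will be lower.
    """
    bytes_per_transfer = bus_width_bits / 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# 6 channels of DDR4-2666 per CPU chip, as in the example above
per_cpu = ddr_peak_bandwidth_gbs(6, 2666)
print(f"{per_cpu:.1f} GB/s per CPU chip")
```

For an older CPU, substitute its own channel count and memory speed.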

NVIDIA Tesla V100, which is what you probably have 8 of in the cloud,
has 900 GB/s of theoretical peak memory bandwidth per GPU.

So a recent and optimally configured dual-CPU server with 12 DIMMs will
have 256 GB/s, whereas an 8x V100 GPU rig will have 7200 GB/s.  Thus,
the GPU rig can be estimated to be about 28 times faster (7200/256),
provided that the implementations on both sides are in fact limited by
memory bandwidth.  If CPUs and their RAM are also used for the attack,
then about 29 times.
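
The arithmetic behind this estimate can be sketched in Python.  This is
only an illustration: the function names are invented, the bandwidth
constants are the theoretical peaks quoted in this message, and it assumes
that the attacker's host has the same dual-CPU configuration as the
reference server and that both sides are memory bandwidth bound.

```python
CPU_BW_GBS = 128.0  # theoretical peak per CPU chip (6x DDR4-2666)
GPU_BW_GBS = 900.0  # theoretical peak per NVIDIA Tesla V100


def bandwidth_ratio(n_gpus, n_cpus, include_cpus=False):
    """Attacker/reference memory bandwidth ratio.

    The reference is a server with n_cpus CPU chips.  The attacker has
    n_gpus GPUs and, if include_cpus is set, an identical set of CPUs.
    """
    cpu_total = n_cpus * CPU_BW_GBS
    gpu_total = n_gpus * GPU_BW_GBS
    attacker = gpu_total + (cpu_total if include_cpus else 0.0)
    return attacker / cpu_total


def estimated_attack_speed(cpu_hashes_per_s, ratio):
    """Scale a measured CPU speed by the bandwidth ratio."""
    return cpu_hashes_per_s * ratio


gpu_only = bandwidth_ratio(8, 2)                      # 7200 / 256
with_cpus = bandwidth_ratio(8, 2, include_cpus=True)  # 7456 / 256
print(f"GPUs only: {gpu_only:.1f}x, GPUs plus CPUs: {with_cpus:.1f}x")

# Placeholder value, not a benchmark: substitute the Argon2 speed that
# JtR reports on the reference dual-CPU server.
measured_cpu_speed = 100.0
print(estimated_attack_speed(measured_cpu_speed, gpu_only))
```

The result is an upper-bound style estimate: it says nothing about whether
an actual GPU implementation would reach peak memory bandwidth.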

Intel Xeon Platinum CPUs allow for up to 8-CPU configurations (and
correspondingly more memory channels), but that's prohibitively
expensive compared to the use of dual-CPU servers.  So I expect that you
wouldn't use anything bigger than dual-CPU servers (in a password
hashing cluster, if scaling is needed).

You can lower the GPU advantage by using yescrypt with a large ROM -
ideally, one exceeding the combined memory of the GPU cards in an attack
rig.  8x V100 may have a total of 128 GB or 256 GB of RAM.  You can
affordably have a password hashing server's RAM larger than that, and
even if it is not, many additional limitations (not seen with Argon2)
will come into play.

Alexander