john-users - Re: Performance John in the cloud

Message-ID: <CACxgy5z5NHXWagLNU9BZSXCBfdmZKA9XRQm+1LwJj0oHjY82-Q@mail.gmail.com>
Date: Sat, 15 Aug 2020 23:21:06 -0400
From: Powen Cheng <madtomic@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Performance John in the cloud

All the benchmarks are excellent references. Thank you, Alexander, for taking
the time to do these benchmarks and the cost breakdown.

I did specifically ask about ethereum-opencl as discussed today on GitHub
issue #3222 but that format doesn't currently support the scrypt KDF.

As for cost/performance, I think I will have to wait until the
hardware/software catches up in the near future, so that I can use a GPU
with scrypt KDF support to make this worthwhile.

At the moment, the CPU route is a bit expensive and too slow, in my
opinion.

As for the test, I was wondering how john was able to perform the benchmark
with:
$ john -test -form=ethereum-opencl

I only need to attack a wallet with 262144 iterations, so 11k+ on an NVIDIA
Tesla V100 in p3.2xlarge does sound better.

And I'd love to get my hands on an AMD EPYC with 64 cores.

Powen

On Sat, Aug 15, 2020 at 10:55 PM Solar Designer <solar@...nwall.com> wrote:

> On Sat, Aug 15, 2020 at 11:06:13PM +0200, Solar Designer wrote:
> > on a c5a.24xlarge instance (96 vCPUs, AMD EPYC 7R32)
>
> BTW, here are some other benchmarks on that CPU, 96 threads:
>
> Benchmarking: descrypt, traditional crypt(3) [DES 256/256 AVX2]...
> (96xOMP) DONE
> Many salts:     407961K c/s real, 4254K c/s virtual
> Only one salt:  62797K c/s real, 654782 c/s virtual
>
> Benchmarking: md5crypt, crypt(3) $1$ (and variants) [MD5 256/256 AVX2
> 8x3]... (96xOMP) DONE
> Many salts:     4608K c/s real, 48002 c/s virtual
> Only one salt:  3801K c/s real, 39523 c/s virtual
>
> Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]...
> (96xOMP) DONE
> Speed for cost 1 (iteration count) of 32
> Raw:    86832 c/s real, 902 c/s virtual
>
> Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 256/256 AVX2
> 4x]... (96xOMP) DONE
> Speed for cost 1 (iteration count) of 5000
> Raw:    64060 c/s real, 669 c/s virtual
>
> 48 threads works slightly better for descrypt:
>
> $ OMP_NUM_THREADS=48 john -test -form=descrypt
> Will run 48 OpenMP threads
> Benchmarking: descrypt, traditional crypt(3) [DES 256/256 AVX2]...
> (48xOMP) DONE
> Many salts:     418480K c/s real, 8718K c/s virtual
> Only one salt:  79034K c/s real, 1651K c/s virtual
>
> Not bad for one CPU chip.  Just a few years ago these speeds at descrypt
> and md5crypt and sha512crypt were only achieved on GPU.  Of course,
> modern high-end GPUs are a few times faster at these three hash types...
> but not at bcrypt.
>
> That speed at bcrypt is the highest I see so far for any one chip - we
> reach higher speeds on ZTEX boards, but those have four FPGA chips each,
> and NVIDIA Tesla V100 GPU doesn't reach the above speed (but gets very
> close).  I guess an AMD EPYC with 128 threads (64 cores) will show even
> better speed; I just haven't had access to one yet.
>
> Of course, this isn't as energy-efficient as the FPGAs are, but it is a
> higher speed per chip.  We'll need to support larger FPGAs to beat that.
>
> > c5a.24xlarge is currently priced at $1.56+/hour spot, $3.696 on-demand.
> > Our Bundle (beyond the free trial) costs $0.64/hour on this instance.
>
> Alexander
>
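
The speeds and prices quoted above can be combined into a rough cost-per-candidate figure. The following is only a sketch, under these assumptions: "K" in john's output means 1000, the Bundle fee is charged on top of the instance price, the benchmark speed is sustained for the whole hour, and the function and variable names are made up for illustration.

```python
# Rough cost estimate from the figures quoted in this thread
# (c5a.24xlarge, 96 threads). Names here are illustrative only.

# Benchmark speeds in candidates per second ("c/s real").
# Assumption: "K" in john's output means 1000.
SPEEDS_CPS = {
    "descrypt (many salts)": 407_961_000,
    "md5crypt (many salts)": 4_608_000,
    "bcrypt ($2a$05)": 86_832,
    "sha512crypt (rounds=5000)": 64_060,
}

# Hourly prices in USD as quoted; spot prices vary over time.
PRICES_USD_PER_HOUR = {"spot": 1.56, "on-demand": 3.696}

# Assumption: the Bundle fee is added on top of the instance price.
BUNDLE_USD_PER_HOUR = 0.64


def usd_per_billion(cps, usd_per_hour):
    """Cost in USD to test one billion candidates at a steady rate."""
    candidates_per_hour = cps * 3600
    return usd_per_hour / candidates_per_hour * 1e9


if __name__ == "__main__":
    for name, cps in SPEEDS_CPS.items():
        for tier, price in PRICES_USD_PER_HOUR.items():
            total = price + BUNDLE_USD_PER_HOUR
            cost = usd_per_billion(cps, total)
            print(f"{name:28s} {tier:10s} ${cost:.4f} per 1e9 candidates")
```

Real attack speeds usually differ from benchmark speeds (salt count, candidate generation, target cost settings), so treat the output as an order-of-magnitude guide rather than a quote.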