Date: Wed, 12 Mar 2014 23:13:27 +0400
From: Solar Designer <>
Subject: Re: using scrypt for user authentication

On Tue, Dec 31, 2013 at 08:56:57AM +0400, Solar Designer wrote:
> YACoin (or YAC for short), a new alt-coin, is using revised scrypt (with
> Keccak and ChaCha20 instead of SHA-256 and Salsa20) with N increasing
> over time (it's tied to real time).  Currently YAC is at N=2^15
> (Nfactor=14), which means 4 MiB, and here's how it will be increasing:
> At the current 4 MiB, YAC is about as fast to mine on CPU as on GPU:

Earlier today, I noticed that a surprisingly fast GPU miner was added to
that wiki page, doing 5.04 kh/s on an overclocked GTX 780.  This is
several times faster than what CPUs do, and twice as fast as the best
speeds reported for AMD GPUs.  This is also surprising in that AMD GPUs
generally outperform NVIDIA's at symmetric crypto.

So I went to verify, since luckily we have a GTX TITAN.  I did a git
checkout of CudaMiner, built it, figured out how to use it, and after a
little bit of tuning got an even higher speed: I added it to the wiki as
6.83 kh/s, although it actually fluctuates a bit over time (the highest
I saw was 6.90 kh/s), maybe because of boost and/or pool delays.  And
yes, the shares are being accepted by the pool just fine, so this is for
real.

This is more than twice as fast as the previous best YAC mining speed I
had for CPUs (also added to that wiki page) - ~3 kh/s on 2x E5-2670 (a
total of 16 cores, 32 hardware threads).  Thus, there's a 4x speedup
per-chip, when comparing the TITAN vs. one E5-2670.  (And the TITAN is
actually cheaper than one E5-2670.)
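
The per-chip comparison above is simple arithmetic and can be checked in
a few lines.  This is only a sketch: the function name is made up for
illustration, and the CPU figure is the approximate ~3 kh/s quoted
above, so the result is a ballpark rather than a measurement.

```python
def per_chip_speedup(gpu_khs, cpu_total_khs, cpu_chips):
    """Ratio of one GPU's hash rate to one CPU chip's share of the total."""
    return gpu_khs / (cpu_total_khs / cpu_chips)

# 6.83 kh/s on one TITAN vs. ~3 kh/s total on 2x E5-2670 (figures from this message)
ratio = per_chip_speedup(6.83, 3.0, 2)
print(f"per-chip speedup: {ratio:.2f}x")
```

With these inputs the ratio comes out somewhat above 4, consistent with
the rounded "4x" figure.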

It's also important to note that only 2.5 GiB of the TITAN's 6 GiB of
memory is in use.  (I could make the miner use more memory with other
settings, but this is what appeared to provide optimal speed now, not
needing to use more memory yet.)  Thus, I think the GPU/CPU speed ratio
would remain about the same when YAC moves to a higher Nfactor,
corresponding to 8 MiB.

And the TITAN is likely to remain faster than its contemporary CPUs even
at 16 MiB, and possibly at some larger sizes as well.  The current 4x
performance gap is just too large to be fully closed by the move from
8 MiB to 16 MiB.  (Perhaps only a fraction of it would be closed.)
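
For reference, the 4, 8 and 16 MiB sizes discussed here follow from the
usual scrypt memory formula of 128 * r * N bytes for the V array.  The
sketch below assumes that formula also applies to YAC's revised scrypt,
and assumes N = 2^(Nfactor+1), which is inferred from the "N=2^15
(Nfactor=14)" pairing quoted above.  The function names are
illustrative only.

```python
def scrypt_n(nfactor):
    """N for a given Nfactor, assuming N = 2^(Nfactor+1)."""
    return 1 << (nfactor + 1)

def scrypt_v_bytes(n, r=1):
    """Size of scrypt's V array in bytes: N blocks of 128*r bytes each."""
    return 128 * r * n

for nfactor in (14, 15, 16):
    n = scrypt_n(nfactor)
    mib = scrypt_v_bytes(n) // (1 << 20)
    print(f"Nfactor={nfactor}  N=2^{nfactor + 1}  {mib} MiB")
```

Each Nfactor step doubles N and therefore doubles the memory needed per
hash computation when no time-memory tradeoff is used.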

BTW, this is with "lookup gap" (TMTO factor) of 6.  (I tried 5 and 7,
and either of these is slightly slower.)  Previously, scrypt-based
cryptocurrency miners typically found the optimal "lookup gap" to be 2.
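
To illustrate what the "lookup gap" means, here is a toy sketch of a
ROMix-like loop that stores only every gap-th element of V and
recomputes the missing ones on demand.  It is not CudaMiner's code and
not real scrypt: SHA-256 stands in for BlockMix, the block size is 32
bytes, and integerify reads the first 8 bytes of the block.  All names
are hypothetical.

```python
import hashlib

def H(block):
    # Stand-in for scrypt's BlockMix (NOT Salsa20/8 or ChaCha20).
    return hashlib.sha256(block).digest()

def integerify(block, n):
    # Toy index derivation: first 8 bytes, little-endian, modulo n.
    return int.from_bytes(block[:8], "little") % n

def stored_elements(n, gap):
    # Number of V elements kept in memory: ceil(n / gap).
    return -(-n // gap)

def romix(x, n, gap=1):
    # Phase 1: walk V[0..n-1], keeping only every gap-th element.
    stored = []
    for i in range(n):
        if i % gap == 0:
            stored.append(x)
        x = H(x)
    # Phase 2: data-dependent lookups; rebuild V[j] from the nearest
    # stored element with up to gap-1 extra H computations.
    for _ in range(n):
        j = integerify(x, n)
        v = stored[j // gap]
        for _ in range(j % gap):
            v = H(v)
        x = H(bytes(a ^ b for a, b in zip(x, v)))
    return x

seed = hashlib.sha256(b"example input").digest()
assert romix(seed, 64, gap=6) == romix(seed, 64, gap=1)
print(stored_elements(64, 1), "vs", stored_elements(64, 6), "stored elements")
```

The result is the same for any gap; a larger gap trades memory (about
N/gap stored elements) for extra computation on each lookup, which is
why the optimal value depends on the hardware.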

> That's with r=1 and p=1.  With higher r, scrypt might be more
> GPU-friendly (larger sequential accesses to off-chip memory).

