Date: Fri, 31 Oct 2014 06:33:00 +0100
From: magnum <>
Subject: Re: descrypt speed

On 2014-10-31 06:02, Royce Williams wrote:
> On Thu, Oct 30, 2014 at 6:31 PM, magnum <> wrote:
>> On 2014-10-30 16:49, Royce Williams wrote:
>>>> Using -fork=4 on a quadcore+HT and GTX980 I got over 82 Mc/s.
>>> On my 8-core AMD and GTX970, using fork=2 gets me 52 Mc/s, which is
>>> much better than no fork (~35 Mc/s).  fork=3 settles in around 54
>>> Mc/s.  Forking more than 3 doesn't materially increase the c/s rate.
>> Solar, Sayantan, all,
>> Why is this? This is bordering candidate generation bottleneck but that's
>> not quite the problem, is it? So what is the bottleneck? Could we do
>> something to make it faster without forking or *is* it just candidate
>> generation?
> We may need to determine if it's happening to others as well.
> Something odd is happening that may be on my side.

Your numbers are not particularly bad afaik. I'm concerned about "fork" 
being beneficial at all, for anyone. For a semi-slow format like this I 
think a 50% speedup from using fork means we should be able to improve 
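
For reference, the "50%" figure can be checked from the numbers quoted above. This is only a quick sketch: the rates are the approximate Mc/s values reported in this thread (the no-fork one was given as "~35"), and the helper name speedup_pct is made up for illustration.

```python
# Approximate rates in Mc/s as reported in this thread (8-core AMD + GTX970).
# Key 1 stands for "no fork"; these are rough figures, not fresh measurements.
rates = {1: 35.0, 2: 52.0, 3: 54.0}

def speedup_pct(base, rate):
    """Percentage gain of `rate` over `base` (hypothetical helper)."""
    return (rate / base - 1.0) * 100.0

for procs, rate in sorted(rates.items()):
    gain = speedup_pct(rates[1], rate)
    print(f"fork={procs}: {rate:.0f} Mc/s, {gain:+.0f}% vs no fork")
```

With those inputs, fork=2 comes out at roughly +49% and fork=3 at roughly +54%, so "50%" is a fair summary.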

> Going back through my config/make cycle, I didn't notice this at first:
> ptxas info    : Compiling entry function
> '_Z13kernel_phpassPhP12phpass_crack' for 'sm_20'
> In fact, all of the appearances of sm_[0-9+] in my ./configure and
> make results appear to be using sm_20.  Strings on the john binary
> only shows sm_20 in use.
> On a GTX970, shouldn't this be sm_52?

You can force this by editing NVCC_FLAGS in the Makefile. Add something like 
"-arch sm_50" (or sm_52). But I doubt it will make much difference, and it 
will only affect CUDA formats.
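
If you want to confirm which architecture a build actually targeted, a small script along these lines could scan the make output (or the output of strings on the binary) for sm_NN tokens. It is a rough sketch, not part of john: the function name sm_targets is hypothetical, and the sample line is the ptxas message quoted above joined onto one line.

```python
import re

# Matches tokens such as sm_20 or sm_52 and captures the number.
SM_RE = re.compile(r"\bsm_(\d+)\b")

def sm_targets(text):
    """Return the sorted, de-duplicated sm_NN numbers found in text."""
    return sorted({int(n) for n in SM_RE.findall(text)})

# Sample input: the ptxas line from the quoted build log.
sample = ("ptxas info    : Compiling entry function "
          "'_Z13kernel_phpassPhP12phpass_crack' for 'sm_20'")
print(sm_targets(sample))
```

Feeding it the whole build log before and after changing NVCC_FLAGS should show whether the change took effect.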

