Date: Fri, 31 Oct 2014 06:33:00 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: descrypt speed

On 2014-10-31 06:02, Royce Williams wrote:
> On Thu, Oct 30, 2014 at 6:31 PM, magnum <john.magnum@...hmail.com> wrote:
>> On 2014-10-30 16:49, Royce Williams wrote:
>>>>
>>>> Using -fork=4 on a quadcore+HT and GTX980 I got over 82 Mc/s.
>>>
>>>
>>> On my 8-core AMD and GTX970, using fork=2 gets me 52 Mc/s, which is
>>> much better than no fork (~35 Mc/s). fork=3 settles in around 54
>>> Mc/s. Forking more than 3 doesn't materially increase the c/s rate.
>>
>> Solar, Sayantan, all,
>>
>> Why is this? This is bordering candidate generation bottleneck but that's
>> not quite the problem, is it? So what is the bottleneck? Could we do
>> something to make it faster without forking or *is* it just candidate
>> generation?
>
> We may need to determine if it's happening to others as well.
> Something odd is happening that may be on my side.

Your numbers are not particularly bad afaik. I'm concerned about "fork"
being beneficial at all, for anyone. For a semi-slow format like this I
think a 50% speedup from using fork means we should be able to improve
something.

> Going back through my config/make cycle, I didn't notice this at first:
>
> ptxas info : Compiling entry function
> '_Z13kernel_phpassPhP12phpass_crack' for 'sm_20'
>
> In fact, all of the appearances of sm_[0-9+] in my ./configure and
> make results appear to be using sm_20. Strings on the john binary
> only shows sm_20 in use.
>
> On a GTX970, shouldn't this be sm_52?

You can force this by editing NVCC_FLAGS in Makefile. Add something like
"-arch sm_50" (or 52). But I doubt it will make much difference and it
will only affect CUDA formats.

magnum