Date: Sat, 11 Apr 2015 03:11:43 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] John the Ripper support for PHC finalists

On Sat, Apr 11, 2015 at 01:54:47AM +0200, Agnieszka Bielec wrote:
> 2015-04-11 1:34 GMT+02:00 Solar Designer <solar@...nwall.com>:
> >> I've added SSE2 and isn't faster (bleeding-jumbo)
> >
> > This is unexpected.  Are you sure the SSE2 (actually AVX, when building
> > with AVX enabled) code is getting compiled in?  And the non-SSE2 code
> > isn't getting compiled in?
> 
> I put printf() into POMELO_SSE2()

I suspect it might be something like you running all 32 threads on super
while there's other load on the machine (I've just seen another user
run some lengthy job), and/or forgetting to set GOMP_CPU_AFFINITY=0-31.

Or maybe super simply runs into its memory bandwidth limit with POMELO
at 32 threads, even when running the non-SSE2 version.

I suggest that you benchmark with lower OMP_NUM_THREADS, starting with 1,
and increase the thread count slowly.  Maybe you'd see a speed
difference at low thread counts, but not so much at higher thread counts.
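
A minimal sketch of such a sweep, in case it helps.  Everything here is
an assumption rather than an existing script: the ./john path, the
"pomelo" format name, the doubling schedule for thread counts, and the
helper names thread_counts(), bench_env() and run_sweep().  It keeps
GOMP_CPU_AFFINITY at 0-31 for every run and varies only OMP_NUM_THREADS.

```python
import os
import subprocess


def thread_counts(max_threads):
    """Return 1, 2, 4, ... doubling each step, always ending at max_threads."""
    counts = []
    n = 1
    while n < max_threads:
        counts.append(n)
        n *= 2
    counts.append(max_threads)
    return counts


def bench_env(num_threads, max_threads, base=None):
    """Copy the environment and set the two OpenMP variables for one run."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)
    # Same CPU list for every run (0-31 when max_threads is 32), so only
    # the thread count changes between runs.
    if max_threads > 1:
        env["GOMP_CPU_AFFINITY"] = "0-%d" % (max_threads - 1)
    else:
        env["GOMP_CPU_AFFINITY"] = "0"
    return env


def run_sweep(john="./john", fmt="pomelo", max_threads=32):
    """Run the built-in benchmark once per thread count.

    The binary path and format name are assumptions; adjust them to the
    build being tested.
    """
    for n in thread_counts(max_threads):
        print("=== OMP_NUM_THREADS=%d ===" % n, flush=True)
        subprocess.call([john, "--test", "--format=" + fmt],
                        env=bench_env(n, max_threads))


# run_sweep()  # uncomment to run against a real build
```

Running it once against the non-SSE2 build and once against the SSE2
build, on an otherwise idle machine, would give two sets of numbers to
compare at each thread count.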

Can you please post shell commands & output of how you benchmark the old
non-SSE2 vs. the new SSE2 code?

Thanks,

Alexander
