Date: Sat, 11 Apr 2015 03:11:43 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] John the Ripper support for PHC finalists

On Sat, Apr 11, 2015 at 01:54:47AM +0200, Agnieszka Bielec wrote:
> 2015-04-11 1:34 GMT+02:00 Solar Designer <solar@...nwall.com>:
> >> I've added SSE2 and isn't faster (bleeding-jumbo)
> >
> > This is unexpected.  Are you sure the SSE2 (actually AVX, when building
> > with AVX enabled) code is getting compiled in?  And the non-SSE2 code
> > isn't getting compiled in?
>
> I put printf() into POMELO_SSE2()

I suspect it might be something like you running all 32 threads on super
when there's some other load on the machine (I've just seen another user
run some lengthy job), or/and forgetting to set GOMP_CPU_AFFINITY=0-31.

Or maybe super simply bumps into its memory bandwidth with POMELO even
when running the non-SSE2 version, at 32 threads.

I suggest that you benchmark with lower OMP_NUM_THREADS, starting with 1,
and increase the thread count slowly.  Maybe you'd see a speed difference
at low thread counts, but not so much at higher thread counts.

Can you please post shell commands & output of how you benchmark the old
non-SSE2 vs. the new SSE2 code?

Thanks,

Alexander
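The thread-count sweep suggested above could be scripted roughly as follows. This is only a sketch, not the procedure actually used on super: the `./john` path, the `pomelo` format label, the helper names, and the doubling step are all assumptions for illustration. Only OMP_NUM_THREADS and GOMP_CPU_AFFINITY=0-31 come from the message itself.

```python
import os
import subprocess


def thread_counts(max_threads):
    """Return 1, 2, 4, ... below max_threads, then max_threads itself."""
    counts = []
    n = 1
    while n < max_threads:
        counts.append(n)
        n *= 2
    counts.append(max_threads)
    return counts


def bench_env(n, affinity="0-31", base=None):
    """Copy the environment and pin the OpenMP thread count and CPU affinity."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(n)
    env["GOMP_CPU_AFFINITY"] = affinity
    return env


def run_sweep(john="./john", fmt="pomelo", max_threads=32):
    """Run the benchmark once per thread count (path and format are assumed)."""
    for n in thread_counts(max_threads):
        print(f"### OMP_NUM_THREADS={n}", flush=True)
        subprocess.run([john, "--test", f"--format={fmt}"],
                       env=bench_env(n), check=True)
```

Calling `run_sweep()` once in the non-SSE2 build directory and once in the SSE2 build directory, with the machine otherwise idle, would give two sets of numbers that can be compared at each thread count.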