Date: Sat, 9 May 2015 22:39:34 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Adding OpenMP support to SunMD5


> On May 9, 2015, at 7:31 PM, Solar Designer <solar@...nwall.com> wrote:
>> 
>> [lei@...er src]$ ../run/john --test --format=sunmd5
>> Will run 32 OpenMP threads
>> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
>> Speed for cost 1 (iteration count) of 5000
>> Raw:	4954 c/s real, 162 c/s virtual
>> 
>> Hints?
> 
> Please try the hints from super's /etc/motd, in particular "export
> GOMP_CPU_AFFINITY=0-31"  Does it help?

Yes, it helps!

[lei@...er src]$ GOMP_CPU_AFFINITY=0-31 ../run/john --test --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	7372 c/s real, 231 c/s virtual

I also tested it with my previous version, where threadprivate is used.

[lei@...er src]$ GOMP_CPU_AFFINITY=0-31 ../run/john --test --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8302 c/s real, 259 c/s virtual

The newer version is still slower.


> Other than that, it is possible that you ran into false sharing.  Having
> a gap between the different threads' data structures can be beneficial.

I'm not sure about that. I just multiply the size of the original arrays by the number of threads. There should be no cross-referencing among threads.

Dynamically allocated arrays can be slower to access than static arrays. Could that be the reason for the performance degradation?
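
To make the "gap" idea concrete, here is a minimal sketch of per-thread data padded out to a cache line. Multiplying an array's size by the thread count does not by itself rule out false sharing: if a per-thread slice is not a multiple of the cache line size, or the base address is not aligned, neighbouring threads' data can still land on the same line. The 64-byte line size, the thread_ctx layout, and the helper names are all assumptions for illustration, not taken from the SunMD5 code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64  /* assumption: typical x86 cache line size */

/* Hypothetical per-thread state; the fields are illustrative only.
 * Aligning the first member to CACHE_LINE makes the whole struct
 * cache-line aligned and pads sizeof(thread_ctx) to a multiple of it. */
typedef struct {
    _Alignas(CACHE_LINE) unsigned char digest[16];
    uint32_t rounds;
} thread_ctx;

/* Allocate one zeroed, cache-line-aligned context per thread (C11). */
static thread_ctx *alloc_ctx(size_t nthreads)
{
    thread_ctx *ctx = aligned_alloc(CACHE_LINE, nthreads * sizeof(*ctx));
    if (ctx)
        memset(ctx, 0, nthreads * sizeof(*ctx));
    return ctx;
}

/* Nonzero if the two addresses fall within the same cache line. */
static int shares_line(const void *a, const void *b)
{
    return (uintptr_t)a / CACHE_LINE == (uintptr_t)b / CACHE_LINE;
}

/* Each iteration t writes only to ctx[t], so no two threads write to
 * the same line. Without OpenMP enabled, the pragma is ignored and the
 * loop simply runs serially. */
static void bump_all(thread_ctx *ctx, int nthreads, uint32_t iters)
{
    #pragma omp parallel for
    for (int t = 0; t < nthreads; t++)
        for (uint32_t i = 0; i < iters; i++)
            ctx[t].rounds++;
}
```

Whether this helps in this particular case would have to be measured; it only removes one possible cause.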


Lei
