Date: Mon, 11 May 2015 23:16:02 +0200
From: Frank Dittrich <frank.dittrich@...lbox.org>
To: john-dev@...ts.openwall.com
Subject: Re: Adding OpenMP support to SunMD5

On 05/11/2015 11:06 PM, magnum wrote:
> On 2015-05-11 19:58, Frank Dittrich wrote:
>> -#define OMP_SCALE 1
>> +#define OMP_SCALE 8
[...]
>> $ ../run/john --test=10 --format=sunmd5
>> Will run 32 OpenMP threads
>> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
>> Speed for cost 1 (iteration count) of 5000
>> Raw:    9990 c/s real, 312 c/s virtual
> 
> On my core i7 laptop, OMP_SCALE 4 is best, HT or not. Bumping to 8
> slightly degrades HT but does not change non-HT at all. This is with 4:
> 
> $ OMP_NUM_THREADS=4 ../run/john -test -form:sunmd5 && ../run/john -test
> -form:sunmd5
> Will run 4 OpenMP threads
> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (4xOMP) DONE
> Speed for cost 1 (iteration count) of 5000
> Raw:    2497 c/s real, 629 c/s virtual
> 
> Will run 8 OpenMP threads
> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (8xOMP) DONE
> Speed for cost 1 (iteration count) of 5000
> Raw:    2671 c/s real, 345 c/s virtual

I hadn't tried 4 on super.
I just took a wild guess (8), then tried 16, then saw that Solar was
active on super and didn't want to interfere.
So OMP_SCALE < 8 might well be better on super.

These are tests on super, all with OMP_SCALE 4:


$ ../run/john --test=10 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8130 c/s real, 291 c/s virtual

[frank@...er src]$ ../run/john --test=2 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8474 c/s real, 291 c/s virtual

[frank@...er src]$ ../run/john --test=2 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	9909 c/s real, 310 c/s virtual

[frank@...er src]$ ../run/john --test=2 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8246 c/s real, 291 c/s virtual


These are with OMP_SCALE 8:

[frank@...er src]$ ../run/john --test=2 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8302 c/s real, 310 c/s virtual

[frank@...er src]$ ../run/john --test=2 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	8005 c/s real, 331 c/s virtual

[frank@...er src]$ ../run/john --test=10 --format=sunmd5
Will run 32 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	9637 c/s real, 311 c/s virtual


The run-to-run variation (roughly 8000 to 9900 c/s real with either
setting) suggests there was some other load on the machine.
Given that noise, 4 is probably as good as 8.
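
To make that comparison a bit more concrete, here is a rough sketch that
averages the "c/s real" figures from the runs in this message for each
OMP_SCALE value and compares the gap between the means with the spread
inside each group. The helper name `summarize` and the choice of
max-minus-min as the spread are just illustrative assumptions, and with
only three or four samples per setting (taken under unknown other load)
this is a sanity check, not a proper benchmark.

```python
from statistics import mean

# "c/s real" figures copied from the runs in this message, keyed by the
# OMP_SCALE value they were built with. Runs with --test=2 and --test=10
# are mixed together here, which is itself a simplification.
runs = {
    4: [8130, 8474, 9909, 8246],
    8: [8302, 8005, 9637],
}


def summarize(samples):
    """Return (mean, spread), where spread is max minus min."""
    return mean(samples), max(samples) - min(samples)


summary = {scale: summarize(samples) for scale, samples in runs.items()}

for scale, (avg, spread) in summary.items():
    print(f"OMP_SCALE {scale}: mean {avg:.0f} c/s, spread {spread} c/s")

# Gap between the two means, to compare against the spreads printed above.
gap = abs(summary[4][0] - summary[8][0])
print(f"difference of means: {gap:.0f} c/s")
```

The means differ by a few dozen c/s, while each group varies by well
over 1000 c/s, so these numbers cannot separate 4 from 8.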
