Date: Tue, 28 May 2013 20:24:07 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OMP for raw formats

On 28 May, 2013, at 4:06 , Solar Designer <solar@...nwall.com> wrote:
> On Mon, May 27, 2013 at 05:08:54PM -0400, jfoug@....net wrote:
>> So if we have 8 threads, and 1024 items, it will start 8 threads, and give each thread 129 items to work on (last thread will only get 121 items).  Thus, there is no thread switching.  That should help.
> 
> No, it should not.  A sane OpenMP implementation already does it (and
> each thread gets exactly 128 items).  This corresponds to
> schedule(static) or maybe schedule(static, 128).  Other alternatives,
> which may be better for things such as SunMD5, are schedule(dynamic) and
> schedule(guided).  You may experiment with these.  The default may vary
> between OpenMP implementations, and it may be overridden via the
> OMP_SCHEDULE environment variable.

I tried schedule(static) as well as schedule(static, OMP_SCALE) but both actually gave worse results.
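
For reference, the even split that schedule(static) is expected to produce can be worked out by hand. The following is only a sketch of one possible partitioning, not what any particular OpenMP runtime is guaranteed to do (the specification only asks for chunks of approximately equal size). The names chunk_range and static_chunk are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [start, end) of loop iterations owned by one thread. */
struct chunk_range {
    size_t start;
    size_t end;
};

/*
 * One possible even split of `count` iterations over `nthreads` threads,
 * modeled on schedule(static) with no chunk size: every thread gets
 * count / nthreads iterations, and the first count % nthreads threads
 * get one extra. Real runtimes may distribute the remainder differently.
 */
static struct chunk_range static_chunk(size_t count, size_t nthreads, size_t tid)
{
    struct chunk_range r;
    size_t base, extra;

    assert(nthreads > 0 && tid < nthreads);
    base = count / nthreads;
    extra = count % nthreads;

    r.start = tid * base + (tid < extra ? tid : extra);
    r.end = r.start + base + (tid < extra ? 1 : 0);
    return r;
}
```

With 1024 items and 8 threads this gives every thread exactly 128 items (thread 7 gets the range 896 to 1024), which matches the figure quoted above rather than the 129/121 split.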

I then tried tuning OMP_SCALE on that very AMD six-core instead (using no schedule()) and saw a sweet spot at 96. But that still did not give any gain for more cores; it just mitigated the regression when using one core with an OMP build. Also, using 96 would likely produce very poor results on newer AMD or Intel CPUs (which use 1024 or 2048).

I fooled around with some other changes but saw nothing that made it better. Should we revert to some kind of cpuid function? Maybe a shared one in misc.c that all formats can use, either to change the crypt_all() function pointer or to call omp_set_num_threads(1).
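
A very rough sketch of what such a shared helper might look like follows. Everything here is hypothetical: the crypt_all_fn signature, the raw_format struct, select_crypt_all() and the two stub functions are placeholders for this illustration and do not reflect the real format interface. The CPU check itself (cpuid or otherwise) is left out and passed in as a plain flag. Only omp_set_num_threads() is a real OpenMP call.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical signature; the real crypt_all() prototype may differ. */
typedef void (*crypt_all_fn)(int count);

/* Hypothetical stand-in for a format's method table. */
struct raw_format {
    crypt_all_fn crypt_all;
};

/* Placeholder implementations; real ones would do the hashing. */
static void crypt_all_serial(int count) { (void)count; }
static void crypt_all_omp(int count) { (void)count; }

/*
 * Hypothetical shared helper. `omp_helps` would be the result of a CPU
 * check done elsewhere. When OpenMP is not expected to help, point the
 * format at the serial function and also cap OpenMP at one thread
 * (either step alone might be enough; both are shown).
 */
static void select_crypt_all(struct raw_format *fmt, int omp_helps,
                             crypt_all_fn serial, crypt_all_fn parallel)
{
    assert(fmt != NULL);
    if (omp_helps) {
        fmt->crypt_all = parallel;
    } else {
        fmt->crypt_all = serial;
#ifdef _OPENMP
        omp_set_num_threads(1);
#endif
    }
}
```

A format's init() could then call select_crypt_all(&fmt, flag, crypt_all_serial, crypt_all_omp) once, so the decision is made in one place instead of in every raw format.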

magnum

