Date: Mon, 25 Aug 2014 23:43:21 -0400
From: Rich Felker <>
Subject: Multi-threaded performance progress

This release cycle looks like it's going to be huge for multi-threaded
performance issues. So far the cumulative improvement on my main
development system, as measured by the cond_bench.c by Timo Teräs, is
from ~250k signals in 2 seconds up to ~3.7M signals in 2 seconds.
That's comparable to what glibc gets on similar hardware with a cond
var implementation that's much less correct. The improvements are a
result of adding private futex support, redesigning the cond var
implementation, and improvements to the spin-before-futex-wait behavior.

Semaphore performance has also improved, up from fewer than 500k
wait/post operations to ~12M, mostly due to spin-before-futex-wait.
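
For anyone unfamiliar with the spin-before-futex-wait idea mentioned
above, here is a minimal sketch. It is not musl's actual code. It
assumes Linux and C11 atomics, and the names futex_op, spin_then_wait
and wake_one, as well as the choice of spin count, are made up for
illustration. The waiter polls the futex word a bounded number of
times in userspace and only makes the (private) futex wait syscall if
the value still has not changed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper, not a musl internal. FUTEX_PRIVATE_FLAG tells
 * the kernel the futex word is only used by threads of this process. */
static long futex_op(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op | FUTEX_PRIVATE_FLAG, val,
                   NULL, NULL, 0);
}

/* Block until *addr differs from val. Spin up to `spins` times first,
 * then fall back to FUTEX_WAIT. Returns how many futex wait syscalls
 * were made, so the effect of spinning can be observed. */
static int spin_then_wait(atomic_int *addr, int val, int spins)
{
    int sleeps = 0;

    assert(spins >= 0);
    while (spins-- > 0) {
        if (atomic_load(addr) != val)
            return sleeps;          /* changed while spinning: no syscall */
    }
    while (atomic_load(addr) == val) {
        /* The kernel rechecks *addr == val before sleeping, so a
         * change between the load above and the syscall is not lost. */
        futex_op(addr, FUTEX_WAIT, val);
        sleeps++;
    }
    return sleeps;
}

/* Wake at most one thread sleeping in spin_then_wait on addr. */
static void wake_one(atomic_int *addr)
{
    futex_op(addr, FUTEX_WAKE, 1);
}
```

A waiter would call something like spin_then_wait(&word, 0, 100), and
the other side would store a nonzero value to word and then call
wake_one(&word). When the change arrives during the spin phase, the
waiter never enters the kernel, which is where the savings in a
micro-benchmark would come from.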

The above results are all based on micro-benchmarks which are
potentially meaningless to real-world applications, so I'd be
interested in seeing any higher-level or real-application-based
comparisons of the old and new code.

There is one remaining performance issue I still want to look into
fixing, possibly during this release cycle: when a thread repeatedly
takes and releases a lock on which other threads are waiting, it makes
a futex wake syscall on each unlock, despite only the first one being
necessary. I have a design for avoiding this on internal locks, but
it's less obvious how to do it for mutexes where storage is tight and
self-synchronized destruction is possible.
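
To make the redundant-wake problem concrete, here is a small sketch of
a naive futex lock. It is not musl's lock and not the design I have in
mind for fixing it. It assumes Linux and C11 atomics; struct
sketch_lock, its fields, and the wake_calls counter are hypothetical
and exist only for illustration. The release path wakes whenever the
waiter count is nonzero, so a thread that re-takes the lock before the
woken waiter gets to run pays for a wake syscall on every unlock.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

struct sketch_lock {
    atomic_int held;        /* 0 = free, 1 = held */
    atomic_int waiters;     /* threads blocked or about to block */
    atomic_int wake_calls;  /* instrumentation: FUTEX_WAKE syscalls made */
};

static void sketch_acquire(struct sketch_lock *l)
{
    for (;;) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&l->held, &expected, 1))
            return;
        /* Register as a waiter, then sleep only if still held. */
        atomic_fetch_add(&l->waiters, 1);
        syscall(SYS_futex, &l->held, FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
                1, NULL, NULL, 0);
        atomic_fetch_sub(&l->waiters, 1);
    }
}

static void sketch_release(struct sketch_lock *l)
{
    atomic_store(&l->held, 0);
    /* Reading l->waiters after the store above is fine for a lock
     * that is never freed, but not for a mutex that a new owner may
     * destroy as soon as it acquires it. */
    if (atomic_load(&l->waiters) > 0) {
        /* Wake on every release while anyone is registered, even if
         * an earlier wake is already on its way to that waiter. */
        atomic_fetch_add(&l->wake_calls, 1);
        syscall(SYS_futex, &l->held, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
                1, NULL, NULL, 0);
    }
}
```

With one waiter registered and not yet scheduled, three consecutive
sketch_acquire/sketch_release pairs from the same thread leave
wake_calls at 3, although only the first wake can do anything useful.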

We're near the end of my planned time frame for this release cycle,
but I'm still interested in working with Jens to get C11 threads into
this release if possible, so I'll probably extend it for a while.

