Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 28 Feb 2015 02:21:22 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
Subject: semaphore redesign

Hello,

As new cancellation has landed (except that __timedwait fails to propagate
ECANCELED in addition to ETIMEDOUT and EINTR), I'm happy to post semaphore
redesign.  We discussed this implementation with Rich on IRC once, and
presently I'm not aware of any issues (finally!), but still testing and
performance comparison need to be done.

This implementation aims to solve the problem with sem_getvalue that started
this thread, while also making fast (uncontended) paths leaner.  There were
some explanations scattered in the old thread; I'll try to provide a summary.

Conceptually, a semaphore has a non-negative value and when the value is zero,
a non-negative waiter count.  The new implementation stores the value minus
number of waiters in val[0], and in val[1] it stores the "wake count": the
number of waiters that were discounted from val[0] in sem_post; val[1] is
futex-woken in sem_post, and futex-waited on in sem_wait.

A thread that entered sem_wait and became a waiter by decrementing
non-positive val[0] may leave either by consuming a post (decrementing a
positive val[1]), or, if error condition such as timeout or cancellation has
been propagated from futex wait, by discounting itself as a waiter by
incrementing a negative val[0].  Conversely, if after encountering an error a
waiter observes non-negative val[0], it means that another thread already
discounted it from waiters when doing a sem_post, so the waiter proceeds to
consume the post and suppress the error.

Since any leaving waiter must either decrement positive val[1] or increment
negative val[0], accessing val[1] non-atomically with val[0] in sem_post does
not lead to potential use of destroyed semaphore.

At Rich's request, I'm posting this as plain C source code rather than a diff.
The implementation of sem_getvalue stays the same.

Alexander

#include <semaphore.h>
#include "pthread_impl.h"

int sem_post(sem_t *sem)
{
	int val;
	do {
		val = sem->__val[0];
		if (val == SEM_VALUE_MAX) {
			errno = EOVERFLOW;
			return -1;
		}
	} while (val != a_cas(sem->__val, val, val+1));
	if (val < 0) {
		int priv = sem->__val[2];
		a_inc(sem->__val+1);
		__wake(sem->__val+1, 1, priv);
	}
	return 0;
}

int sem_trywait(sem_t *sem)
{
	int val;
	do {
		val = sem->__val[0];
		if (val <= 0) {
			errno = EAGAIN;
			return -1;
		}
	} while (val != a_cas(sem->__val, val, val-1));
	return 0;
}

static void dummy(void *arg)
{
}

int sem_timedwait(sem_t *restrict sem, const struct timespec *restrict at)
{
	pthread_testcancel();

	// Do not spin if already contended (val0 is negative)
	for (int spins = 100; spins && sem->__val[0] == 0; spins--)
		a_spin();

	if (a_fetch_add(sem->__val, -1) > 0)
		return 0;

	int priv = sem->__val[2], e = 0, ee, cs;
	pthread_setcancelstate(PTHREAD_CANCEL_MASKED, &cs);

	do {
		ee = __timedwait(sem->__val+1, 0, CLOCK_REALTIME, at, dummy, 0, priv);
		int wak = sem->__val[1];
		if (wak > 0 && wak == a_cas(sem->__val+1, wak, wak-1))
			goto done;
	} while (!ee || ee == EINTR);

	// Upon returning from wait with an error, either discount ourselves as a waiter
	// by incrementing negative val0, and propagate error, or consume a racing post
	// if val0 is non-negative, and suppress error.
	for (;;) {  
		int val = sem->__val[0];
		if (val < 0 && val == a_cas(sem->__val, val, val+1))
			break;
		int wak = sem->__val[1];
		if (wak > 0 && wak == a_cas(sem->__val+1, wak, wak-1))
			goto done;
		a_spin();
	}
	e = ee;
done:
	pthread_setcancelstate(cs, 0);

	if (!e) return 0;

	if (e == ECANCELED) {
		pthread_testcancel();
		pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0);
	}

	errno = e;
	return -1;
}

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.