Date: Tue, 26 Aug 2014 23:15:56 +0200 From: Jens Gustedt <jens.gustedt@...ia.fr> To: Rich Felker <dalias@...c.org> Cc: musl@...ts.openwall.com Subject: Re: Multi-threaded performance progress Am Dienstag, den 26.08.2014, 16:26 -0400 schrieb Rich Felker: > On Tue, Aug 26, 2014 at 09:34:13PM +0200, Jens Gustedt wrote: > > Am Dienstag, den 26.08.2014, 15:05 -0400 schrieb Rich Felker: > > > On Tue, Aug 26, 2014 at 08:30:39PM +0200, Jens Gustedt wrote: > > > > Or do you mean that I should use an atomic store at the other end? > > > > > > Yes. With an atomic store at the other end, I think it could be > > > correct, but I'd need to review it further to be sure. > > > > ok, it shouldn't be difficult to use atomic ops, then. > > Based on what you've said below, though, I think there's still a big > problem.. I am not sure that I follow. > > > > > Note that the performance of the code in which you're trying to avoid > > > > > the lock does not matter in the slightest except when a race happens > > > > > between a thread acting on cancellation or timeout and a signaler > > > > > (since that's the only time it runs). I expect this is extremely rare, > > > > > so unless we determine otherwise, I'd rather not add complexity here. > > > > > > > > If we have a broadcaster working a long list of waiters, this might > > > > still happen sufficiently often. And the "complexity" is hidden in the > > > > execution pattern of the current version, where control and handling > > > > of the list alternates between different threads, potentially as many > > > > times as there are waiters in the list. > > > > > > Does your code eliminate that all the time? If so it's more > > > interesting. I'll look at it more. > > > > Yes, it will be the signaling or broadcasting thread that will be > > working on the integrity of the list while it is holding the lock. At > > the end those that it detected to be leaving will be isolated list > > items, those that are to be signaled form one consecutive list that is > > detached from the cv, and the ones that remain (in the signaling case) > > form a valid cv-with-list. > > > > The only small gap that remains (and that annoys me) is the leaving > > thread that sneaks in > > > > - marks itself as leaving before the end of the the CS > > - only asks for _c_lock *after* the signaling thread has left its CS > > > > This is all our problem of late access to the cv variable revisited, > > but here it is condensed in a very narrow time frame. Both threads > > must be active for this to happen, so my hope is that when both are > > spinning for some time on the c_lock for the waiter and on ref for > > the signaler, none of them will "ever" be forced into a futex wait. > > That's a bug thast needs to be fixed to go forward with this, since > it's essentially a use-after-free. Now that you mention it, avoiding > use-after-free was one of my main motivations for having such waiters > synchronize with the signaling thread. That should have been > documented in a comment somewhere, but the point seems to have slipped > my mind sometime between the design phase and writing the code and > comments. here I don't follow either. What I describe is an inconvenience that forces all these mechanics with the signaller still waiting for the ref count to fall to 0 before he can leave and abandon the cv. As far as I can see that should still be working with both versions. > Do you see any solution whereby a waiter that's waking up can know > reliably, without accessing the cv, whether a signaling thread is > there to take responsibility for removing it from the list? No. My intent with the description above was to illustrate where that problem resides in the version that I proposed. > I'm not seeing any solution to that problem. You mean in the code that I proposed? No there is no other solution, than what you have in your original version: have the signalling thread wait for all the leaving waiters to come in. In your version that write to node->notify is protected by _c_lock, so this is not subject to races. In my version this is the same, unless the lockRace variant gives the leaving thread some additional information. If this information about ref arrives in time before the _c_lock is taken it is used, and the leaving thread never touches the cv, again. If it doesn't arrive in time the _c_lock is waited for, then taken, the notifier is re-evaluated, and all is fine. (__wait should be a memory barrier, shouldn't it?) But you are right I should add atomicity, there. > I'm also still skeptical that there's a problem to be solved here; for > it to matter, the incidence of such races needs to be pretty high, I > think. "matter" can mean different things. I don't think that it my version penalizes performance (but sure a proof would be better), it could also give some small benefits concerning execution time, it could be mildly better by avoiding cache-ping-pong and by avoiding some futex calls. For me, it matters because the handling is "simpler" (a proof would be much, much better, here :) in that there is one single thread manipulating the list. So I should work more on prooving that point by keeping it much closer to your original version. Ok, I'll try to switch to one-step-at-a-time-mode. > Perhaps if you think it's going to matter you could work on a > test case that shows performance problems under load with lots of > timedwait expirations (or cancellations, but I think worry about > cancellation performance is somewhat silly to begin with). Yes, since we have eliminated spurious wakeups ;) expirations should be thing to worry about. > Or, if you don't have time to spend on side projects like test > cases, maybe someone else could test it? Not at the moment. I would prefer to have a C thread implementation by the end of the week that we could agree upon. Jens -- :: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS ::: :: ::::::::::::::: office Strasbourg : +33 368854536 :: :: :::::::::::::::::::::: gsm France : +33 651400183 :: :: ::::::::::::::: gsm international : +49 15737185122 :: :: http://icube-icps.unistra.fr/index.php/Jens_Gustedt :: Download attachment "signature.asc" of type "application/pgp-signature" (199 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.