Date: Thu, 14 Jan 2016 23:59:52 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: dlopen deadlock

On Thu, Jan 14, 2016 at 09:47:59PM -0500, Rich Felker wrote:
> > > > one solution i can think of is to have an init_fini_lock
> > > > for each dso, then the deadlock only happens if a ctor
> > > > tries to dlopen its own lib (directly or indirectly)
> > > > which is nonsense (the library depends on itself being
> > > > loaded)
> > > 
> > > The lock has to protect the fini chain linked list (used to control
> > > order of dtors) so I don't think having it be per-dso is a
> > > possibility.
> > > 
> > 
> > i guess using lockfree atomics could solve the deadlock then
> 
> I don't think atomics help. We could achieve the same thing as atomics
> by just taking and releasing the lock on each iteration when modifying
> the lock-protected state, but not holding the lock while calling the
> actual ctors.
> 
> From what I can see/remember, the reason I didn't write the code that
> way is that we don't want dlopen to return before all ctors have run
> -- or at least started running, in the case of a recursive call to
> dlopen. If the lock were taken/released on each iteration, two threads
> simultaneously calling dlopen on the same library libA that depends on
> libB could each run A's ctors and B's ctors and either of them could
> return from dlopen before the other finished, resulting in library
> code running without its ctors having finished.
> 
> The problem is that excluding multiple threads from running
> possibly-unrelated ctors simultaneously is wrong, and marking a
> library constructed as soon as its ctors start is also wrong (at least
> once this big-hammer lock is fixed). Instead we should be doing some
> sort of proper dependency-graph tracking and ensuring that a dlopen
> cannot return until all dependencies have completed their ctors,
> except in the special case of recursion, in which case it's acceptable
> for libX's ctors to load a libY that depends on libX, where libX
> should be treated as "already constructed" (it's a bug in libX if it
> has not already completed any initialization that libY might depend
> on). However I don't see any reasonable way to track this kind of
> relationship when it happens 'indirectly-recursively' via a new
> thread. It may just be that such a case should deadlock. However,
> dlopen of separate libs which are unrelated in any dependency sense to
> the caller should _not_ deadlock just because it happens from a thread
> created by a ctor...
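
To make the take/release-per-iteration idea quoted above concrete, here
is a minimal sketch. All names in it (struct lib, list_lock,
run_pending_ctors, lib_a, lib_b) are hypothetical and chosen for
illustration; this is not the dynamic linker's actual code. The lock is
held only while the list state is examined and updated, never while a
ctor runs, so a ctor that re-enters the loader does not deadlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded library; not musl's struct dso. */
struct lib {
    struct lib *next;     /* load-order list, protected by list_lock */
    void (*ctor)(void);   /* constructor to run at most once */
    int ctor_started;     /* protected by list_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct lib *head;

static char ctor_log[8];  /* records the order in which ctors ran */
static int ctor_log_len;

/* Run every ctor that has not started yet. The lock is dropped
 * before each ctor call and re-taken on the next iteration. */
static void run_pending_ctors(void)
{
    for (;;) {
        struct lib *p;

        pthread_mutex_lock(&list_lock);
        for (p = head; p && p->ctor_started; p = p->next)
            ;
        if (p)
            p->ctor_started = 1;  /* claimed: "started", not "finished" */
        pthread_mutex_unlock(&list_lock);

        if (!p)
            return;
        assert(p->ctor != NULL);
        p->ctor();                /* called with the lock released */
    }
}

/* Demo ctors: A re-enters the loader, as a ctor calling dlopen would. */
static void ctor_a(void)
{
    ctor_log[ctor_log_len++] = 'A';
    run_pending_ctors();          /* recursion: would deadlock if locked */
}

static void ctor_b(void)
{
    ctor_log[ctor_log_len++] = 'B';
}

static struct lib lib_b = { NULL, ctor_b, 0 };
static struct lib lib_a = { &lib_b, ctor_a, 0 };
```

Setting head = &lib_a and calling run_pending_ctors() runs A's ctor,
which recursively runs B's ctor, and then returns without blocking. The
sketch also shows the drawback described in the quoted text: because a
library is only marked "started", a second thread calling
run_pending_ctors() at the same time could find nothing left to claim
and return while another thread is still inside a ctor.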

Some relevant history:

commit f4f77c068f1058d202a976678fce2617d59c0ff6
fix/improve shared library ctor/dtor handling, allow recursive dlopen

commit 509b50eda8ea7d4a28f738e4cf8ea98d25959f00
fix missing synchronization in calls from dynamic linker to global ctors

Rich
