Message-ID: <20250527152007.GO1827@brightrain.aerifal.cx>
Date: Tue, 27 May 2025 11:20:07 -0400
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: musl@...ts.openwall.com
Subject: Re: Deadlock in dynamic linker?

On Sat, May 24, 2025 at 07:45:45AM +0200, Markus Wichmann wrote:
> Hi all,
>
> I have a question about the handling of shutting_down in the dynamic
> linker. Namely, I saw that do_init_fini() will go into an infinite wait
> loop if it is set. The idea was probably to park initializing threads
> while the system is shutting down, but can't this lead to a deadlock
> situation?

The idea is to prevent addition of any further ctors once dtors have
already started, since this may (?) make it difficult to ensure the
dtors are executed in reverse order of ctors, and since any added
after the entire list is processed would have their dtors skipped
entirely (see analogous logic in atexit).

> I'm thinking something like this: Thread A initializes liba.so. liba.so
> has initializers and finalizers, so thread A adds liba.so to the fini
> list before calling the initializers. The liba initializer calls
> dlopen("libb.so"). libb.so also has initializers.
>
> While thread A is not holding the init_fini_lock, thread B calls exit().
> That progresses until __libc_exit_fini() sets shutting_down to 1. Then
> it tries to destroy all the libraries, but the loop stops when it comes
> to liba.
>
> liba.so has a ctor_visitor, namely thread A, so thread B cannot advance.
> Thread A meanwhile is hanging in the infinite wait loop trying to
> initialize libb.so. The situation cannot change, and the process hangs
> indefinitely.

I see. In particular you're assuming the dlopen of libb happened after
the exit started.

> A simple way out of this pickle could be to add liba.so to the fini list
> only after it was initialized. That way, thread B cannot hang on it, or
> more generally, the finalizing thread cannot be halted by an incomplete
> initialization in another thread. This might change the order of nodes
> on the fini list, but only to account for dynamic dependencies. Isn't
> that a good thing?

No, I think it's non-conforming, and also unsafe, as it can result in
failure to run a dtor for something whose ctor already ran but did not
finish. This is a worse outcome than a deadlock in a situation that's
arguably undefined to begin with.

What might be acceptable, though, is moving the setting of
shutting_down to take place after the last dtor is peeled off the
list. However, this probably requires splitting shutting_down into two
variables, due to lock order issues. The value is needed under the
global ldso lock in dlopen() to make dlopen return with an error if
exit has already begun (this one should be kept before the dtor loop,
I think), and the value is needed in do_init_fini to block execution
of new ctors (this one should only take effect after all dtors have
been run).

Does that sound right?

Rich
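
For readers following the thread, the two-flag split suggested above
can be modeled roughly as below. This is a single-threaded sketch only,
not musl's code: the names exit_started, ctors_blocked and the model_*
functions are hypothetical, the real locks (the global ldso lock and
init_fini_lock) are left out, and "parking" a thread is represented by
a return value rather than an actual wait loop.

```c
#include <assert.h>

/* Hypothetical names: musl currently has one `shutting_down` flag.
 * These two flags only model the split discussed in the message. */
static int exit_started;   /* would be read by dlopen() under the ldso lock */
static int ctors_blocked;  /* would be read by do_init_fini() before ctors */

#define MAX_LIBS 8
static const char *fini_list[MAX_LIBS]; /* dtors run in reverse insertion order */
static int fini_count;
static int dtors_run;
static int ctor_allowed_mid_exit = -1;  /* observed while dtors are running */

/* Model of do_init_fini()'s check: 1 if a ctor may run, 0 if the
 * calling thread would be parked. */
static int model_ctor_allowed(void)
{
    return !ctors_blocked;
}

/* Model of dlopen(): fail with an error once exit has begun, otherwise
 * queue the library for finalization before its ctors run. */
static int model_dlopen(const char *name)
{
    if (exit_started) return -1;
    if (fini_count == MAX_LIBS) return -1;
    fini_list[fini_count++] = name;
    return 0;
}

/* Model of __libc_exit_fini(): the first flag is set before the dtor
 * loop, the second only after the last dtor has been peeled off. */
static void model_exit_fini(void)
{
    exit_started = 1;
    while (fini_count > 0) {
        fini_count--;                 /* peel the most recent entry */
        /* A ctor already in progress elsewhere is still allowed here. */
        ctor_allowed_mid_exit = model_ctor_allowed();
        dtors_run++;
    }
    assert(fini_count == 0);
    ctors_blocked = 1;
}
```

In this model, a dlopen("libb.so") issued after exit has begun returns
an error instead of reaching the ctor wait loop, while a ctor that was
already pending when exit began is not blocked until every dtor has
run. Whether that ordering is safe under the real locks is exactly the
open question in the message above.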