Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 7 Mar 2017 17:02:09 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Reviving planned ldso changes

On Sun, Mar 05, 2017 at 08:11:59PM -0500, Rich Felker wrote:
> On Sat, Mar 04, 2017 at 11:58:18AM +0100, Szabolcs Nagy wrote:
> > * Rich Felker <dalias@...c.org> [2017-03-02 20:30:26 -0500]:
> > > Here's a v4 of the patch that saves the "init parent" we descended
> > > from so that it can return where it left off. There are a couple
> > > gratuitous hunks left over adding setting of "needed_by" where it made
> > > sense to be set, but it's not actually used anymore. They could be
> > > dropped if desired but are probably nice to keep for the sake of
> > > consistency of data, even thoough it's data we don't use.
> > > 
> > > I believe this can be extended to allow concurrent dlopen by amending
> > > the case in the tree-walk where a dependency isn't constructed yet but
> > > already has an "init parent" to check whether it's
> > > pending-construction in the calling thread (recursive dlopen from a
> > > ctor) or another thread; in the former case (as now) treat it as
> > > already-constructed; in the latter, wait on a condvar that gets
> > > signaled at the end of each construction, then continue the loop
> > > without advancing p. There are probably some subtleties I'm missing,
> > > though.
> > ....
> > >  static void do_init_fini(struct dso *p)
> > >  {
> > >  	size_t dyn[DYN_CNT];
> > > -	int need_locking = libc.threads_minus_1;
> > > -	/* Allow recursive calls that arise when a library calls
> > > -	 * dlopen from one of its constructors, but block any
> > > -	 * other threads until all ctors have finished. */
> > > -	if (need_locking) pthread_mutex_lock(&init_fini_lock);
> > > -	for (; p; p=p->prev) {
> > > -		if (p->constructed) continue;
> > > +	pthread_mutex_lock(&init_fini_lock);
> > > +	/* Construct in dependency order without any recursive state. */
> > > +	while (p && !p->constructed) {
> > > +		/* The following loop descends into the first dependency
> > > +		 * that is neither alredy constructed nor pending
> > > +		 * construction due to circular deps, stopping only
> > > +		 * when it reaches a dso with no remaining dependencies
> > > +		 * to descend into. */
> > > +		while (p->deps && p->deps[p->next_dep]) {
> > > +			if (!p->deps[p->next_dep]->constructed &&
> > > +			    !p->deps[p->next_dep]->init_parent) {
> > > +				p->deps[p->next_dep]->init_parent = p;
> > > +				p = p->deps[p->next_dep++];
> > 
> > i think the root may be visited twice because it
> > has no init_parent, which may be problematic with
> > the concurrent dlopen (and can cause unexpected
> > ctor order: the root node is not constructed last
> > if there is a cycle through it)
> 
> Ah, the case where the root is an indirect dependency for itself? Yes,
> I think you're right in that case. We should be able to avoid it by
> setting the initial p->init_parent to head (the application), I think.
> 
> > i think only checking init_parent of a dep is
> > enough and the root node can have a dummy parent
> > that is guaranteed to be not a dependency (ldso?)
> > and constructed so it stops the loop.
> 
> I think ldso would work too, but in principle it need not be a
> dependency of anything if you have a dynamic-linked program that
> doesn't use libc (-nostdlib), so it's better to use head, I think.
> 
> Also I agree we don't need to check p->constructed now, but once we
> unlock during ctor execution, the !init_parent and !constructed cases
> need to be treated separately. If it's constructed or pending
> construction in the same thread, we can just do p->next_dep++, but if
> it has an init_parent but isn't constructed or pending construction in
> same thread (recursive) we need to condvar wait and re-check instead,
> right?

Arg, deep problems I missed. Quoting from IRC:

<dalias> nsz, uhg, the dep-order draft so far has a big bug
<dalias> p->deps is not actually deps for p
<dalias> rather, it's all indirect deps, but only set for a lib that was explicitly dlopen'd
<dalias> so the new code doesn't actually do dep-order
<dalias> it just walks a flat list of all (breadth-first, not depth-first) direct and indirect dependencies of p
<dalias> and descends into each then immediately backs out
<dalias> because after descending, p->deps is null
<dalias> i think we should get rid of the old use of p->deps
<dalias> which is just undoing temp-globalization of libs during load for reloc purposes

If I first do the work of having a separate global-namespace dso list
(which is a pending change that will speed up relocations anyway),
then the old use of p->deps is no longer needed and we can simply
repurpose it to be direct-deps only.

An alternate idea I like a lot, but don't see how to make work with
concurrency (making it so ctors still running in one thread don't
unnecessarily preclude a new dlopen in another thread), is to build a
queue for ctor execution at the time dependencies are loaded. The
logic for this seems simple: when processing the DT_NEEDED entry,
check if the needed library is already constructed or already in the
ctor queue. If so, do nothing. If not, insert it in the queue just
before the library it's a dependency for.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.