Date: Fri, 5 Aug 2022 21:28:38 -0400
From: Keno Fischer <keno@...iacomputing.com>
To: libc-coord@...ts.openwall.com
Subject: Re: Proposing dl* extensions with explicit caller specification

Hi James,

Thanks for your detailed thoughts.

On Fri, Aug 5, 2022 at 6:57 PM James Y Knight <jyknight@...gle.com> wrote:
> That way, you're only ever intercepting dlopen from within the same
> shared object that the call was made from, avoiding the problem -- and
> the requirement for these new libc functions. Could that work?

I considered this, but ultimately decided it wasn't a sufficient solution,
for two reasons.

1. It doesn't apply to symbol interposition by tools that do not have compiler
instrumentation available. At least one of the complaints was from libTAS [1],
which does not. Similarly, the rr interceptor (which currently happens to work
because it forwards via a tail call, but may not always be able to do so in
the future) assumes a completely unmodified binary.

2. It doesn't even necessarily apply to asan. One of the critical features of
asan over msan is that it can (with some restrictions) be used incrementally,
making it easy to try out and adopt without having to recompile the world
(since the world usually includes things like the C++ standard library and
various system libraries). Recompiling the world is a high end-user burden and
one that I would like to avoid if possible.
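
To make point 1 concrete, here is a minimal sketch of an interposer that works
without any compiler instrumentation. It assumes a dynamically linked glibc
program; the `intercepted_calls` counter is hypothetical bookkeeping added only
for illustration. Because the wrapper does some work after the real call, the
call is not a tail call, and the real dlopen sees a return address inside the
interposer rather than inside the original caller:

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical bookkeeping for this sketch: number of intercepted calls. */
static int intercepted_calls;

/* Same name and signature as the libc function, so this definition is
   found first during symbol lookup (for example when loaded via LD_PRELOAD). */
void *dlopen(const char *filename, int flags)
{
    static void *(*real_dlopen)(const char *, int);

    if (real_dlopen == NULL) {
        /* RTLD_NEXT: the next definition after this object in search order. */
        real_dlopen = (void *(*)(const char *, int))dlsym(RTLD_NEXT, "dlopen");
        if (real_dlopen == NULL)
            return NULL;
    }

    /* Not a tail call: the real dlopen returns into this wrapper, so the
       return address it observes lies in the interposer's object. */
    void *handle = real_dlopen(filename, flags);
    intercepted_calls++;
    return handle;
}
```

Built as a shared object and preloaded, a wrapper like this changes which
object the real dlopen attributes the call to, which is the situation an
explicit caller argument is meant to address.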

> > take an explicit `dl_caller` pointer that is used in place of the return address
>
> The documentation for the new parameter needs to be expanded. I
> believe the intended semantics are that it may be set to _any_ address
> within the DSO's mapped memory regions, but you should say that
> explicitly. That is: passing a function return address is one
> possibility, but you could also pass any other address within the
> DSO's memory mappings (e.g. address of a symbol such as __dso_handle).

So, in my proposal, I somewhat deliberately limited it to the return address
case, because that's the only case that really is required to work in the
current implementation. If there is a consensus among libc maintainers that
any address in the DSO would be appropriate, then I do indeed think that those
semantics would be better, so let's consider that the current proposal unless
someone objects ;).
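
As a rough illustration of the "any address in the DSO" semantics, the sketch
below uses the existing dladdr() to show that a return address and the address
of a data object in the same DSO identify the same loaded object. The helper
names (`object_base_of`, `return_address_here`, `anchor_in_this_object`) are
hypothetical, and the sketch assumes a dynamically linked glibc program built
with GCC or Clang:

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load base of the object that contains addr,
   or NULL if no loaded object contains it. */
static void *object_base_of(const void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fbase;
}

/* A data object that lives in this object's own memory mappings. */
static int anchor_in_this_object;

/* Returns the return address of its own call site, which is a code
   address inside whichever object made the call. */
__attribute__((noinline)) static void *return_address_here(void)
{
    return __builtin_return_address(0);
}
```

Calling `object_base_of(return_address_here())` and
`object_base_of(&anchor_in_this_object)` from the same object should yield the
same base, which is why either kind of address could serve as a caller
argument under the broader semantics.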

> > dlsym_caller dlvsym_caller dlmopen_caller
>
> Beyond those 3 APIs, grepping glibc for uses of RETURN_ADDRESS I see
> there's also dl_iterate_phdr, _dl_mcount_wrapper,
> _dl_mcount_wrapper_check, and various malloc debugging things. The
> first, at least, seems relevant to this proposal -- presumably there
> should be a variant of `dl_iterate_phdr` added, too.

That's a good point, thanks for catching that. That said, perhaps we would want
to take this opportunity to be more explicit about the namespace dependency,
e.g. we could have:
```
dlm_iterate_phdr_from(Lmid_t lmid, const char *filename, int flags,
void *dl_caller)
```
so that it could be called with an explicit namespace ID (or LM_ID_CALLER).
If we can load shared libraries into an arbitrary namespace, it seems reasonable
to want to iterate them too.
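
For reference, a minimal sketch of how the existing dl_iterate_phdr is used
today. It takes only a callback and a data pointer; as noted in the quoted
text, glibc consults the return address inside the call, so there is no way to
name a namespace or caller explicitly. The helper names (`count_object`,
`count_loaded_objects`) are hypothetical, and the exact parameter list of a
namespace-aware variant is left open here:

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>

/* Callback invoked once per loaded object; data points to a counter. */
static int count_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)info;
    (void)size;
    ++*(size_t *)data;
    return 0; /* returning 0 continues the iteration */
}

/* Hypothetical helper: number of objects dl_iterate_phdr reports to the
   caller. Nothing in the signature selects a namespace or caller. */
static size_t count_loaded_objects(void)
{
    size_t count = 0;

    dl_iterate_phdr(count_object, &count);
    return count;
}
```

A variant along the lines suggested above would add the namespace ID and
caller address as explicit arguments instead of inferring them.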

[1] https://github.com/clementgallet/libTAS
