Date: Tue, 10 Apr 2018 14:39:03 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: tcmalloc compatibility

On Tue, Apr 10, 2018 at 07:53:46PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > malloc interposition is undefined behavior (as is any interposition of
> > standard functions), and is very difficult to actually support as an
> > extension in a way that doesn't have lots of serious problems. This
> > has been discussed before but I don't have links handy. I'll try to
> > dig them up later. The glibc folks are also aware that it's broken (on
> > glibc, it only works if you get lucky and follow unwritten rules)
> 
> We have some documentation nowadays:
> 
> <https://www.gnu.org/software/libc/manual/html_node/Replacing-malloc.html>

Thanks. This is useful information. I'm a bit concerned about the
status of memalign, etc. as described there, though. If malloc is
replaced but the memalign family is not, calls to the functions that
were not replaced will either use the system malloc (producing a result
that's unsafe to pass to the replaced free) or will call the interposed
malloc and then assume they can doctor its book-keeping structures as
if they matched the system malloc's. Neither of these is safe.

In musl, what we should probably do is have extra weak+hidden aliases,
so that memalign can do something like:

	if (malloc != __internal_malloc) return 0;

Maybe something similar would be appropriate in glibc?
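
A minimal sketch of that check, assuming GCC or Clang on an ELF target.
All demo_* names are hypothetical stand-ins (this is not musl code), the
ENOMEM errno value is an assumption added for illustration, and
posix_memalign merely stands in for the real aligned-allocation path:

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the libc's own allocator. Hidden
 * visibility means this name cannot be interposed from outside. */
__attribute__((visibility("hidden")))
void *demo_internal_malloc(size_t n)
{
    return malloc(n);
}

/* Public name: a weak alias for the internal one. A strong definition
 * of demo_malloc elsewhere in the program would take its place. */
void *demo_malloc(size_t n)
    __attribute__((weak, alias("demo_internal_malloc")));

/* Nonzero if the public name no longer resolves to the internal one. */
int demo_malloc_is_interposed(void)
{
    return demo_malloc != demo_internal_malloc;
}

/* Refuse to run when the allocator was replaced, since the internal
 * book-keeping would not match the replacement's. */
void *demo_memalign(size_t align, size_t n)
{
    void *p = 0;

    if (demo_malloc_is_interposed()) {
        errno = ENOMEM; /* assumption: the one-liner only returns 0 */
        return 0;
    }
    /* Placeholder for the real aligned-allocation logic. */
    if (posix_memalign(&p, align, n) != 0)
        return 0;
    return p;
}
```

With no replacement linked in, demo_malloc_is_interposed() reports 0
and demo_memalign(64, 100) returns a 64-byte-aligned block that can be
passed to free.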

> The remaining undocumented aspects concern cyclic dependencies, such
> as the suitability of certain TLS models for implementing a custom
> malloc, or using memory-allocating glibc functions such as fopen or
> backtrace from the allocator itself.
> 
> In practice,

The "in practice" here imposes strong constraints on implementation
internals -- things could stop working at any time in the future if
changes are made to ways/places in which malloc can be called.
This could especially bite glibc if further alloca/large stack usage
is replaced with malloc.

> malloc interposition works extremely well and causes few
> issues due to interposition itself.  Obviously, there are bugs, but
> most of them would still be bugs if the allocator was non-interposing.
> (Examples are lots of initial-exec TLS data, and incorrect alignment
> for allocations.)
> 
> I believe musl uses less malloc internally, so it should be even more
> compatible with an interposing malloc implementation than glibc.

Indeed. But while musl uses less malloc, it also tries to be more
opaque and provide fewer guarantees about implementation internals,
leaving more of them free to change.

There are some functions that musl makes AS-safe, beyond what POSIX
requires, where I intended for the safety to actually be a public
property, and I think those would be a good basis for what's allowed
to be called from a malloc implementation, if we actually take the
effort to document them. This would include a lot of useful things
like dprintf, snprintf, mutex lock and (non-robust-type) unlock,
semaphore wait (post is already AS-safe by POSIX), ...
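
As an illustration of the kind of call this would permit, here is a
sketch of a trace helper that a replacement malloc might use. It
formats with snprintf into a stack buffer and emits the line with
write(2), so no FILE is involved (dprintf would be the one-call
equivalent). The demo_* names are hypothetical, and calling snprintf
from inside an allocator is only safe if the libc documents it as such,
which is exactly the open question here:

```c
#include <stdio.h>
#include <unistd.h>

/* Format one trace line into buf without touching any FILE.
 * Returns the number of bytes stored (excluding the terminating NUL),
 * or -1 on error. Output is truncated to fit cap. */
int demo_format_trace(char *buf, size_t cap, size_t size, const void *ptr)
{
    int n;

    if (cap == 0)
        return -1;
    n = snprintf(buf, cap, "malloc(%zu) = %p\n", size, (void *)ptr);
    if (n < 0)
        return -1;
    if ((size_t)n >= cap)
        n = (int)(cap - 1); /* snprintf truncated the line */
    return n;
}

/* Emit the trace line on stderr using only snprintf and write. */
void demo_trace(size_t size, const void *ptr)
{
    char buf[64];
    int n = demo_format_trace(buf, sizeof buf, size, ptr);

    if (n > 0) {
        ssize_t r = write(STDERR_FILENO, buf, (size_t)n);
        (void)r; /* best effort: nothing useful to do on failure */
    }
}
```

The fixed-size stack buffer is deliberate: the helper never allocates,
so it cannot re-enter the allocator that is calling it.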

Unfortunately it doesn't even include mmap/munmap right now, due to
tricks we're doing to work around the unsafety (inherent race
conditions) in Linux's robust mutex support (I think glibc bug #14485
is related). I don't think this actually affects malloc though, and
mmap/munmap could just be explicitly allowed for use in malloc even
without documenting them as AS-safe.
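
For context, a sketch of the sort of mmap/munmap use a replacement
allocator typically needs: one anonymous mapping per allocation, with
the mapping length stored in a small header in front of the returned
pointer. The demo_* names and the header layout are hypothetical, and a
real allocator would pool small requests rather than map each one:

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on some libcs */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Header placed before each allocation. The extra union members only
 * pad it so that the user pointer stays suitably aligned. */
union demo_hdr {
    size_t map_len;
    long double ld;
    long long ll;
    void *p;
};

/* Allocate n bytes backed by a fresh anonymous mapping. */
void *demo_map_alloc(size_t n)
{
    union demo_hdr *h;
    size_t len;
    void *m;

    if (n > SIZE_MAX - sizeof *h)
        return 0; /* size computation would overflow */
    len = n + sizeof *h;
    m = mmap(0, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return 0;
    h = m;
    h->map_len = len;
    return h + 1;
}

/* Release a block from demo_map_alloc. Returns 0 on success. */
int demo_map_free(void *p)
{
    union demo_hdr *h;

    if (!p)
        return 0;
    h = (union demo_hdr *)p - 1;
    return munmap(h, h->map_len);
}
```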

I think allowing any use of stdio (meaning functions that actually
operate on FILEs, not counting snprintf, etc.) from malloc is a bad
idea. Some functions (getline/getdelim) use malloc and some FILE types
will necessarily (memstream) or potentially (cookie) involve malloc in
potentially complex ways.

A more interesting question is whether pthread functions, especially
pthread_create (e.g. background thread managing migration between
heaps) are permissible. They seem useful but also hard to guarantee
safe to use in this context.

Rich
