Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 8 Nov 2020 17:12:15 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH v2] MT fork

* Rich Felker <dalias@...c.org> [2020-11-05 22:36:17 -0500]:
> On Fri, Oct 30, 2020 at 11:31:17PM -0400, Rich Felker wrote:
> > One thing I know is potentially problematic is interaction with malloc
> > replacement -- locking of any of the subsystems locked at fork time
> > necessarily takes place after application atfork handlers, so if the
> > malloc replacement registers atfork handlers (as many do), it could
> > deadlock. I'm exploring whether malloc use in these systems can be
> > eliminated. A few are almost-surely better just using direct mmap
> > anyway, but for some it's borderline. I'll have a better idea sometime
> > in the next few days.
> 
> OK, here's a summary of the affected locks (where there's a lock order
> conflict between them and application-replaced malloc):

if malloc replacements take internal locks in atfork
handlers then it's not just libc internal locks that
can cause problems but locks taken by other atfork
handlers that were registered before the malloc one.

i don't think there is a clean solution to this. i
think using mmap is ugly (uless there is some reason
to prefer that other than the atfork issue).

> 
> - atexit: uses calloc to allocate more handler slots of the builtin 32
>   are exhausted. Could reasonably be changed to just mmap a whole page
>   of slots in this case.
> 
> - dlerror: the lock is just for a queue of buffers to be freed on
>   future calls, since they can't be freed at thread exit time because
>   the calling context (thread that's "already exited") is not valid to
>   call application code, and malloc might be replaced. one plausible
>   solution here is getting rid of the free queue hack (and thus the
>   lock) entirely and instead calling libc's malloc/free via dlsym
>   rather than using the potentially-replaced symbol. but this would
>   not work for static linking (same dlerror is used even though dlopen
>   always fails; in future it may work) so it's probably not a good
>   approach. mmap is really not a good option here because it's
>   excessive mem usage. It's probably possible to just repeatedly
>   unlock/relock around performing each free so that only one lock is
>   held at once.
> 
> - gettext: bindtextdomain calls calloc while holding the lock on list
>   of bindings. It could drop the lock, allocate, retake it, recheck
>   for an existing binding, and free in that case, but this is
>   undesirable because it introduces a dependency on free in
>   static-linked programs. Otherwise all memory gettext allocates is
>   permanent. Because of this we could just mmap an area and bump
>   allocate it, but that's wasteful because most programs will only use
>   one tiny binding. We could also just leak on the rare possibility of
>   concurrent binding allocations; the number of such leaks is bounded
>   by nthreads*ndomains, and we could make it just nthreads by keeping
>   and reusing abandoned ones.
> 
> - sem_open: a one-time calloc of global semtab takes place with the
>   lock held. On 32-bit archs this table is exactly 4k; on 64-bit it's
>   6k. So it seems very reasonable to just mmap instead of calloc.
> 
> - timezone: The tz core allocates memory to remember the last-seen
>   value of TZ env var to react if it changes. Normally it's small, so
>   perhaps we could just use a small (e.g. 32 byte) static buffer and
>   replace it with a whole mmapped page if a value too large for that
>   is seen.
> 
> Also, somehow I failed to find one of the important locks MT-fork
> needs to be taking: locale_map.c has a lock for the records of mapped
> locales. Allocation also takes place with it held, and for the same
> reason as gettext it really shouldn't be changed to allocate
> differently. It could possibly do the allocation without the lock held
> though and leak it (or save it for reuse later if needed) when another
> thread races to load the locale.

yeah this sounds problematic.

if malloc interposers want to do something around fork
then libc may need to expose some better api than atfork.

> 
> So anywya, it looks like there's some nontrivial work to be done here
> in order to make the MT-fork not be a regression for replaced-malloc
> uage... :( Ideas for the above and keeping the solutions non-invasive
> will be very much welcome!
> 
> Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.