Date: Fri, 14 Aug 2020 20:07:26 -0600 From: Ariadne Conill <ariadne@...eferenced.org> To: musl@...ts.openwall.com Subject: Re: Restrictions on child context after multithreaded fork Hello, On 2020-08-14 15:41, Rich Felker wrote: > musl 1.2.1 has exposed bugs in several applications and libraries > caused by async-signal-unsafe code between (multithreaded) fork and > subsequent exec. So far, dbus library code, pulseaudio library code, > and libvirt have been found to be affected. A couple of the bug > reports (with incomplete information) are: I suspect the dbus library code has been broken for some amount of time because we've been tracking an issue where dbus stalls on KDE Wayland for a short period of time causing the session to come up strangely. 1.2.1 did make this stall more pronounced though. > https://gitlab.alpinelinux.org/alpine/aports/-/issues/11602 > https://gitlab.alpinelinux.org/alpine/aports/-/issues/11815 > > Fixing the affected library code looks very straightforward; it's just > a matter of doing proper iterations of existing data/state rather than > allocating lists and using opendir on procfs and such. I've discussed > fixes with Alpine Linux folks and I believe fixes have been tested, > but I don't see any patches in aports yet. I'll work on getting patches in next week. > I've seen suspicions that the switch to mallocng exposed this, but I'm > pretty sure it was: > > https://git.musl-libc.org/cgit/musl/commit/?id=e01b5939b38aea5ecbe41670643199825874b26c We carried this patch before upgrade to 1.2.1 and it was mostly fine, has been carried since 1.1.24-r7 AFAIK. > Before this commit, the (incorrect) lock skipping logic allowed the > child process to access inconsistent state left from the parent if it > violated the requirement not to call AS-unsafe functions. Now, the > lock attempt in the child rightly deadlocks before accessing state > that was being modified under control of the lock in the parent. This > is not specific to malloc but common with anything using libc-internal > locks. > > I'll follow up on this thread once there are patches for the known > affected libraries. > > Note that this is a type of bug that's possibly hard to get upstreams > to take seriously. libvirt in particular, despite having multiple > comments throughout the source warning developers that they can't do > anything AS-unsafe between fork and exec, is somehow deeming malloc an > exception to that rule because they want to use it (despite it clearly > not being necessary). > > And the dbus issue has been known for a long time; see open bug: > > https://gitlab.freedesktop.org/dbus/dbus/-/issues/173 > (originally: https://bugs.freedesktop.org/show_bug.cgi?id=100843) > > This is largely because glibc attempts to make the erroneous usage by > these libraries work (more on that below). > > The next issue of POSIX (Issue 8) will drop the requirement that fork > be AS-safe, as a result of Austin Group tracker issue #62. This makes > the glibc behavior permissible/conforming, but there does not seem to > be any effort on the POSIX side to drop the requirement on > applications not to do AS-unsafe things in the child before exec, so > regardless of this change, what these libraries are doing is still > wrong. > > In order to make the child environment unrestricted after fork, either > fork must hold *all* locks at the time the actual fork syscall takes > place, or it must be able to reset any state protected by a lock that > was held in the parent (or some mix of the two). It's fundamentally > impossible to do this completely (in a way that lets the child run > unrestricted), since some locks in the parent may be held arbitrarily > long such that fork waiting on them would deadlock. In particular, any > stdio FILE lock may be held indefinitely because there's a blocking > operation in progress on the underlying fd, or because the application > has called flockfile. Thus, at best, the implementation can give the > child an environment where fflush(0) and exit() still deadlock. > > In case we do want to follow a direction of trying to provide some > degree of relaxation of restrictions on the child (taking the liberty > of POSIX-future drop of fork's AS-safety requirement), I did a quick > survey of libc-internal locks, and found: I think it is better to fix programs to not depend on AS-unsafe functions at fork time. Being lax on this requirement is an indicator of other bad engineering decisions in these programs, especially libvirt and pulseaudio. Ariadne
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.