Date: Mon, 12 Oct 2020 10:55:50 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Calling setxid() in a vfork()-child

On Mon, Oct 12, 2020 at 12:27:44PM +0300, Alexey Izbyshev wrote:
> Hello,
>
> I'm investigating possibility of using vfork() instead of fork() in
> a Linux-only application. Before calling execve(), the app might
> need to call some functions to setup the child, including setxid()
> (let's assume that security concerns of  are not applicable). I'm
> aware that POSIX doesn't allow that for vfork()-children, but I'm
> also aware that it might be OK on Linux if the set of functions is
> sufficiently constrained, and that vfork() is used to efficiently
> implement posix_spawn() in C libraries. However, setuid()/setgid()
> seem particularly tricky because of the need to call the actual
> syscall in all threads, so if a C library is unaware that setxid()
> is called in a vfork()-child, it might attempt to interact with
> threads of the parent process, potentially causing trouble. I've
> checked musl and found a recent commit that fixes this exact
> issue. I've also checked glibc, but haven't found any handling of
> this case (and vfork() doesn't appear to do anything special in this
> regard either).
>
> Do I understand correctly that, from an application developer
> perspective, it's currently better to avoid setxid/setrlimit libc
> functions in a vfork()-child, and that using syscall() or avoiding
> vfork() entirely is preferred in this case?

Really, avoiding vfork entirely is preferable. The traditional
specification of vfork (before it was deprecated and removed from
spec; POSIX has not had vfork for a *long* time) did not allow
*anything* after vfork except execve or _exit, so arguably it's UB,
although there's also some argument to be made that if we're
implementing the nonstandard and traditional vfork function it should
have most of the important traditional properties.
Indeed as you found this is fixed in musl, largely because the failure
mode was so egregiously bad.

Note that in addition to the issue you're asking about, it's
fundamentally a bad idea to be using set*id() in a vforked child (or
anywhere in a process that calls vfork) because it leaves moments
where there are tasks in different privilege domains executing from
the same VM space. If the task that's dropped privileges does anything
that could lead to an attacker seizing control of the flow of
execution, rather than just getting access to the set*id()-reduced
privilege domain, they have full access to the original privilege
domain. This is why musl's multithreaded set*id() (__synccall) takes
care not to admit forward progress of any application code during the
transition, and goes to the trouble of having a thread list lock that
unlocks atomically with kernel task exit so that there is no race
window where a still-live thread can be missed.

In any case, IMO unless you're programming for NOMMU compatibility,
you should just forget vfork ever existed. There's no good reason to
use it. If a process can't fork because it's too big or the fork would
impact performance too much, posix_spawn can do far more than
vfork+execve can do portably. It can't do everything you can do with
vfork+execve if you're willing to break portability rules (i.e. invoke
UB), but with a helper executable to run in the child you can get that
all back.

Rich
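
For illustration, the posix_spawn route recommended above might look
roughly like the following. This is only a sketch under stated
assumptions, not code from musl or from the original thread: the
function name spawn_and_wait is hypothetical, and the particular child
setup (stdin redirected from /dev/null, empty signal mask) is chosen
just to show that such setup is described declaratively through file
actions and attributes instead of being run as code in a
vfork()-child.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn argv[0] (looked up via PATH), wait for it,
 * and return its exit status, or -1 on any error or abnormal exit. */
int spawn_and_wait(char *const argv[])
{
    posix_spawn_file_actions_t fa;
    posix_spawnattr_t attr;
    sigset_t empty;
    pid_t pid;
    int status, err;

    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;
    if (posix_spawnattr_init(&attr) != 0) {
        posix_spawn_file_actions_destroy(&fa);
        return -1;
    }

    /* Child setup is requested here and carried out by the library;
     * no application code runs between process creation and exec. */
    sigemptyset(&empty);
    err = posix_spawn_file_actions_addopen(&fa, 0, "/dev/null", O_RDONLY, 0);
    if (!err) err = posix_spawnattr_setsigmask(&attr, &empty);
    if (!err) err = posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGMASK);
    if (!err) err = posix_spawnp(&pid, argv[0], &fa, &attr, argv, environ);

    posix_spawnattr_destroy(&attr);
    posix_spawn_file_actions_destroy(&fa);
    if (err)
        return -1;

    /* Retry waitpid if it is interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector, for example
{"sh", "-c", "exit 3", NULL}. Setup that posix_spawn cannot express,
such as an arbitrary setuid()/setgid(), would go into a small helper
executable that is spawned this way and then execs the real target.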