Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 19 Oct 2017 21:17:16 -0400
From: Rich Felker <>
Subject: posix_spawn topics [was: Re: musl 1.1.17 released]

On Fri, Oct 20, 2017 at 02:11:35AM +0200, Petr Skocik wrote:
> Hi.
> Since posix_spawn is being discussed, I'd like to report a what I think is
> a bug in the current implementation:
> Since the child is unmasking its signals only after it's performed its file
> actions, the process may get blocked indefinitely (e.g., on a FIFO
> open) while being unresponsive to signals.
> Example program (with close to no error checking):
> #define _GNU_SOURCE
> #include <unistd.h>
> #include <spawn.h>
> #include <stdlib.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/wait.h>
> int main()
> {
> pid_t pid;
> mkfifo("fifo", 0640);
> posix_spawn_file_actions_t fa;
> posix_spawn_file_actions_init(&fa);
> posix_spawn_file_actions_addopen(&fa, 0, "fifo", O_RDONLY, 0);
> posix_spawnp(&pid, "tr", &fa, 0, (char *const[]){ "tr", "a-z", "A-Z", 0},
> environ);
>         //will get stuck here
> }
> It  think the pthread_mask call should be moved above the file actions,
> which fixes this problem.

It's actually rather ugly that this happens at all. Your "fix" is
presumably that ^C now works, which of course helps. But short of
terminals signals there's no legitimate way to kill the hung process
because its pid has not been returned yet.

I'm not really sure what should happen here; it may call for an
interpretation request.

> Additionally, as far as extensions to the current POSIX standard are
> concerned, adding the herein (
> proposed change to *adddup2 semantics (clearing the FD_CLOEXEC flag on the
>  (given target in dup2 file actions where the source and target
> filedescriptor are identical) would be super nice (the reasoning for it
> should be in the link).

I wasn't aware of that part of #411; I agree it should be adopted now
without waiting for a new standard.

> Finally, I noticed the error passing code can be reduced to a simple write
> to a volatile int made from the child (purely cosmetic).

The pipe is there for a very important reason, and the reason is not
passing the error. It just happens that, since we already have to read
from it in the parent anyway, it makes a convenient channel to pass
the error on. Your modified version has serious memory corruption
because the parent continues running (and may return from posix_spawn)
after args->ret is set but before the child exits, thereby ending the
lifetime of the child's stack. Atomicity of closing of the
close-on-exec pipe with respect to exec is the property we're using
here to detect when exec has taken effect and memory is no longer

An argument can be made that CLONE_VFORK mandates the schedudler
prevent this, but historically it did not always do this correctly,
especially when traced (and I believe it's still possible when tracing
to manually resume the parent while the child is still sharing
memory), and the intent is to be compatible with kernels where
CLONE_VFORK is a nop. The main reason CLONE_VFORK is used is actually
as a hint to qemu-userlevel to get it to emulate things right.

> Attached are patches for these 3 things in case you wanted them. (I hope
> I'm not doing something stupid.)

Otherwise no, not "stupid", but the patches are a lot larger/more
invasive than they need to be and don't follow coding style.

> @@ -97,14 +108,14 @@ static int child(void *args_vp)
>  				__syscall(SYS_close, op->fd);
>  				break;
>  			case FDOP_DUP2:
> -				if ((ret=__sys_dup2(op->srcfd, op->fd))<0)
> +				if ((ret=dup3_or_set_cloexec(op->srcfd, op->fd, 0))<0)
>  					goto fail;
>  				break;
>  			case FDOP_OPEN:
>  				fd = __sys_open(op->path, op->oflag, op->mode);
>  				if ((ret=fd) < 0) goto fail;
>  				if (fd != op->fd) {
> -					if ((ret=__sys_dup2(fd, op->fd))<0)
> +					if ((ret=dup3_or_set_cloexec(fd, op->fd, 0))<0)

This (FDOP_OPEN) is not a case covered by the POSIX interpretation,
and using it here would actually be wrong if it were reachable, but
you're already inside "if (fd != op->fd)" and thus there's no way
fd==op->fd can be true. So the change in this line is just spurious.
There's then just one place where dup3_or_set_cloexec is called (the
previous case), and it doesn't make sense to have a function; rather
the conditional should just be inline in that case.


Powered by blists - more mailing lists

Your e-mail address:

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.