Date: Mon, 31 Dec 2012 15:34:17 -0500
From: Rich Felker
Subject: vfork replacement proposal

I've been looking for a viable replacement for the vfork usage in musl
for a while, since vfork has two serious problems:

1. strace is buggy and causes the parent and child to run
simultaneously on the same stack under vfork when the process is being
traced. Binaries which can crash or go crazy under strace are highly
undesirable, even if the fault is with strace.

2. While current compilers don't do this, the compiler is conceptually
free to generate code in the child that clobbers parts of the stack
the parent still needs, once it determines they are no longer needed
in the child.

The affected functions are posix_spawn[p], system, and popen.

My new proposed design for these functions is:

1. Open a close-on-exec pipe.

2. Use clone with CLONE_VM|SIGCHLD as the flags to make a normal child
process that shares VM but nothing else with the parent, and that runs
a new function (rather than returning) on a small stack embedded in
the caller's stack (e.g. a 1k automatic char array).

3. In the parent, close the write end of the pipe and perform a
blocking read on the read end.

4. In the child, close the read end of the pipe and then shuffle file
descriptors as needed (for setting up stdin/out for popen, or file
actions for posix_spawn[p]), but with the added stipulations A-C:

A. Before closing or dup2'ing onto a file descriptor in file actions,
check to see if it's occupied by the pipe fd, and if so, use fcntl
F_DUPFD_CLOEXEC to move it to a new number first.

B. Before calling open in file actions, always move the pipe with
fcntl F_DUPFD_CLOEXEC and close the original pipe fd, to ensure that
the pipe is never occupying the fd number that would otherwise be the
lowest available.

C. Any failure to renumber the pipe fd as required in A-B is fatal.

5. On any failure in the child, write the error code for the failure
to the pipe and _exit. This includes failure to renumber the pipe, or
failure in the final call to an exec-family function. Otherwise the
pipe closes on successful exec in the child.

6. If the parent reads 0 bytes (EOF) from the pipe, spawning the
external process was successful. Otherwise, the parent reads the error
code indicating the cause of failure, and can report it to the calling
program via a failure return value, instead of via immediate exit of
the child process with result 127.
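
To make steps 1-6 concrete, here is a rough sketch in C of the
parent/child handshake. It is not musl code. The names spawn_args,
child_fn and spawn_sketch are made up for illustration, file actions
are left out entirely, and the child stack is 64k rather than the 1k
suggested above because this sketch calls ordinary libc wrappers in
the child and I have not measured how much stack they need. It also
ignores signals arriving in the child, and the child shares the
parent's errno here, both of which a real implementation would have
to deal with.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical argument bundle passed to the child (not from the proposal). */
struct spawn_args {
    int p[2];               /* p[0] = read end, p[1] = write end */
    const char *path;
    char *const *argv;
    char *const *envp;
};

/* Runs in the child on its own small stack, sharing VM with the parent. */
static int child_fn(void *arg)
{
    struct spawn_args *a = arg;
    int err;

    close(a->p[0]);                     /* step 4: drop the read end */
    /* fd shuffling for file actions would go here (omitted) */
    execve(a->path, a->argv, a->envp);  /* on success the pipe closes */
    err = errno;                        /* step 5: report why exec failed */
    while (write(a->p[1], &err, sizeof err) < 0 && errno == EINTR)
        ;
    _exit(127);
}

/* Returns 0 and stores the child's pid on success, else an errno value. */
int spawn_sketch(pid_t *pid_out, const char *path,
                 char *const argv[], char *const envp[])
{
    /* Assumption: 64k of stack, larger than the 1k in the proposal. */
    _Alignas(16) char stack[65536];
    struct spawn_args a = { .path = path, .argv = argv, .envp = envp };
    int err = 0;
    ssize_t n;
    pid_t pid;

    assert(pid_out && path && argv);

    if (pipe2(a.p, O_CLOEXEC) != 0)     /* step 1 */
        return errno;

    /* step 2: stacks grow down, so pass the top of the array */
    pid = clone(child_fn, stack + sizeof stack, CLONE_VM | SIGCHLD, &a);
    if (pid < 0) {
        err = errno;
        close(a.p[0]);
        close(a.p[1]);
        return err;
    }

    close(a.p[1]);                      /* step 3 */
    do {
        n = read(a.p[0], &err, sizeof err);
    } while (n < 0 && errno == EINTR);
    close(a.p[0]);

    if (n > 0) {                        /* step 6: child reported an error */
        waitpid(pid, 0, 0);             /* reap it before our stack goes away */
        return err;
    }
    *pid_out = pid;                     /* EOF: exec succeeded */
    return 0;
}
```

With this, spawn_sketch(&pid, "/nonexistent/program", argv, envp)
would be expected to return ENOENT directly to the caller, rather than
handing back a pid for a child that exits 127 right away.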

This final point 6 makes the proposed new design superior to all
existing implementations I know of: you get good data on the cause of
failure in the parent rather than a false success followed by
immediate exit with code 127 and no indication of the cause.

The key breakthrough that made this design proposal possible was
realizing that I can keep shuffling the pipe fd around in the child in
a simple way that avoids interference with the POSIX spawn file
actions. This is in contrast with the problem of determining in
advance a "safe" fd number to locate the pipe on, which is a
nontrivial problem when you can't know the existing set of open fds.
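
As an illustration of what I mean by shuffling the pipe fd, here is a
minimal sketch of stipulations A-C in C. The helper names
(renumber_pipe, action_dup2, action_open) are hypothetical. The
proposal above does not say what minimum fd to pass to
F_DUPFD_CLOEXEC; this sketch assumes the pipe's current number is
passed, so the pipe always moves to a higher fd and the old number
becomes free again for the file action to use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Move the pipe to a higher, close-on-exec fd and close the old one.
 * Returns the new fd, or -1 with errno set; per stipulation C the
 * caller must treat -1 as fatal. */
int renumber_pipe(int pipe_fd)
{
    int new_fd;

    assert(pipe_fd >= 0);
    /* Assumption: asking for an fd >= pipe_fd; since pipe_fd itself is
     * occupied, the result is strictly greater than pipe_fd. */
    new_fd = fcntl(pipe_fd, F_DUPFD_CLOEXEC, pipe_fd);
    if (new_fd < 0)
        return -1;
    close(pipe_fd);
    return new_fd;
}

/* Stipulation A: a dup2 file action must not land on top of the pipe. */
int action_dup2(int *pipe_fd, int src, int dst)
{
    if (dst == *pipe_fd) {
        int moved = renumber_pipe(*pipe_fd);
        if (moved < 0)
            return -1;
        *pipe_fd = moved;
    }
    return dup2(src, dst) < 0 ? -1 : 0;
}

/* Stipulation B: always move the pipe out of the way before open, so
 * open sees the same lowest free fd it would see without the pipe. */
int action_open(int *pipe_fd, const char *path, int flags, mode_t mode)
{
    int moved = renumber_pipe(*pipe_fd);
    if (moved < 0)
        return -1;
    *pipe_fd = moved;
    return open(path, flags, mode);
}
```

The point of passing the current number as the minimum is that the
pipe never ends up below the fd it just vacated, so the fd that open
returns afterwards is the one it would have returned had the pipe
never existed.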

Before I go trying to implement this, anyone see problems with it?
Other comments?

