Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 3 Jul 2016 09:58:47 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: abort() fails to terminate PID 1 process

On Sun, Jul 03, 2016 at 12:43:59PM +0200, Igmar Palsenberg wrote:
> 
> > > That rule doesn't apply to pid 1 by default. Pid 1 should be a proper init 
> > > system, not a full blows application that makes the system blow up on 
> > > every error.
> > 
> > abort is specified to terminate the process no matter what.
> 
> Yes. But like mentioned : pid 1 is an exception to this.
> 
> > For it to
> > ever be able to return is a serious bug since both the compiler and
> > the programmer can assume any code after abort() is unreachable.
> 
> This specific case talked about pid 1. pid 1 has kernel protection, normal 
> userspace processes don't. In that case, the normal assumptions don't hold 
> up.

Whether you realize it or not, what you're saying is equivalent to
saying that it's UB for a process that runs as pid 1 to call abort().
There is no basis for such a claim.

A vague "pid 1 is special" rule (which the standard does not support
except in a few very specific places where an implementation-defined
set of processes are permitted to be treated in specific special ways)
does not imply "calling a function whose behavior is well-defined can
legitimately lead to runaway code execution if the pid is 1".

> > At
> > present musl avoids this worst-case failure (wrongfully returning)
> > with an infinite loop, but that's just a fail-safe. The intent is that
> > it terminate, and in particular, terminate abnormally as specified,
> > which we don't do enough to guarantee (SIGKILL is not "abnormal"
> > termination). So there's definitely work to be done to fix this. It's
> > an issue I've been aware of for a long time but the kernel makes it
> > painful to reliably produce abnormal termination without race
> > conditions.
> 
> Can this even be reproduced under normal circumstances (aka : not pid 1) ? 
> If thes, then I agree : It's a bug. If no : Then not. If people have a 
> broken container init system, then it breaks and they keep the pieces.

Yes.

> > > Well, normally abort() does some signal magic, and then raises again. 
> > > Which is what POSIX mandates I think.
> > 
> > To make this work reliably I think we need to make abort() take a lock
> > the precludes further calls to sigaction prior to re-raising SIGABRT
> > and resetting the disposition. But there are all sorts of
> > complications to deal with. For example if another thread performs
> > posix_spawn for fork and exec concurrent with abort() munging the
> > disposition of SIGABRT, the child process could start with the wrong
> > disposition for SIGABRT, which would be non-conforming. Finding ways
> > to fix all places where the wrong behavior may be observable is a
> > nontrivial problem.
> 
> Does the whole guaranteed termination also includes threaded programs ?

Of course. The fact that you're asking such basic questions tells me
that you're bikeshedding this based on negative opinions of certain
container usage cases and not offering constructive input based on
what the specification actually requires.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.