Date: Thu, 23 Apr 2015 05:55:06 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: setenv if value=NULL, what say standard? Bug?

On Thu, Apr 23, 2015 at 10:05:01AM +0200, Jens Gustedt wrote:
> Hello
> 
> Am Mittwoch, den 22.04.2015, 22:15 -0400 schrieb Rich Felker:
> > On Wed, Apr 22, 2015 at 09:26:57PM -0400, Jean-Marc Pigeon wrote:
> > > > I think the only safe conclusion is that the application is
> > > > incorrect and should ensure that setenv() is never called with a
> > > > NULL value.
> > > > 
> > > I checked glibc. My understanding is that it sets something like
> > > "name="
> > > in the environment, so the variable is present but its
> > > value is "empty" (it is up to the application to decide what to do).
> > > uclibc does something similar (as far as I can tell from looking
> > > at the source code).
> > > 
> > > 
> > > The application is not careful enough, but not incorrect as such.
> > 
> > It's definitely incorrect. It's doing something that invokes undefined
> > behavior.
> 
> You probably mean it *has* undefined behavior. UB is nothing that can
> be invoked.

This is rather standard wording, and is meant to emphasize that all
the rules of undefined behavior come into play (i.e. are invoked).

> Yes, it actually has two bugs in the case that was the starting point
> of this thread. It has two calls to libc functions where it doesn't
> check the return values.

And this failure to check the return value was exactly my point in the
text I cited about why it's not a good idea to return an error on UB:
programs sufficiently buggy to be doing things like this in practice
almost never check return values.

> > > If this situation is indeed UB, there are 2 options for musl:
> > > 1) Swallow the problem nicely... as glibc and uclibc do.
> > > 2) Report an error.. EINVAL? (and document it in manual)
> > > 
> > > Crashing at "libc" level is not an option.
> > 
> > I can see how it might seem like that at first, but crashing is
> > actually the best possible behavior. Options 1 and 2 cover up a
> > potentially serious bug -- it's not clear what the application was
> > trying to do, most likely nobody even thought about what they were
> > trying to do, and even if they did have something in mind it's not
> > reliable or portable. The glibc wiki has some text taken from text I
> > wrote on the topic (copied from a stack overflow answer I gave) here:
> > 
> > https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
> > 
> > Specifically it covers why returning an error is not a good idea.
> 
> I see your point, but I would go a bit more moderately and more
> pragmatically with it.
> 
> First of all UB is what it is, a specific standard (here POSIX)
> doesn't impose any form of behavior. So an implementation may extend
> the behavior as it pleases. (Otherwise these standards have a saying
> "it is unspecified whether ... or not ...")
> 
> Now, failing early is certainly a good property when we can expect
> just that; any application that uses the call in that way *will* in
> fact fail early. This is particularly important in code that otherwise
> will have a performance penalty for doing checks.

The goal is not to avoid a performance penalty but to avoid
propagating an error to a point where it's potentially hard to
diagnose or worse.

> Another acceptable strategy, IMPOV, is to forward errors where this is
> easy to do and the check doesn't impose an unacceptable penalty. The
> application then can handle the error (or not).
> 
> setenv is certainly borderline. Code for which it is performance
> critical is almost certainly broken in many ways, and on the other
> hand failures in the way we have seen here can be rare and late.

What should happen, though? Should it set TZ to blank? Unset it? Leave
it alone? If the program proceeds with a behavior contrary to the
programmer's intent, it's going to do the wrong thing.
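
To make the three candidate behaviors concrete, here is a minimal
sketch of a caller-side helper that forces the program to state its
intent instead of handing NULL to setenv(). The names null_policy and
set_or_policy are hypothetical, made up for illustration; they are not
part of musl or any other libc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: what the caller wants when it has no value to set. */
enum null_policy {
    NULL_MEANS_EMPTY,  /* set the variable to the empty string */
    NULL_MEANS_UNSET,  /* remove the variable from the environment */
    NULL_MEANS_KEEP    /* leave whatever is there untouched */
};

/* Hypothetical wrapper: never passes a NULL value on to setenv(). */
static int set_or_policy(const char *name, const char *value,
                         enum null_policy policy)
{
    if (value)
        return setenv(name, value, 1);

    switch (policy) {
    case NULL_MEANS_EMPTY:
        return setenv(name, "", 1);
    case NULL_MEANS_UNSET:
        return unsetenv(name);
    case NULL_MEANS_KEEP:
        return 0;
    }
    return -1; /* unknown policy */
}
```

The point of the sketch is that only the application can pick among
these; a libc that silently picks one on its behalf is guessing.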

> So I would have a small preference for being nice: do nothing to the
> environment and return an error.

I have not read the full hwclock code in question but I suspect that
the third option, "leave it alone" (and fail with EINVAL) might cause
hwclock to treat the RTC-stored time incorrectly (e.g. as localtime
rather than UTC) which would be rather bad. In other situations it
might allow a dangerous env var that was intended to be cleared to
pass through.
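
As a rough illustration of that last case, the sketch below models a
libc that takes the "fail with EINVAL" route, together with a caller
that does not check the result. Both setenv_reporting and
clear_unchecked are hypothetical names made up for this example; the
real setenv() is only ever called here with a non-NULL value.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of "do nothing and return an error":
 * reject a NULL value with EINVAL and leave the environment alone. */
static int setenv_reporting(const char *name, const char *value,
                            int overwrite)
{
    if (!value) {
        errno = EINVAL;
        return -1;
    }
    return setenv(name, value, overwrite);
}

/* Hypothetical buggy caller: it means to neutralize a variable, but
 * the replacement it computed is NULL and the result is ignored. */
static const char *clear_unchecked(const char *name,
                                   const char *replacement)
{
    (void)setenv_reporting(name, replacement, 1);
    return getenv(name); /* still the old, possibly dangerous value */
}
```

With the variable previously set, clear_unchecked(name, NULL) returns
the old value unchanged, and nothing tells the program that its
attempt to clear it failed.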

An additional danger of assigning a behavior to cases like this is
that, if a future version of the standard mandates a particular
behavior where the behavior was previously undefined, the
implementation becomes invalid and has to change, breaking programs
that were depending upon the old behavior.

Rich
