Date: Wed, 12 Sep 2018 14:03:10 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: string-backed FILEs mess

On Wed, Sep 12, 2018 at 07:41:12PM +0200, Markus Wichmann wrote:
> On Wed, Sep 12, 2018 at 11:43:06AM -0400, Rich Felker wrote:
> > On Wed, Sep 12, 2018 at 05:09:41PM +0200, Markus Wichmann wrote:
> > > Well, first of all, I might set my foot wrong here very badly, but I
> > > generally don't care about C standard UB as long as the behavior is
> > > defined elsewhere.
> > 
> > Like where? In order for it to be defined, the *compiler* has to
> > define it, since otherwise it can make transformations that assume the
> > behavior is undefined. So what you're asking for here basically
> > amounts to only supporting certain compilers (with certain flags),
> > and notably *not supporting* UBSan, which is a really valuable tool
> > for catching bugs.
> 
> Oh, I didn't think of that. But the compiler still has to follow the
> ABI, and the ABI says we have linear addresses.

The ABI defines an interface boundary. These transformations do not
take place at or across boundaries but in contexts where no boundary
is present and the ABI does not apply. The possibility of them is the
only reason a tool like UBSan or _FORTIFY_SOURCE can work; otherwise
what these tools do would be invalid, arbitrarily breaking
well-defined code because they think it's bad style rather than
justifiably changing the behavior of cases where the behavior is not
defined.
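
To make the kind of transformation concrete, here is a minimal sketch
using signed overflow, which is not the case under discussion in this
thread but is the easiest one to show. The helper name add_checked is
hypothetical. A test written as "a + b < a" relies on the overflow
having already happened, so a compiler may assume it never does and
fold the test away entirely inside the function, with no ABI boundary
involved. Checking before the arithmetic stays within defined
behavior:

```c
#include <limits.h>

/* Hypothetical helper (not from musl): stores a+b in *out and returns 1
 * if the sum is representable as int, otherwise returns 0 and leaves
 * *out untouched. */
static int add_checked(int a, int b, int *out)
{
    /* A test such as `a + b < a` would perform the overflowing addition
     * first, which is undefined, so the compiler may remove it. These
     * tests never overflow themselves. */
    if (b > 0 && a > INT_MAX - b) return 0;
    if (b < 0 && a < INT_MIN - b) return 0;
    *out = a + b;
    return 1;
}
```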

> So the pointer to
> integer mapping still has to work, and (void*)-1 is defined in the SysV
> ABI. Wouldn't make much sense for DOS, but hey, that's not a supported
> platform. (Actually that's a bad example, because it would totally make
> sense as the far pointer to FFFF:FFFF, but you get my point.)

Creating a pointer like (void*)-1 is implementation- (platform ABI-)
defined, but that doesn't mean you can perform arbitrary operations on
it and have the results be meaningful or even defined. In particular,
the -, <, <=, >, >= operators are only defined for a pair of pointers
into (or one past the end of) the same array object. Since (void*)-1
is not a pointer into any array object, comparing or subtracting it is
not defined.
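
As a rough illustration (a sketch only, not the actual musl code; the
names space_left and bounded_copy are made up for this example), the
difference looks like this. Subtracting two pointers into the same
buffer is fine, and a limit can be carried as a size so that no
sentinel pointer such as (void*)-1 ever has to be compared against:

```c
#include <stddef.h>
#include <string.h>

/* Defined: pos and end both point into, or one past the end of, the
 * same array, and the caller guarantees pos <= end. */
static size_t space_left(const char *pos, const char *end)
{
    return (size_t)(end - pos);
}

/* Hypothetical helper: copy at most `avail` bytes from src to dst and
 * return how many were copied. The bound is a size rather than an end
 * pointer, so an "unbounded" destination can be expressed with a large
 * size instead of a fake pointer. */
static size_t bounded_copy(char *dst, size_t avail, const char *src, size_t n)
{
    size_t k = n < avail ? n : avail;
    memcpy(dst, src, k);
    return k;
}
```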

> Besides, you're opening a very scary door there: The C standard's
> chapter 7 contains a whole lot of UB in the library, and a compiler
> writer could now say: Since it is undefined, obviously it is never going
> to happen (and if it does, it is your own fault), so I can write the
> optimizer to assume all arguments to functions are such that UB does not
> occur. The standard says fflush() is only defined for output streams, so
> we're going to assume any stream passed into fflush() is an output
> stream and... I don't know, assume all input functions are going to fail
> until the next fseek()? Actually, I'm drawing a blank as to what they
> could do with this, but the GCC folks would find a way to mess with my
> code.

That's exactly why we have to use -ffreestanding, which says we want
to use the compiler as a freestanding C implementation that does not
include the standard library functions and corresponding assumptions
about their behavior. Without that, for example, the code in calloc
that only zero-fills the buffer produced by malloc if it's not already
zero will be optimized out, since the inspection of uninitialized
memory from malloc is undefined. Likewise the implementation of memcpy
could be optimized to a call to itself (infinite recursion).
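
For reference, the calloc pattern I mean looks roughly like the sketch
below. This is not the real musl source. To keep the example
well-defined on its own, malloc is replaced by a hypothetical bump
allocator (raw_alloc) over a static arena, and the names sketch_calloc,
arena and arena_used are likewise made up. In libc the backing
allocator is malloc itself, which is where the compiler's built-in
knowledge of malloc comes into play:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for malloc: a bump allocator over a static
 * arena, so reading the returned bytes is well-defined here. */
static unsigned char arena[256];
static size_t arena_used;

static void *raw_alloc(size_t n)
{
    if (n > sizeof arena - arena_used) return 0;
    void *p = arena + arena_used;
    arena_used += n;
    return p;
}

static void *sketch_calloc(size_t m, size_t n)
{
    /* Reject requests whose total size would overflow size_t. */
    if (n && m > SIZE_MAX / n) return 0;
    size_t len = m * n;
    unsigned char *p = raw_alloc(len);
    if (!p) return 0;
    /* Store only to bytes that are not already zero, so memory that is
     * already zero is never written to. */
    for (size_t i = 0; i < len; i++)
        if (p[i]) p[i] = 0;
    return p;
}
```

With the real malloc underneath and a hosted compiler, the read of
p[i] is exactly the inspection that can be assumed away.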

> As for UBSan: Can't these sanitizers get their fingers out of the system
> implementation?

If you use UBSan for an application, it's of course not going to do
anything to code in libc. However you can also use it when building
libc, in order to find dangerous bugs. See this recent GCC bug report
where, if not for a bug in UBSan, it would have caught a serious,
dangerous regression in musl:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87191

(Thankfully it was caught manually before release.)

Rich