musl - Open conformance issues & plans

Message-ID: <20180823204124.GM1878@brightrain.aerifal.cx>
Date: Thu, 23 Aug 2018 16:41:24 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Open conformance issues & plans

(Found by Adelie Linux's efforts to meet POSIX conformance)

1. O_SEARCH/O_EXEC issues

Linux does not actually implement these at all; we approximate them
with O_PATH. They actually should be redefined as O_PATH|3 so that we
can distinguish them from O_PATH, because there's at least one
important difference: with O_SEARCH or O_EXEC, O_NOFOLLOW is supposed
to cause failure rather than producing an fd for the symlink like
O_PATH does. This issue is one I've known about for a long time, not
from the Adelie testing.

One issue that did arise from testing is that open needs to fail when
the file lacks +x permission if O_SEARCH or O_EXEC is used. I'm not
sure how to achieve this; access checks against the wrong IDs (real
rather than effective) and stat can't reflect ACLs or anything like
that.

Also reported was that fdopendir needs to fail if the fd was opened
with O_SEARCH rather than O_RDONLY. It should be possible to make
fdopendir probe this with fcntl but I haven't tested.
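
A minimal sketch of the fcntl probe mentioned above, assuming O_SEARCH
is still approximated by plain O_PATH so that the O_PATH bit in the
F_GETFL result is what identifies such an fd. The helper names
fd_is_path_only and probe_dot are hypothetical, and the fallback O_PATH
value is an assumption that only holds on common Linux architectures.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000 /* assumption: value on most Linux archs */
#endif

/* Hypothetical helper: returns 1 if fd was opened with O_PATH,
 * 0 if it was not, and -1 if fcntl itself fails (e.g. bad fd). */
static int fd_is_path_only(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0) return -1;
    return (fl & O_PATH) != 0;
}

/* Hypothetical helper for trying the probe: opens "." as a
 * directory with the given extra flags and reports the result. */
static int probe_dot(int flags)
{
    int fd = open(".", flags | O_DIRECTORY);
    int r;
    if (fd < 0) return -1;
    r = fd_is_path_only(fd);
    close(fd);
    return r;
}
```

An fdopendir built along these lines could fail (presumably with
EBADF) whenever the probe reports 1, before allocating the DIR.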

2. O_TTY_INIT

Also a known issue (that it's missing). We can't define this because
Linux failed to reserve a value for it. Not sure what to do.

3. fnmatch and glob corner cases

fnmatch spuriously succeeds when there's an escape character (\) at
the end of the pattern. This probably should be an error.
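
One way to detect the case is to count the backslashes at the end of
the pattern: an odd count means the last one escapes nothing. This is
only a sketch, not the fnmatch code; has_trailing_escape is a
hypothetical name, and the check only applies when FNM_NOESCAPE is not
in effect.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the pattern ends in a backslash
 * that is not itself escaped, 0 otherwise. */
static int has_trailing_escape(const char *pat)
{
    size_t n = strlen(pat), k = 0;
    /* Count the run of backslashes at the very end. */
    while (k < n && pat[n - 1 - k] == '\\') k++;
    /* Pairs of backslashes are escaped literals; an odd one dangles. */
    return k % 2 == 1;
}
```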

glob wrongly handles unreadable-but-searchable directory components. I
don't yet understand what it does vs what it's supposed to do.
Reported as:

    glob("unreadable_but_searchable_dir/a", GLOB_ERR, errfunc, pglob)
    returns GLOB_NOMATCH and calls errfunc (it should do neither of
    these things) [Kernel?]

4. regcomp

Several wrong error cases I don't yet understand, reported as:

    1. regcomp(preg, "xyab\\{3,\\}jk\\{", 0) (unbalanced \{\}) returns
    REG_BADBR instead of REG_EBRACE or REG_BADPAT
    2. regcomp(preg, "^?xyz", REG_EXTENDED) (? not preceded by valid
    regex) succeeds instead of returning REG_BADRPT or REG_BADPAT
    3. regcomp(preg, "[][.-.]-0]", <any>) returns REG_ECOLLATE instead
    of succeeding (] represents itself in a bracket expression when it
    appears as the first character)
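
To reproduce reports like these, a small wrapper that returns the raw
regcomp result is enough. The sketch below uses only the standard
<regex.h> interface; regcomp_code is a hypothetical name, and which
codes come back for the three patterns above depends on the libc under
test.

```c
#include <regex.h>

/* Hypothetical helper: compiles pat with the given cflags and returns
 * regcomp's result (0 on success, otherwise a REG_* error code). */
static int regcomp_code(const char *pat, int cflags)
{
    regex_t re;
    int r = regcomp(&re, pat, cflags);
    /* Only a successfully compiled regex may be passed to regfree. */
    if (r == 0) regfree(&re);
    return r;
}
```

For example, regcomp_code("^?xyz", REG_EXTENDED) can be compared
against REG_BADRPT and REG_BADPAT for the second reported case.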

5. psiginfo

psiginfo wrongly affects the wide/byte orientation of stderr. It needs
to take the stdio lock itself so it can save and restore the
orientation around the call to fprintf.

6. fileno & non-fd FILEs

fileno is reportedly returning 0 for memory streams. This seems
implausible (they all set f->fd=-1) but it definitely is failing to
set errno to EBADF when f->fd is negative, which it's required to do
for FILEs without an underlying fd.
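
The required behavior reduces to the check below. This is a sketch on
a bare fd value, not musl's fileno (which works on the FILE and its
lock); fileno_for_fd is a hypothetical name.

```c
#include <errno.h>

/* Hypothetical helper: maps a stream's stored fd to the value fileno
 * should return. A negative fd means the stream has no underlying
 * file descriptor, so fail with EBADF. */
static int fileno_for_fd(int fd)
{
    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    return fd;
}
```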

7. fmemopen & open_[w]memstream

fmemopen mode w+ reportedly doesn't truncate the buffer.

open_[w]memstream don't pre-set the stream orientation to byte/wide as
they're supposed to (this is a stupid requirement; conceptually
there's no reason you couldn't have a wide memstream being written via
byte operations, or vice versa, but it's a requirement anyway...).

I think there are other known conformance problems here and in
open_[w]memstream that weren't reported.

8. Linux EISDIR bugs

Linux wrongly fails open with EISDIR when the pathname passed to it
ends in a / but the last component before the / is not a directory.
This affects open and fopen, maybe other things too. Not sure how to
work around it in libc.

9. freopen

Supposedly freopen has to assign fds as if it first closes the old fd
then opens (assigning lowest-free) the new file. If true this makes it
largely useless; right now musl is intentionally preserving the old fd
so that it can be used for replacing the standard streams. This needs
clarification from the Austin Group I think.

10. getdelim

The text of the standard seems to allow malloc/realloc only when the
buffer passed in is not already sufficiently large to hold the result.
The current loop logic we use will force resizing one byte early in
most cases, but can't be trivially changed not to do this without
creating overflows in certain cases (depending on buffering). I will
revisit this after the next release and refactor the loop, but it will
need careful attention to ensure we don't introduce new
bugs/overflows.

11. rename and ./..

Linux wrongly accepts . and .. as the final component to rename. Maybe
we can just work around this by checking the pathname strings.
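
A string-level check could look like the sketch below. It assumes that
trailing slashes should be ignored when finding the final component;
final_component_is_dot is a hypothetical name, and a rename wrapper
would presumably fail with EINVAL when it returns 1 for either
argument.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the final pathname component,
 * ignoring trailing slashes, is "." or "..", and 0 otherwise. */
static int final_component_is_dot(const char *path)
{
    size_t end = strlen(path), start, len;

    /* Drop trailing slashes so "a/../" is treated like "a/..". */
    while (end > 0 && path[end - 1] == '/') end--;

    /* Walk back to the start of the last component. */
    start = end;
    while (start > 0 && path[start - 1] != '/') start--;

    len = end - start;
    if (len == 1) return path[start] == '.';
    if (len == 2) return path[start] == '.' && path[start + 1] == '.';
    return 0;
}
```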

12. strtof/d/ld and ERANGE

Apparently they don't always set ERANGE on underflow like they're
supposed to. Need to investigate whether we're trying and failing or
what.
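
A quick way to probe individual inputs is to clear errno, convert, and
look at errno afterwards. The sketch below does this for strtod only;
strtod_sets_erange is a hypothetical name, and the same pattern would
apply to strtof and strtold.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if converting s with strtod leaves
 * errno set to ERANGE, 0 otherwise. */
static int strtod_sets_erange(const char *s)
{
    volatile double v; /* volatile so the call is not optimized out */
    errno = 0;
    v = strtod(s, 0);
    (void)v;
    return errno == ERANGE;
}
```

Feeding it values that underflow, such as "1e-400", shows whether a
given libc reports ERANGE in the cases at issue here.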

13. wordexp issues

Reported as:

    1. When WRDE_SHOWERR flag is not set, output is still written to
    stderr
    2. '|', '&', ';', '<', '>', '{ }', and '( )' are accepted in
    wordexp input instead of returning WRDE_BADCHAR [CVE?]
    3. WRDE_UNDEF is ignored [CVE?]
    4. wordexp("`for i in \ndone`", ...) succeeded instead of
    returning WRDE_SYNTAX

I don't think any of the [CVE?] issues are security-relevant; they're
all in the case where WRDE_NOCMD was omitted, in which case command
execution is assumed to be possible.

At some point wordexp probably needs a fairly significant overhaul so
I'm not too keen on spending time on the individual issues here, but
will look at fixes anyone wants to propose.

14. abort

The abort function needs to cause process termination as if by SIGABRT
in the case where a SIGABRT handler was installed but returns. Linux
provides no easy mechanism to do this, and we probably need to
emulate it in userspace by preventing reinstallation of a SIGABRT
handler after the first raise(SIGABRT) in abort() returns. I have an
idea for a design but it's a fair bit of work, and I'll probably
return to it after release.



There are also several math issues and small details I didn't mention
which came up in the Adelie testing, which I've omitted here because
this is getting too long.

Rich