Date: Tue, 2 Jan 2018 14:58:25 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: a third bug in musl clone()

On Tue, Jan 02, 2018 at 10:24:42AM -0800, John Reiser wrote:
> In addition to the bugs in __clone for i386 with %gs and stack alignment
> in the new thread, there is a third bug in musl's implementation of clone().
> clone() takes optional arguments that need not be present,
> yet musl/src/linux/clone.c fetches them anyway.  This can cause SIGSEGV.
> 
> ===== musl/src/linux/clone.c excerpts
> int clone(int (*func)(void *), void *stack, int flags, void *arg, ...)
> {
>    [[snip]]
>         va_start(ap, arg);
>         ptid = va_arg(ap, pid_t *);
>         tls  = va_arg(ap, void *);
>         ctid = va_arg(ap, pid_t *);
>         va_end(ap);
> 
>         return __syscall_ret(__clone(func, stack, flags, arg, ptid, tls, ctid));
> }
> =====
> The presence of ptid, tls, and ctid is indicated by bits in 'flags':
> CLONE_PARENT_SETTID, CLONE_SETTLS, CLONE_CHILD_SETTID/CLONE_CHILD_CLEARTID.
> If none of those bits are set, then it could be that none of the variable
> arguments are present; therefore none of them should be fetched, and 0 (NULL)
> should be passed to __clone() for each of ptid, tls, ctid.
> [The meaning is unclear if any omitted argument is followed by an argument
> that is flagged as present.  Should the implementation call the corresponding
> va_arg(), or skip over it?]
> 
> How SIGSEGV can be generated: It is valid for &arg to be the address of
> the last word on a hardware page: 0x...ffc on a 32-bit CPU with 4KiB pages,
> with the following page unmapped.  &func would be 16-byte aligned at 0x...ff0.
> Any one of the va_arg() calls would attempt to fetch from the next
> page at address 0x...000 or greater, which will generate SIGSEGV.

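This is undefined behavior and should be corrected, but I disagree
with your claimed mechanism of failure. clone() is necessarily called
by some other function, which has its own call frame, so the three
slots read by va_arg() still lie in mapped stack memory below the
caller's own return address. That yields a worst case of: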

0. clone's return address (aligned -4 mod 16)
4. func
8. stack
12. flags
16. arg
20. ??
24. ??
28. ??
32. caller's return address (aligned -4 mod 16)

In any case it should be fixed by checking flags.
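
As a rough illustration of "checking flags", here is a minimal sketch
in C. It is not the actual musl fix. The helper name
fetch_clone_opt_args and the struct clone_opt_args are hypothetical,
and the policy for an omitted argument followed by a present one
(treat the arguments as positional, and fetch only up to the last one
that flags marks as present) is an assumption, since the quoted message
notes that case is unclear.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

/* Fallback values from the Linux UAPI headers, in case <sched.h>
 * did not expose them. */
#ifndef CLONE_SETTLS
#define CLONE_SETTLS 0x00080000
#endif
#ifndef CLONE_PARENT_SETTID
#define CLONE_PARENT_SETTID 0x00100000
#endif
#ifndef CLONE_CHILD_CLEARTID
#define CLONE_CHILD_CLEARTID 0x00200000
#endif
#ifndef CLONE_CHILD_SETTID
#define CLONE_CHILD_SETTID 0x01000000
#endif

/* Hypothetical container for the three optional clone() arguments. */
struct clone_opt_args {
    pid_t *ptid;
    void *tls;
    pid_t *ctid;
};

/* Hypothetical helper: read only the variadic arguments that 'flags'
 * says the caller supplied. Assumed policy: arguments are positional,
 * so an earlier slot is skipped over (and reported as NULL) when only
 * a later one is flagged. */
static struct clone_opt_args fetch_clone_opt_args(int flags, ...)
{
    struct clone_opt_args a = { NULL, NULL, NULL };
    int want_ptid = !!(flags & CLONE_PARENT_SETTID);
    int want_tls  = !!(flags & CLONE_SETTLS);
    int want_ctid = !!(flags & (CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID));
    va_list ap;

    va_start(ap, flags);
    if (want_ptid || want_tls || want_ctid) {
        pid_t *p = va_arg(ap, pid_t *);
        if (want_ptid) a.ptid = p;
    }
    if (want_tls || want_ctid) {
        void *t = va_arg(ap, void *);
        if (want_tls) a.tls = t;
    }
    if (want_ctid)
        a.ctid = va_arg(ap, pid_t *);
    va_end(ap);
    return a;
}
```

With flags == 0 the helper never calls va_arg(), so nothing beyond the
named parameters is read. In a real clone() the same logic would sit
between va_start(ap, arg) and the call to __clone().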

As an aside, I don't think there are any valid (supportable) ways to
use the variadic args anyway (at least the tls slot is not usable;
maybe the tids are?), so it's possible that we should just remove
support for them instead. Code that wants to do really wacky things
here needs to be written in 100% asm anyway, since calls to C from a
pseudo-thread created by manual calls to clone will violate the ABI
(invalid or duplicate thread pointer). And if you're writing asm for
something special, you should just use SYS_clone yourself...
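
For reference, a minimal sketch of calling SYS_clone directly, limited
to the one case that is safe from C: fork-like flags with no new stack
and no shared memory, so the child gets its own copy of the address
space. The function name run_in_raw_clone_child is hypothetical. The
raw argument order after flags differs between architectures, which is
harmless here only because every remaining argument is zero; anything
involving CLONE_VM or a new stack would still need asm, as described
above.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a fork-like child with a raw SYS_clone,
 * have it exit with 'code', and return the exit status seen by the
 * parent (or -1 on failure). */
static int run_in_raw_clone_child(int code)
{
    int status;
    /* flags = SIGCHLD only: no CLONE_VM, no new stack, no tids, no tls. */
    long pid = syscall(SYS_clone, (long)SIGCHLD, 0L, 0L, 0L, 0L);

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: leave immediately */

    if (waitpid((pid_t)pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```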

Rich
