Date: Wed, 3 Feb 2021 16:01:51 -0500
From: Rich Felker <dalias@...c.org>
To: Dominic Chen <d.c.ddcc@...il.com>
Cc: fweimer@...hat.com, musl@...ts.openwall.com
Subject: Re: Incorrect thread TID caching

On Wed, Feb 03, 2021 at 03:21:06PM -0500, Dominic Chen wrote:
> 
> On 2/3/2021 2:16 AM, Florian Weimer wrote:
> >If you use the clone system call wrapper in threading (not fork/vfork)
> >mode, you cannot call any libc functions afterwards, including the
> >syscall function.  Instead, you have to issue direct system calls.
> On 2/3/2021 2:21 PM, Rich Felker wrote:
> >Unfortunately it's really underdocumented and underexplored what a
> >child created with clone() can do. There are definitely limitations --
> >for example any usage with CLONE_VM or CLONE_THREAD is restricted not
> >to call into libc at all, and might not even be safe whatsoever.
> >However basic usage comparable in semantics to _Fork is probably
> >supposed to work at least as well as _Fork -- in particular calling
> >AS-safe libc functions should work.
> 
> I wasn't aware of this behavior, and didn't see any documentation
> about this for the glibc clone() wrapper either. This seems to be a
> big footgun, and after looking through the history for this code in
> Chrome, it looks like they had a similar issue with glibc too.

Yes, that's what I mean by underdocumented.

> >BTW does Chrom{e,ium} itself do something with raw clone? If so this
> >could be a source of some of the bugs users hit, and it would be great
> >to get a clearer picture on what's happening.
> 
> The code in question is a unit test for the sandbox, which manually
> calls clone with CLONE_NEWPID to fork a child in a PID namespace,
> then installs a signal handler and checks that it receives SIGTERM
> correctly: https://source.chromium.org/chromium/chromium/src/+/master:sandbox/linux/services/namespace_sandbox_unittest.cc;l=194
> But under musl, raise() uses the cached TID value, so the test
> eventually times out.

OK, raise should probably just be changed here to work even in a
vforked child, since it seems plausible someone will use it there.
It's not like saving the syscall actually matters here. But that's
independent of the clone() issue.
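
As a rough illustration of that idea (not musl's actual raise(), and not a proposed patch), the sketch below asks the kernel for the TID on every call instead of reading a cached copy. The names raise_uncached, on_signal and run_child_demo are hypothetical. It uses plain fork() rather than clone() with CLONE_NEWPID so that it runs without privileges, which means it only shows that the uncached variant delivers the signal, not the original failure. A real implementation would likely also need to block signals around the two syscalls; that is omitted here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: look up the TID with a fresh gettid syscall on
 * every call, so a stale cached value cannot misdirect the signal. */
static int raise_uncached(int sig)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    return (int)syscall(SYS_tkill, tid, sig);
}

static volatile sig_atomic_t got_signal;

/* Records which signal arrived. */
static void on_signal(int sig)
{
    got_signal = sig;
}

/* Hypothetical demo: the child installs a SIGTERM handler, signals
 * itself, and exits 0 only if the handler ran. Returns the child's
 * exit status, or -1 if fork/wait failed. */
static int run_child_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa = {0};
        sa.sa_handler = on_signal;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGTERM, &sa, NULL) != 0)
            _exit(2);
        if (raise_uncached(SIGTERM) != 0)
            _exit(3);
        _exit(got_signal == SIGTERM ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_child_demo() from a test harness and checking for a zero return is enough to exercise it. The extra gettid syscall per call is the cost referred to above.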

> I missed that the NamespaceSandbox::ForkInNewPidNamespace() function
> does manually update the cached TID for glibc after calling the
> ForkWithFlags wrapper, so I can just do the same for musl too.

This isn't valid; the location of the cached TID is not part of the
ABI. You could very well end up clobbering a pointer or something
unrelated. The issue should just be fixed on the musl side.

Rich
