Date: Wed, 26 Jun 2019 14:43:31 +0100
From: Radostin Stoyanov <rstoyanov1@...il.com>
To: musl@...ts.openwall.com, Rich Felker <dalias@...ifal.cx>, nsz@...t70.net
Cc: "Andrei Vagin (C)" <avagin@...il.com>, gorcunov@...il.com
Subject: Re: Re: seccomp causes pthread_join() to hang

On 26/06/2019 12:25, Szabolcs Nagy wrote:
> * Radostin Stoyanov <rstoyanov1@...il.com> [2019-06-26 08:30:34 +0100]:
>> On 26/06/2019 00:26, Rich Felker wrote:
>>>    Any configuration
>>> that results in a thread being terminated out from under the process
>>> has all sorts of extremely dangerous conditions with memory/locks
>>> being left in inconsistent state, tid reuse while the application
>>> thinks the old thread is still alive, etc., and fundamentally can't be
>>> supported. What you're seeing is exposure of a serious existing
>>> problem with this seccomp usage, not a regression.
>> I wrote "Regression: Yes" because this bug was recently introduced and it
>> does not occur in previous versions.
>>
>> IMHO causing pthread_join() to hang when a thread has been terminated is not
>> expected behaviour, at least because the man page for pthread_join(3)
>> states:
> the point is that killing a thread that has used *any* libc api,
> or that was created through a libc api, fundamentally breaks
> assumptions the c runtime may rely on, and thus *any* libc call
> after the kill is undefined.
>
> so it's not just pthread_join that's broken but *everything*.
>
> this affects glibc too and old musl too, even if you may only
> observe the particular pthread_join problem with a current musl.
>
> if the killed thread was in a signal handler that interrupted
> arbitrary libc operation then it obviously breaks everything,
> but even without that the libc will hold onto thread specific
> internal state and whenever that is used it can cause problems
> (in case of musl it is used in pthread_join, glibc uses it e.g.
> for set*id operations)
Thank you for the explanation, Rich and Szabolcs!

The test case we have for CRIU essentially loads a seccomp filter,
performs a checkpoint/restore and then verifies that the seccomp
filter was restored.

Assuming that the behaviour of pthread_join is undefined when the thread
has been terminated by seccomp, we can refactor the test case to work
around the issue.
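
One possible shape for such a workaround, as a minimal sketch and not
necessarily the change that ends up in the CRIU test: trigger the filter
in a forked child process instead of a thread, and observe the result
with waitpid() rather than pthread_join(). The parent's libc state is
then never shared with the killed task. The helper names
install_filter() and run_filtered_child() and the choice of getppid as
the filtered syscall are assumptions made for illustration; the filter
also skips the architecture check that a production filter would need.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Hypothetical helper: install a filter that kills the calling thread on
 * getppid(2) and allows every other syscall. Returns 0 on success, -1 on
 * error. No seccomp_data.arch check is done here (illustration only). */
static int install_filter(void)
{
    struct sock_filter insns[] = {
        /* Load the syscall number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* getppid: fall through to KILL; anything else: skip to ALLOW. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    /* Required so that an unprivileged task may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical helper: fork a child, install the filter there, issue the
 * filtered syscall, and return the child's wait status (-1 on failure). */
int run_filtered_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_filter() < 0)
            _exit(2);           /* filter could not be installed */
        syscall(SYS_getppid);   /* expected to be fatal */
        _exit(1);               /* reached only if the filter did not fire */
    }
    /* Reap the child, retrying if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

A test built on this would call run_filtered_child() and check that
WIFSIGNALED(status) is true and WTERMSIG(status) is SIGSYS. Because the
child is single-threaded, killing its only thread ends the whole process,
so nothing in the parent has to join a thread that was terminated behind
libc's back.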

Radostin
