Date: Sat, 9 Feb 2019 17:21:02 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Cc: Alexey Izbyshev <izbyshev@...ras.ru>
Subject: Re: __synccall: deadlock and reliance on racy /proc/self/task

* Rich Felker <dalias@...c.org> [2019-02-08 13:33:57 -0500]:
> On Fri, Feb 08, 2019 at 09:14:48PM +0300, Alexey Izbyshev wrote:
> > On 2/7/19 9:36 PM, Rich Felker wrote:
> > >>Of course, the larger problem remains: if we may miss some threads
> > >>because of /proc, we may fail to call the setuid() syscall in those
> > >>threads. And that indeed easily happens in my second test
> > >>(attached: test-setuid-mismatch.c; expected to be run as a suid
> > >>binary; note that I tested both with and without "presignalling").
> > >
> > >Does it work if we force two iterations of the readdir loop with no
> > >tasks missed, rather than just one, to catch the case of missed
> > >concurrent additions? I'm not sure. But all this makes me really
> > >uncomfortable with the current approach.
> > 
> > I've tested with 0, 1, 2 and 3 retries of the main loop if miss_cnt
> > == 0. The test eventually failed in all cases, with 0 retries
> > requiring only a handful of iterations, 1 -- on the order of 100, 2
> > -- on the order of 10000 and 3 -- on the order of 100000.
> 
> Do you have a theory on the mechanism of failure here? I'm guessing
> it's something like this: there's a thread that goes unseen in the
> first round, and during the second round, it creates a new thread and
> exits itself. The exit gets seen (again, it doesn't show up in the
> dirents) but the new thread it created still doesn't. Is that right?
> 
> In any case, it looks like the whole mechanism we're using is
> unreliable, so something needs to be done. My leaning is to go with
> the global thread list and atomicity of list-unlock with exit.

Yes, that sounds possible. I added some instrumentation to musl,
and the traces show situations like that right before the deadlock;
exiting threads can even cause old (previously seen) entries to
disappear from the dir.
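
For reference, the instrumentation is just a one-timestamped-line-per-event
tracer along these lines (a simplified sketch, not the exact patch; the
helper name is made up and timezone handling is omitted):

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* one line per event, emitted with a single write() so lines from
   different threads don't interleave; snprintf is not formally
   async-signal-safe, but it suffices for this kind of debugging */
static void trace(const char *msg)
{
	char buf[256];
	struct timespec ts;
	clock_gettime(CLOCK_REALTIME, &ts);
	int n = snprintf(buf, sizeof buf, "%02lld:%02lld:%02lld.%09ld [%ld] %s\n",
		(long long)(ts.tv_sec/3600%24), (long long)(ts.tv_sec/60%60),
		(long long)(ts.tv_sec%60), ts.tv_nsec,
		(long)syscall(SYS_gettid), msg);
	if (n > 0) write(2, buf, n);
}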

In the traces below, tm1 is libc.threads_minus_1, tt is target_tid,
r==3 is ESRCH, and r==110 is ETIMEDOUT. Negative tt values are the
target tid with bit 31 (the futex waiters flag) set while the caller
is blocked waiting on it; the decoded tid is shown in parentheses.
My comments are indented:

trace 1:

16:01:45.473004716 [10825] __pthread_exit tm1 2 -> 1
16:01:45.473007835 [10827] start tm1 1 pid 2922 tid 10827
	current threads: 2922, 10827, (10825 is exiting)
	new iter
16:01:45.473012387 [2922] __synccall begin tm1 1
16:01:45.473022059 [2922] __synccall begin dir
16:01:45.473030024 [2922] __synccall tgkill 10825 r 3 tm1 1
16:01:45.473033342 [2922] __synccall end dir cnt 1 miss 0 r 3 tt 10825 (10825) tm1 1
	bad: dir had 10825, but not 10827 (which already made a write syscall before signals were sent)
16:01:45.473037483 [2922] __synccall end loop tm1 1 timedout 0 head 0
16:01:45.473040958 [2922] __synccall single_threaded tm1 1 head 0x7ffff7f19310
16:01:45.473040736 [10827] handler new head 0x7ffff7f19310 tt 10825 (10825) chain node: tid 10827 target_sem 0 0 caller_sem 0 0
	bad: late signal
16:01:45.473046943 [2922] __synccall finish chain node tid 10827 target_sem 0 0 caller_sem 0 0
	new iter
16:01:45.473063328 [2922] __synccall begin tm1 1
16:01:45.473067031 [2922] __synccall begin dir
16:01:45.473073274 [2922] __synccall tgkill 10827 r 0 tm1 1
16:01:45.483124243 [2922] __synccall end dir cnt 1 miss 1 r 110 tt -2147472821 (10827) tm1 1
16:01:45.483137678 [2922] __synccall begin dir
16:01:45.483148355 [2922] __synccall tgkill 10827 r 0 tm1 1
16:01:45.493243161 [2922] __synccall end dir cnt 1 miss 1 r 110 tt -2147472821 (10827) tm1 1
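
For context, the handler side of the handshake that the trace lines
describe looks roughly like this. This is an abridged paraphrase of
src/thread/synccall.c from memory, not a verbatim excerpt; gcc __sync
builtins stand in for musl's atomics and the futex wake of the caller
is elided:

#include <errno.h>
#include <semaphore.h>
#include <unistd.h>
#include <sys/syscall.h>

/* stand-ins for musl's a_cas/a_cas_p atomics */
#define a_cas(p, t, s) __sync_val_compare_and_swap((p), (t), (s))
#define a_cas_p(p, t, s) __sync_val_compare_and_swap((p), (t), (s))

struct chain {
	struct chain *next;
	int tid;
	sem_t target_sem, caller_sem;
};

static struct chain *volatile head;
static volatile int target_tid;
static void (*callback)(void *), *context;

static void handler(int sig)
{
	struct chain ch;
	int old_errno = errno;

	sem_init(&ch.target_sem, 0, 0);
	sem_init(&ch.caller_sem, 0, 0);
	ch.tid = syscall(SYS_gettid);

	/* push onto the list the caller walks ("handler new head") */
	do ch.next = head;
	while (a_cas_p(&head, ch.next, &ch) != ch.next);

	/* if the caller is futex-waiting for this tid (bit 31 set in tt),
	   clear the flag; the futex wake of the caller is elided here */
	a_cas(&target_tid, ch.tid, 0);

	sem_wait(&ch.target_sem);  /* first post comes from "call" */
	callback(context);
	sem_post(&ch.caller_sem);  /* lets the caller's "call" proceed */
	sem_wait(&ch.target_sem);  /* second post comes from "finish"; a
	                              thread signalled too late only ever
	                              gets one post and deadlocks here */
	errno = old_errno;
}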

trace 2:

15:09:57.926599655 [19672] __pthread_create tm1 2 -> 3 flags 0x5d0f00 detached 
	printed right before __clone, current threads: 19667, 19670, 19672
	new iter:
15:09:57.926603357 [19667] __synccall begin tm1 3
15:09:57.926610882 [19667] __synccall begin dir
15:09:57.926618254 [19667] __synccall tgkill 19670 r 0 tm1 3
15:09:57.926617745 [19672] handler new head 0x7ffff7f5f310 tt 19670 (19670) chain node: tid 19672 target_sem 0 0 caller_sem 0 0
15:09:57.926589725 [19670] handler done
15:09:57.926633870 [19670] handler new head 0x7ffff7f3bcd0 tt -2147463981 (19667) chain node: tid 19670 target_sem 0 0 caller_sem 0 0
15:09:57.926637148 [19667] __synccall already caught tid 19672 tm1 3
15:09:57.926642781 [19667] __synccall end dir cnt 2 miss 0 r 0 tt 19672 (19672) tm1 3
	ok: dir had 19670, 19672 (new child of 19672 may not be created yet)
15:09:57.926649375 [19667] __synccall end loop tm1 3 timedout 0 head 0x7ffff7f3bcd0
15:09:57.926653791 [19667] __synccall call chain node tid 19670 target_sem -1 1 caller_sem 0 0
15:09:57.926668175 [19667] __synccall call chain node tid 19672 target_sem -1 1 caller_sem 0 0
15:09:57.926683042 [19667] __synccall single_threaded tm1 3 head 0x7ffff7f3bcd0
15:09:57.926688375 [19667] __synccall finish chain node tid 19670 target_sem -1 1 caller_sem 0 0
15:09:57.926695762 [19667] __synccall finish chain node tid 19672 target_sem -1 1 caller_sem 0 0
15:09:57.926698084 [19670] handler done
15:09:57.926476014 [19670] __pthread_create clone ret 19672
	non-monotonic timestamp: the instrumentation was preempted between reading the clock and writing the line out
15:09:57.926707218 [19670] __pthread_exit tm1 3 -> 2
	new iter:
15:09:57.926721373 [19667] __synccall begin tm1 2
15:09:57.926727938 [19667] __synccall begin dir
15:09:57.926738572 [19667] __synccall tgkill 19670 r 3 tm1 2
15:09:57.926743677 [19667] __synccall end dir cnt 1 miss 0 r 3 tt 19670 (19670) tm1 2
	bad: dir had 19670 (which exited) but 19672 is missing ???
15:09:57.926748951 [19672] handler done
15:09:57.926749685 [19667] __synccall end loop tm1 2 timedout 0 head 0
15:09:57.926756402 [19667] __synccall single_threaded tm1 2 head 0
15:09:57.926757551 [19672] handler new head 0x7ffff7f5f310 tt 19670 (19670) chain node: tid 19672 target_sem 0 0 caller_sem 0 0
	bad: late signal
15:09:57.926761019 [19667] __synccall finish chain node tid 19672 target_sem 0 0 caller_sem 0 0
	bad: the handler got only the one sem_post from the finish above, but it needs two, so it will block in its second sem_wait forever
	new iter:
15:09:57.926781060 [19667] __synccall begin tm1 2
15:09:57.926786600 [19667] __synccall begin dir
15:09:57.926793026 [19667] __synccall tgkill 19672 r 0 tm1 2
15:09:57.936881342 [19667] __synccall end dir cnt 1 miss 1 r 110 tt -2147463976 (19672) tm1 2
	now the dir has 19672, but it's already blocked in the handler, so re-signaling it has no effect and the wait times out...
15:09:57.936902178 [19667] __synccall begin dir
15:09:57.936921680 [19667] __synccall tgkill 19672 r 0 tm1 2
15:09:57.947024323 [19667] __synccall end dir cnt 1 miss 1 r 110 tt -2147463976 (19672) tm1 2
15:09:57.947045505 [19667] __synccall begin dir
15:09:57.947064716 [19667] __synccall tgkill 19672 r 0 tm1 2
15:09:57.957161709 [19667] __synccall end dir cnt 1 miss 1 r 110 tt -2147463976 (19672) tm1 2
15:09:57.957196705 [19667] __synccall end loop tm1 2 timedout 3 head 0
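
The caller side, equally abridged and reusing the declarations above,
shows where the r values and the cnt/miss counts in the trace come
from. The real code reads the dir with raw syscalls and waits on a PI
futex tied to target_tid; here a semaphore (which the handler would
post where it clears target_tid) and a placeholder signal number stand
in:

#include <dirent.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <time.h>
/* plus the includes from the handler sketch above */

#define SIGSYNCCALL SIGRTMIN  /* placeholder; musl reserves an internal RT signal */

static sem_t caught_sem;  /* stand-in for the futex wait on target_tid */

static int already_caught(int tid)
{
	struct chain *cp;
	for (cp = head; cp; cp = cp->next)
		if (cp->tid == tid) return 1;
	return 0;
}

static void scan_loop(int pid, int self)
{
	/* repeat whole passes until one pass misses nothing; this is the
	   loop Alexey retried 0-3 extra times above, and the traces show
	   why no fixed retry count can be enough */
	for (;;) {
		int cnt = 0, miss_cnt = 0, r = 0;
		DIR *dir = opendir("/proc/self/task");  /* "begin dir" */
		struct dirent *de;
		if (!dir) return;
		while ((de = readdir(dir))) {
			int tid = atoi(de->d_name);
			if (tid <= 0 || tid == self || already_caught(tid))
				continue;
			target_tid = tid;
			if (syscall(SYS_tgkill, pid, tid, SIGSYNCCALL)) {
				r = errno;  /* ESRCH (r 3): it already exited */
				cnt++;
				continue;
			}
			/* wait briefly for its handler to check in; ETIMEDOUT
			   (r 110) in the trace is this wait expiring */
			struct timespec ts;
			clock_gettime(CLOCK_REALTIME, &ts);
			ts.tv_nsec += 10000000;  /* ~10ms, made-up timeout */
			if (ts.tv_nsec >= 1000000000) { ts.tv_nsec -= 1000000000; ts.tv_sec++; }
			if (sem_timedwait(&caught_sem, &ts)) { r = errno; miss_cnt++; }
			else cnt++;
		}
		closedir(dir);  /* "end dir cnt C miss M r R tt T" */
		if (!miss_cnt) break;  /* a clean pass: assume everyone is caught */
	}
}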
