Date: Mon, 27 May 2024 16:56:06 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Cc: IMMING <2465853002@...com>
Subject: Re: A question about the implementation of pthread_create and
 start

On Mon, May 27, 2024 at 08:22:42PM +0800, IMMING wrote:
> Hi,&nbsp;I would like to ask a question about the implementation of pthread_create and&nbsp;start (musl v1.2.5)
>

Bloody hell, please fix your mail client settings to stop emitting these
HTML entities in the plain text! Our MUAs are grown up, they can handle
non-ASCII if it's properly declared.

> My question is, if SYS_sched_setscheduler returns an error (a non-zero
> value), the parent thread will remain in a wait state and I have not
> found a way to wake it, which will cause the parent thread to remain
> stuck in the pthread_create function and unable to return
>
> 1.Is my analysis process correct?

No. The call to SYS_set_tid_address sets the child's clear_child_tid
address to &a->control, just as the CLONE_CHILD_CLEARTID flag would have
done at clone time. This means that as soon as the child exits, the
kernel will set that address to 0 and perform a futex wake on it. This
is the same mechanism normally used for the thread list on exit.

So the parent thread will be woken up as soon as the child thread
exits. The reason for doing it this way is that the parent thread gets
to clean up the child's memory by unmapping it. The only other way to achieve the
same thing would be to detach the child thread and let it clean up
itself. But this has the problem that we would need to publish the child
thread in the list, and it would be visible to the other threads, even
though it is at that point doomed to exit.

So it is just cleaner for the parent to clean up, since the parent needs
the cleanup code anyway in case the __clone() fails.

> 2.Is the situation where the parent thread gets stuck in the wait as
> expected?

It is not expected that the parent gets stuck indefinitely. Once the
failure of sched_setscheduler has been communicated, it is expected that
the child should exit quickly, and that's when the parent is woken up.
Or have you identified a case where the parent does get stuck forever?

Ciao,
Markus
