Date: Thu, 5 Oct 2023 20:34:03 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: Hung processes with althttpd web server

On Thu, Oct 05, 2023 at 08:39:03AM -0400, Rich Felker wrote:
> > On Wed, Oct 04, 2023 at 09:41:41PM -0400, Carl Chave wrote:
> > > futex(0x7f5bdcd77900, FUTEX_WAIT_PRIVATE, 4294967295, NULL
> It would still be
> interesting to know which lock is being hit here, since for the most
> part, locks are skipped in single-threaded processes.

The only hints we have right now are the futex address and value. The
address looks like it is in some mmap()ed memory, but that could be
anything. The value is more interesting, because 4294967295 is
0xffffffff, which tells us that the object is set to 0xffffffff when it
is taken by a thread and a single waiter is present. The only
synchronization object I could find that does that is pthread_rwlock_t.
sem_t used to do the same in older musl versions, but it no longer does
since the wait flag overhaul, and that was last year.
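
To make the reading of that value concrete, here is a minimal sketch of
how I would decode it. It assumes the encoding as I understand it from
reading musl's rwlock code: the top bit is a "waiters present" flag and
the low 31 bits are either a reader count or 0x7fffffff for "write
locked". The struct and function names are made up for illustration and
do not exist in musl.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding, not taken from a musl header. */
#define RW_WAITER_FLAG  0x80000000u  /* top bit: someone is waiting */
#define RW_WRITE_LOCKED 0x7fffffffu  /* low 31 bits all set: write lock */

/* Hypothetical helper type, for illustration only. */
struct rw_state {
    bool waiters_flag;
    bool write_locked;
    uint32_t readers;   /* meaningful only if !write_locked */
};

/* Split a raw futex word into the fields assumed above. */
static struct rw_state decode_rw_value(uint32_t v)
{
    struct rw_state s;
    uint32_t count = v & RW_WRITE_LOCKED;   /* strip the waiter flag */

    s.waiters_flag = (v & RW_WAITER_FLAG) != 0;
    s.write_locked = (count == RW_WRITE_LOCKED);
    s.readers = s.write_locked ? 0 : count;
    return s;
}
```

Feeding it the value from the strace line, decode_rw_value(4294967295u),
gives "write locked, waiter flag set" under these assumptions, which is
why a write-locked rwlock is my guess.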

I know that musl has some internal rwlocks, but the only one I could
find that would plausibly be locked in write mode is the lock in
dynlink.c, which is wrlocked in __libc_exit_fini(), among other places.
The signal handler in althttpd also calls exit(), which may reach that
code. So the signal might have hit while the process was already trying
to exit, leading to a re-entrant call to exit(), and therefore to a
deadlock on a lock the process itself already holds. That is possible,
but the window during which the lock is held in __libc_exit_fini() is so
small that this theory seems like a reach.
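
The re-entrancy scenario can be sketched without any musl internals.
The following is only a simulation under my own assumptions: lock_held
stands in for the write lock, and the handler merely records that it
found the lock taken instead of calling exit() and blocking for real.
All names are hypothetical. It relies on raise() in a single-threaded
program not returning until the handler has run.

```c
#include <signal.h>

/* Stand-in for the write lock taken on the exit path. */
static volatile sig_atomic_t lock_held;
/* Set when the handler runs while the lock is already held. */
static volatile sig_atomic_t would_deadlock;

static void on_signal(int sig)
{
    (void)sig;
    /* A handler that called exit() here would try to take the same
     * lock again and wait forever on itself. We only record it. */
    if (lock_held)
        would_deadlock = 1;
}

/* Deliver a signal either inside or outside the critical section and
 * report whether the handler found the lock already held. */
static int simulate_signal(int inside_critical_section)
{
    would_deadlock = 0;
    signal(SIGUSR1, on_signal);

    if (!inside_critical_section)
        raise(SIGUSR1);        /* signal arrives before the lock is taken */

    lock_held = 1;             /* exit path takes the lock */
    if (inside_critical_section)
        raise(SIGUSR1);        /* signal lands while the lock is held */
    lock_held = 0;             /* exit path releases the lock */

    signal(SIGUSR1, SIG_DFL);
    return would_deadlock;
}
```

Only simulate_signal(1) reports the problem, which matches the point
above: the signal has to arrive inside a very short window.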

There is also __inhibit_ptc(), but I could find nothing that would call
that in a single-threaded process, let alone twice.

Ciao,
Markus
