Date: Fri, 30 Oct 2020 00:32:52 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Milan P. Stanić <mps@...anta.net>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH v2] MT fork

* Milan P. Stanić <mps@...anta.net> [2020-10-30 00:00:07 +0100]:
> On Thu, 2020-10-29 at 23:21, Szabolcs Nagy wrote:
> > * Milan P. Stanić <mps@...anta.net> [2020-10-29 21:55:41 +0100]:
> > > On Thu, 2020-10-29 at 17:13, Szabolcs Nagy wrote:
> > > > * Milan P. Stanić <mps@...anta.net> [2020-10-29 00:06:10 +0100]:
> > > > >  
> > > > > Applied this patch on top of current musl master, built it on Alpine and
> > > > > installed it.
> > > > > 
> > > > > Tested by building ruby lang. Works fine.
> > > > > Also tested building zig lang, works fine.
> > > > > crystal lang also builds fine, but running it hangs. strace shows:
> > > > > -------------
> > > > > [pid  5573] futex(0x7efc50fba9e4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> > > > > [pid  5568] futex(0x7efc5118f984, FUTEX_REQUEUE_PRIVATE, 0, 1, 0x7efc514b67a4) = 1
> > > > > [pid  5568] futex(0x7efc514b67a4, FUTEX_WAKE_PRIVATE, 1) = 1
> > > > > [pid  5571] <... futex resumed>)        = 0
> > > > > [pid  5568] futex(0x7efc511099e4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> > > > > [pid  5571] futex(0x7efc510409e4, FUTEX_WAIT_PRIVATE, 2, NULL
> > > > > -------------
> > > > > where it hangs.
> > > > 
> > > > try to attach gdb to the process that hangs and run
> > > > 
> > > > thread apply all bt
...
> I did continue now and stopped it when it hangs. Attached is gdb log
> produced by 'thread apply all bt'

unfortunately it's hard to tell what's going on.
all threads wait on the same cond var in libgc.
but it does not look like a fork issue to me.

to get further i think we would need to look at the
libgc logic, to see how it uses those cond vars.
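
for orientation when reading that code, here is a generic sketch of a
predicate-guarded condition variable wait. this is not libgc's code:
sk_gate, sk_gate_wait, sk_gate_open and sk_waiter are hypothetical names
made up for illustration. the point is only that a waiter stays parked in
pthread_cond_wait (which in musl is pthread_cond_timedwait with ts=0, as in
frame #3 of the backtraces) until some other thread sets the predicate and
signals. if every thread ends up on the waiting side, nothing is left to do
that.

```c
#include <pthread.h>
#include <stdbool.h>

/* hypothetical gate: a flag protected by a mutex, plus a cond var */
struct sk_gate {
    pthread_mutex_t m;
    pthread_cond_t c;
    bool ready;
};

void sk_gate_init(struct sk_gate *g)
{
    pthread_mutex_init(&g->m, NULL);
    pthread_cond_init(&g->c, NULL);
    g->ready = false;
}

/* block until another thread opens the gate; the loop re-checks the
   predicate, so spurious wakeups and early opens are both handled */
void sk_gate_wait(struct sk_gate *g)
{
    pthread_mutex_lock(&g->m);
    while (!g->ready)
        pthread_cond_wait(&g->c, &g->m);
    pthread_mutex_unlock(&g->m);
}

/* set the predicate under the lock, then wake all waiters */
void sk_gate_open(struct sk_gate *g)
{
    pthread_mutex_lock(&g->m);
    g->ready = true;
    pthread_cond_broadcast(&g->c);
    pthread_mutex_unlock(&g->m);
}

/* thread entry point that just waits on the gate */
void *sk_waiter(void *arg)
{
    sk_gate_wait(arg);
    return NULL;
}
```

when reading libgc, the things to look for are the equivalents of the
predicate and of sk_gate_open: which thread is expected to signal, and under
what condition.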

all 16 threads look like this:

#0  __syscall_cp_c (nr=202, u=140737284532708, v=128, w=2, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007ffff7fb8937 in __futex4_cp (to=0x0, val=2, op=128, addr=0x7ffff3d9e9e4) at src/thread/__timedwait.c:52
#2  __timedwait_cp (addr=addr@entry=0x7ffff3d9e9e4, val=val@entry=2, clk=clk@entry=0, at=at@entry=0x0, priv=128, priv@entry=1) at src/thread/__timedwait.c:52
#3  0x00007ffff7fb9740 in __pthread_cond_timedwait (c=0x7ffff403fba0, m=0x7ffff403f7a0, ts=0x0) at src/thread/pthread_cond_timedwait.c:100
#4  0x00007ffff40206fd in GC_wait_marker () from /usr/lib/libgc.so.1
#5  0x00007ffff40188b9 in GC_help_marker () from /usr/lib/libgc.so.1
#6  0x00007ffff40206db in GC_mark_thread () from /usr/lib/libgc.so.1
...

except the main thread, which looks like this:

#0  __syscall_cp_c (nr=202, u=140737488349316, v=128, w=2, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007ffff7fb8937 in __futex4_cp (to=0x0, val=2, op=128, addr=0x7fffffffe884) at src/thread/__timedwait.c:52
#2  __timedwait_cp (addr=addr@entry=0x7fffffffe884, val=val@entry=2, clk=clk@entry=0, at=at@entry=0x0, priv=128, priv@entry=1) at src/thread/__timedwait.c:52
#3  0x00007ffff7fb9740 in __pthread_cond_timedwait (c=0x7ffff403fba0, m=0x7ffff403f7a0, ts=0x0) at src/thread/pthread_cond_timedwait.c:100
#4  0x00007ffff40206fd in GC_wait_marker () from /usr/lib/libgc.so.1
#5  0x00007ffff4018852 in GC_do_parallel_mark () from /usr/lib/libgc.so.1
#6  0x00007ffff401937c in GC_mark_some () from /usr/lib/libgc.so.1
...
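
for reading frame #0 and #1 above: on x86_64 syscall nr=202 is futex, and
op=128 is FUTEX_WAIT with the private flag set, so each thread is in a plain
futex wait with val=2 and no timeout (to=0x0). a small sketch of the
decoding, with the constants copied from <linux/futex.h> under local sk_
names (the helper names are made up for illustration):

```c
#include <string.h>

/* values as defined in the linux uapi header <linux/futex.h>,
   redefined under local names so this builds without kernel headers */
#define SK_FUTEX_WAIT           0
#define SK_FUTEX_WAKE           1
#define SK_FUTEX_REQUEUE        3
#define SK_FUTEX_PRIVATE_FLAG   128
#define SK_FUTEX_CLOCK_REALTIME 256
#define SK_FUTEX_CMD_MASK       (~(SK_FUTEX_PRIVATE_FLAG | SK_FUTEX_CLOCK_REALTIME))

/* name of the futex command encoded in op, ignoring the flag bits;
   only the commands seen in this thread are covered */
const char *sk_futex_cmd_name(int op)
{
    switch (op & SK_FUTEX_CMD_MASK) {
    case SK_FUTEX_WAIT:    return "FUTEX_WAIT";
    case SK_FUTEX_WAKE:    return "FUTEX_WAKE";
    case SK_FUTEX_REQUEUE: return "FUTEX_REQUEUE";
    default:               return "other";
    }
}

/* nonzero if op carries FUTEX_PRIVATE_FLAG */
int sk_futex_is_private(int op)
{
    return (op & SK_FUTEX_PRIVATE_FLAG) != 0;
}
```

with these, sk_futex_cmd_name(128) gives "FUTEX_WAIT" and
sk_futex_is_private(128) is nonzero, which matches the FUTEX_WAIT_PRIVATE
calls in the strace output.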
