musl - Re: SIGSEGV/stack overflow in pthread_create

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <AB450E42-985F-4F8C-923A-E3910B37905F@plan44.ch>
Date: Fri, 13 Sep 2024 21:26:27 +0200
From: Lukas Zeller <luz@...n44.ch>
To: alice <alice@...ya.dev>,
 musl@...ts.openwall.com
Subject: Re: SIGSEGV/stack overflow in pthread_create - race condition?

Hi Alice,

> On 13 Sep 2024, at 17:34, alice <alice@...ya.dev> wrote:
> 
> something that is probably leading to confusion here is the infinite backtrace
> in clone.

Confusing indeed ;-)

> iirc this was related to the lack of cfi directives on arm for gdb to
> unwind correctly? or old debugger, it's been a long time..

Oh, thanks - so that might just be a gdb display artifact?
I am using the toolset of OpenWrt 22.03, which is gcc-11.2.0. So not really old, but definitely not latest.

> the actual code does not loop there forever;

Not forever, it obviously eventually exits from clone() into the thread function pkb_run_blocker() via the start() wrapper.
Which then hits the stack limit. 

> all the actual stack use is in the
> frames above in the application. it's just an unwinder shortcoming.

Maybe, but I'm not sure, see below

> something you can check in this case (from memory with lldb on a coredump or the
> application when it crashes):
> 
> f 0 #crashed frame
> reg read $sp #stack pointer
> f 7 #clone entry
> reg read $sp #stack pointer
> 
> and then subtract the two values you get for $sp. you'll see something like
> '131000' or bigger; and that means the code overflowed the 128k default musl
> stack size.

Ok, here's what I got:

> (gdb) f 0
> #0  0xb6ec42f0 in pk_parse_kite_request ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> (gdb) info reg $sp
> sp             0xb6adbbd0          0xb6adbbd0
> (gdb) f 7
> #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd60          0xb6adfd60

0xb6adfd60 - 0xb6adbbd0 = 0x4190 = 16784

So that child thread has put only 16k on the stack between starting and when it crashes.


Interestingly, however, the previous (bogus?) stack frames each seem to have consumed 16 bytes.
This is what I would expect from "stmfd sp!,{r4,r5,r6,r7}" on line 7 of clone.s

> (gdb) f 8
> #8  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd70          0xb6adfd70
> (gdb) f 9
> #9  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd80          0xb6adfd80
> (gdb) 
> 
> ...
> 
> (gdb) f 1000
> #1000 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6ae3b70          0xb6ae3b70
> (gdb) 

0xb6ae3b70 - 0xb6adfd80 = 0x3DF0 -> ~16k consumed by 1000 "frames"

Unless the shown $sp values are gdb display artifacts as well, these "iterations" DO consume stack space and very much look like a real recursion happening within __clone().

The only path I can imagine would be start() in pthread_create():194 causing __clone() somehow to get called again on the same thread instead of calling the actual thread function passed to pthread_create (pagekite's pkb_run_blocker). This would push those 4 regs / 16 bytes on the stack.

Why the clone syscall then would not create a new thread, but still return 0 indicating a new child, I have no idea. I do not understand the preamble to calling the thread function in start(), but can see that delicate things happen there with a_cas(), __wait() and SYS_set_tid_address.

Does this make any sense to you?

One piece of context that might be relevant: I found libpagekite calls that pthread_create() 16 times (MAX_BLOCKING_THREADS) in a row in a quite tight loop:

> int pkb_start_blockers(struct pk_manager *pkm, int n)
> {
>   int i;
>   for (i = 0; i < MAX_BLOCKING_THREADS && n > 0; i++) {
>     if (pkm->blocking_threads[i] == NULL) {
>       pkm->blocking_threads[i] = malloc(sizeof(struct pk_blocker));
>       pkm->blocking_threads[i]->manager = pkm;
>       if (0 > pthread_create(&(pkm->blocking_threads[i]->thread), NULL,
>                              pkb_run_blocker,
>                              (void *) pkm->blocking_threads[i])) {
>         pk_log(PK_LOG_MANAGER_ERROR, "Failed to start blocking thread.");
>         free(pkm->blocking_threads[i]);
>         pkm->blocking_threads[i] = NULL;
>         return (pk_error = ERR_NO_THREAD);
>       }
>       n--;
>     }
>     else {
>       pk_log(PK_LOG_MANAGER_ERROR, "Blocking thread %d already started?", i);
>     }
>   }
>   return 0;
> }

PS: having gdb doing "f 1000" took a minute, "f 2000" about 10 and "f 4000" took an hour...

--
Lukas Zeller, plan44.ch
luz@...n44.ch - https://plan44.ch
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.