Date: Fri, 7 Dec 2018 18:50:40 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: aio_cancel segmentation fault for in progress write requests

On Fri, Dec 07, 2018 at 04:51:03PM -0600, A. Wilcox wrote:
> On 12/07/18 14:35, Markus Wichmann wrote:
> > On Fri, Dec 07, 2018 at 01:13:44PM -0600, A. Wilcox wrote:
> >> So, my best theory is that running inside a debugger (gdb, valgrind)
> >> makes it slow enough that it no longer races.
> >
> > Two ideas to investigate further. 1: Produce a coredump ("ulimit -c
> > unlimited"). That won't interfere with timing, but I have no clue if
> > coredumps work with multithreading.
>
> Core was generated by `./aioWrite '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 __cp_end () at src/thread/powerpc64/syscall_cp.s:32
> 32 src/thread/powerpc64/syscall_cp.s: No such file or directory.
> [Current thread is 1 (LWP 5507)]
> (gdb) bt
> #0 __cp_end () at src/thread/powerpc64/syscall_cp.s:32
> #1 0x00003fffa768f2a4 in __syscall_cp_c (nr=180, u=512512, v=0, w=0,
> x=0, y=0, z=0) at src/thread/pthread_cancel.c:35
> #2 0x00003fffa768e008 in __syscall_cp (nr=<optimized out>, u=<optimized
> out>, v=<optimized out>, w=<optimized out>, x=<optimized out>,
> y=<optimized out>, z=<optimized out>) at src/thread/__syscall_cp.c:20
> #3 0x00003fffa76969f4 in pwrite (fd=<optimized out>, buf=<optimized
> out>, size=<optimized out>, ofs=<optimized out>) at src/unistd/pwrite.c:7
> #4 0x00003fffa763eddc in io_thread_func (ctx=<optimized out>) at
> src/aio/aio.c:240
> #5 0x00003fffa768f76c in start (p=0x3fffa76e8af8) at
> src/thread/pthread_create.c:147
> #6 0x00003fffa769b608 in __clone () at src/thread/powerpc64/clone.s:43
> (gdb) thread 2
> [Switching to thread 2 (LWP 5506)]
> #0 0x00003fffa7637144 in __syscall4 (d=0, c=-1, b=128, a=512, n=221) at
> ./arch/powerpc64/syscall_arch.h:54
> 54 ./arch/powerpc64/syscall_arch.h: No such file or directory.
> (gdb) bt
> #0 0x00003fffa7637144 in __syscall4 (d=0, c=-1, b=128, a=512, n=221) at
> ./arch/powerpc64/syscall_arch.h:54
> #1 __wait (addr=0x200, waiters=0x0, val=<optimized out>,
> priv=<optimized out>) at src/thread/__wait.c:13
> #2 0x00003fffa763f07c in aio_cancel (fd=<optimized out>,
> cb=0x3fffffafd2b8) at src/aio/aio.c:356
> #3 0x000000012034c044 in main ()
>
>
> 221 is SYS_futex. Wow, that looks wrong.

I don't think thread 2 (odd numbering; it looks like the main thread)
is relevant to the crash; it has already proceeded past whatever was
happening when thread 1 (the io thread) started crashing.

I'm guessing it is stack overflow. Can you dump the registers (to see
the stack pointer value) and info about memory ranges? That should
show how much space is left on the stack at the point of crash. If the
crash is the signal handler trying to run, there will probably be some
space left but less than the size of a signal frame, and the kernel
will probably refrain from moving the stack pointer to include the
signal frame.

Rich
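
The stack-space check described above is simple arithmetic once the two values are read out of the core dump. The following is only a rough sketch of that comparison, not anything from musl or the kernel: the function names are made up for illustration, `sp` is assumed to be the crashing thread's stack pointer (r1 on powerpc64, as shown by gdb's "info registers"), `map_low` is assumed to be the lowest address of the mapping that holds that thread's stack, and `frame_size` is a caller-supplied value because the actual signal frame size is not stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bytes left between the stack pointer and the low
 * end of the stack mapping. The stack grows downward, so this is the
 * room still available. Returns 0 if sp is at or below map_low. */
uint64_t stack_headroom(uint64_t sp, uint64_t map_low)
{
    return sp > map_low ? sp - map_low : 0;
}

/* Hypothetical helper: true if a frame of frame_size bytes would not fit
 * in the remaining stack. frame_size is an assumption supplied by the
 * caller; it is not a value taken from the kernel or from musl. */
bool frame_would_overflow(uint64_t sp, uint64_t map_low, uint64_t frame_size)
{
    return stack_headroom(sp, map_low) < frame_size;
}
```

If the headroom comes out small but nonzero, and smaller than whatever the signal frame needs, that would be consistent with the guess that the crash happens when the signal handler tries to run.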