musl - Re: aio_cancel segmentation fault for in progress write requests


Message-ID: <20181207200704.GG23599@brightrain.aerifal.cx>
Date: Fri, 7 Dec 2018 15:07:04 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: aio_cancel segmentation fault for in progress write requests

On Fri, Dec 07, 2018 at 01:05:53PM -0600, A. Wilcox wrote:
> On 12/07/18 12:26, Rich Felker wrote:
> > On Fri, Dec 07, 2018 at 11:31:01AM -0600, A. Wilcox wrote:
> >> awilcox on gwyn [pts/7 Fri 7 11:29] ~: ./aioWrite
> >> zsh: segmentation fault  ./aioWrite
> >>
> >> (gdb) run
> >> Starting program: /home/awilcox/aioWrite
> >> [New LWP 60165]
> >> [LWP 60165 exited]
> >> aio_write/1-1.c cancelationStatus : 2
> >> Test PASSED
> >> [Inferior 1 (process 60162) exited normally]
> >> (gdb) quit
> >>
> > I don't think so. I'm concerned that it's a stack overflow, and that
> > somehow the kernel folks have managed to break the MINSIGSTKSZ ABI.
> > AIO threads use a PTHREAD_STACK_MIN-sized stack with no guard page
> > (because they don't run any application code, just a tiny stub
> > function) but this could overflow in kernelspace (and either crash or
> > clobber memory depending on memory layout and presence/absence of
> > ASLR) if the kernel is making a signal frame that's too big. Note that
> > this would have to be nearly twice MINSIGSTKSZ (on x86 at least) due
> > to rounding up to whole pages, so if the kernel is misbehaving here
> > it's *badly* misbehaving...
> 
> Note how for me, it runs correctly in gdb, but not bare.  I can
> reproduce this behaviour in valgrind, too:
> 
> awilcox on gwyn [pts/7 Fri 7 13:03] ~: valgrind ./aioWrite
> ==47650== Memcheck, a memory error detector
> ==47650== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==47650== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==47650== Command: ./aioWrite
> ==47650==
> --47650-- WARNING: unhandled ppc64be-linux syscall: 208
> --47650-- You may be able to write your own handler.
> --47650-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
> --47650-- Nevertheless we consider this a bug.  Please report
> --47650-- it at http://valgrind.org/support/bug_reports.html.
> aio_write/1-1.c cancelationStatus : 2
> Test PASSED
> ==47650==
> ==47650== HEAP SUMMARY:
> ==47650==     in use at exit: 7,574 bytes in 5 blocks
> ==47650==   total heap usage: 6 allocs, 1 frees, 7,694 bytes allocated
> ==47650==
> ==47650== LEAK SUMMARY:
> ==47650==    definitely lost: 0 bytes in 0 blocks
> ==47650==    indirectly lost: 0 bytes in 0 blocks
> ==47650==      possibly lost: 0 bytes in 0 blocks
> ==47650==    still reachable: 7,168 bytes in 4 blocks
> ==47650==         suppressed: 406 bytes in 1 blocks
> ==47650== Rerun with --leak-check=full to see details of leaked memory
> ==47650==
> ==47650== For counts of detected and suppressed errors, rerun with: -v
> ==47650== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> 
> 
> (syscall 208 is tkill)

It runs OK for you under valgrind? It was misbehaving for me (crashing
with static linking, hanging with dynamic linking), and that's what
suggested a stack overflow to me, since valgrind likely uses a lot of
stack for its emulation. This was on x86_64.

Rich
