Date: Tue, 31 Jan 2017 13:18:50 -0800
From: Eric Hassold <hassold@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Fix pthread_create on some devices failing to initialize
 guard area


On 1/30/17 7:58 PM, Rich Felker wrote:
> On Fri, Jan 20, 2017 at 02:42:01PM -0800, Eric Hassold wrote:
>> ----
>> [PATCH] set up guard protection after stack is mapped
>>
>> calling mprotect beyond the guard page of a PROT_NONE mmap-ed stack
>> fails on some devices with a buggy kernel, making it impossible to
>> create a thread. if this condition is met, fall back to allocating
>> memory for the stack and guard first, with read and write permission,
>> then protect the guard area.
>> ---
>>   src/thread/pthread_create.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/thread/pthread_create.c b/src/thread/pthread_create.c
>> index 49f2f72..d3c030b 100644
>> --- a/src/thread/pthread_create.c
>> +++ b/src/thread/pthread_create.c
>> @@ -242,7 +242,17 @@ int __pthread_create(pthread_t *restrict res, const pthread_attr_t *restrict att
>>               if (__mprotect(map+guard, size-guard, PROT_READ|PROT_WRITE)
>>                   && errno != ENOSYS) {
>>                   __munmap(map, size);
>> -                goto fail;
>> +                if (errno == EINVAL) {
> In principle __munmap could have clobbered errno at this point. I
> don't think it will but to be safe errno should probably be saved.
>
>> +                    map = __mmap(0, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
>> +                    if (map == MAP_FAILED) goto fail;
>> +                    if (__mprotect(map, guard, PROT_NONE)
>> +                        && errno != ENOSYS) {
>> +                        __munmap(map, size);
>> +                        goto fail;
>> +                    }
>> +                } else {
>> +                    goto fail;
>> +                }
> Can you try instead falling back to mmap with MAP_FIXED over top of
> the part that should be RW rather than unmapping and redoing the whole
> thing? If that works it would preserve the property of avoiding excess
> commit charge. If not then these kernels are very very broken. Either
> way it might shed some more light on what's going on. I understand you
> just want things to work, but I do also want to have some
> understanding of _why_ we're going to be working around something
> awful, if we really have to.
>
> Rich

mmap(MAP_FIXED) at the address returned by the first mmap(PROT_NONE),
followed by mprotect(), does indeed work.
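
For reference, what I tried looks roughly like the sketch below, using the
map/size/guard variables already in __pthread_create; this is just the
shape of the fallback, not the exact code:

    /* sketch only: remap the whole region read/write in place with
     * MAP_FIXED, then re-establish the guard with mprotect; the
     * original PROT_NONE mapping is torn down on failure */
    if (__mmap(map, size, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_ANON|MAP_FIXED, -1, 0) == MAP_FAILED) {
        __munmap(map, size);
        goto fail;
    }
    if (__mprotect(map, guard, PROT_NONE) && errno != ENOSYS) {
        __munmap(map, size);
        goto fail;
    }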

BUT... I kept exploring the idea that this might just be a page alignment
issue, and found that forcing the guard size to be 32k-aligned makes the
current approach work. So I checked, and on the systems where the issue
happens the actual page size determined dynamically at runtime (using
"getconf PAGESIZE", and confirmed by the ratio of Mapped memory to the
nr_mapped page count from /proc/vmstat) is indeed 32k, whereas musl
hardcodes a 4k page size at compile time for the arm architecture (and
that same, possibly incorrect, value is returned by getpagesize() and
sysconf(_SC_PAGESIZE)).
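
If anyone wants to reproduce the check, a small standalone program along
these lines (just an illustration, not the exact method I used) makes the
mismatch visible; AT_PAGESZ comes straight from the kernel via the aux
vector, independent of whatever libc has hardcoded:

    /* compare the kernel-reported page size with what libc reports;
     * on the affected devices the first value is 32768 while
     * getpagesize()/sysconf() report the hardcoded 4096 */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/auxv.h>

    int main(void)
    {
        printf("AT_PAGESZ (kernel)    = %lu\n",
               (unsigned long)getauxval(AT_PAGESZ));
        printf("sysconf(_SC_PAGESIZE) = %ld\n", sysconf(_SC_PAGESIZE));
        printf("getpagesize()         = %d\n", getpagesize());
        return 0;
    }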

Long story short, there is no odd mmap behavior in the kernel; rather,
musl does not correctly support non-4k-page systems, which impacts
pthread_create but probably also malloc and mprotect. Given that
non-4k-page systems are becoming more and more common (on arm and arm64
architectures at least), this might be an important issue to fix. Should
I start a new "support for non-4k page systems" discussion here?
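
To make that concrete, the direction I have in mind is roughly the
following (a hypothetical sketch only, to frame the discussion): derive
the guard and stack sizes from the page size the kernel reports at
runtime instead of the compile-time PAGE_SIZE constant, e.g.

    /* hypothetical: round guard and stack sizes up to the runtime page
     * size (taken here from the aux vector) rather than PAGE_SIZE;
     * page sizes are powers of two, so masking works */
    size_t pgsz = getauxval(AT_PAGESZ);
    guard = (guard + pgsz - 1) & ~(pgsz - 1);
    size  = (size + pgsz - 1) & ~(pgsz - 1);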

Eric
