musl - Re: Re: FYI: some observations when testing next-gen malloc

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c75402c7-dde3-d5ea-d172-176a4518221e@wwcom.ch>
Date: Wed, 11 Mar 2020 09:54:18 +0100
From: Pirmin Walthert <pirmin.walthert@...om.ch>
To: musl@...ts.openwall.com
Subject: Re: Re: FYI: some observations when testing next-gen malloc


Am 11.03.20 um 01:47 schrieb Rich Felker:
> On Tue, Mar 10, 2020 at 11:06:57AM +0100, Szabolcs Nagy wrote:
>> * Pirmin Walthert<pirmin.walthert@...om.ch>  [2020-03-10 10:44:46 +0100]:
>>> Am 09.03.20 um 19:55 schrieb Szabolcs Nagy:
>>>> * Pirmin Walthert<pirmin.walthert@...om.ch>  [2020-03-09 19:14:59 +0100]:
>>>>> Am 09.03.20 um 18:12 schrieb Rich Felker:
>>>>>> It's not described very rigorously, but effectively it's in an async
>>>>>> signal context and can only call functions which are AS-safe.
>>>>>>
>>>>>> A future version of the standard is expected to drop the requirement
>>>>>> that fork itself be async-signal-safe, and may thereby add
>>>>>> requirements to synchronize against some or all internal locks so that
>>>>>> the child can inherit a working context. But the right solution here is
>>>>>> always to stop using fork without exec.
>>>>>>
>>>>>> Rich
>>>>> Well, I have now changed the code a bit to make sure that no
>>>>> async-signal-unsafe command is being executed before execl. Things I've
>>>>> removed:
>>>>>
>>>>> a call to cap_from_text, cap_set_proc and cap_free has been removed as well
>>>>> as sched_setscheduler. Now the only thing being executed before execl in the
>>>>> child process is closefrom()
>>>> closefrom is not as-safe.
>>>>
>>>> i think it reads /proc/self/fd directory to close fds.
>>>> (haven't checked the specific asterisk version)
>>>> opendir calls malloc so it can deadlock.
>>>>
>>> Indeed I am not able to reproduce the problem any longer with a modified
>>> version of asterisk. What I've changed is:
>>>
>>> Removed some code that sets the capabilities after fork() (with
>>> cap_from_text, cap_set_proc, cap_free) and closefrom replaced with a thumb
>>> loop over all possible fds up to sysconf(_SC_OPEN_MAX). With this
>>> modification the fd closing procedure with max open files set to 21471 now
>>> needs 7ms instead of 70ns (so a slowdown by times 100), however this is not
>>> critical in our environment...
>>>
>>> Will discuss the findings with the asterisk developers.
>>>
>>> Thanks for your hints!
>> good.
>>
>> ideally they would use close-on-exec fds and then
>> you don't need such ugliness.
>>
>> please don't drop the list from replies.
> While indeed the right thing is not to do is a closefrom/closeall hack
> at all, if you really can't fix the fd leaks and need to, there is a
> fast but safe version that doesn't require listing /proc/self/fd.
> Instead, call poll() with a large array of pollfd with .events = 0 for
> each element, and zero timeout, and check the .revents of each for
> POLLNVAL. This will tell you with very few syscalls (probably just 1)
> which file descriptors are open.
>
> Rich

First of all: the quick&dirty solution was never meant to be a real 
solution at all, it was more like a proof of concept to show that the 
deadlocks do not happen when removing the non-async-signal-safe 
syscalls. So when I opened the asterisk bug report yesterday I'd already 
mentioned that I would not consider this to be the final solution/right 
approach ...

But: thanks a lot, this is a nice solution, it improved the performance 
by times ten (~800ns instead of 8000ns, but still slower than the 
original approach which needed 70ns (Measured with defining the array 
after the fork, this seems to be the most time consuming part)).

Funnily the quick&dirty loop approach is the one you'll find in most 
programs and forums... In my own software I usually keep track of the 
biggest fd and simply close up to this one what should work well enough 
as long as the process is not forked from several threads.

While I am not an asterisk core dev (only submitted a few patches in the 
past), I assume that the fds are simply closed to ensure that the 
command executed with exec will not have access to any fds it shouldn't 
have, so simply for security reasons.

However, here is what I've done now based on your input (asterisk can 
have lots of fds, especially with SIP over TCP. So for big systems they 
usually recommend to set the max open files to a value > 1024. However: 
with 1024 for POLL_SIZE I got the best performance, even though I needed 
to do a few more syscalls this way):

                 #define POLL_SIZE 1024

                 int maxfd = sysconf (_SC_OPEN_MAX);
                 if (maxfd == -1)
                         maxfd = 1024;
                 if (maxfd > 65536)
                         maxfd = 65536;
                 struct pollfd fds[POLL_SIZE];
                 int fd=STDERR_FILENO+1, loopmax, i;
                 while (fd<maxfd) {
                         loopmax = maxfd - fd;
                         if (loopmax>POLL_SIZE) {
                                 loopmax = POLL_SIZE;
                         }
                         for (i=0;i<loopmax;i++) {
                                 fds[i].fd = fd+i;
                                 fds[i].events = 0;
                         }
                         poll(fds, loopmax, 0);
                         for (i=0;i<loopmax;i++) {
                                 if (fds[i].revents == POLLNVAL) {
                                         continue;
                                 }
                                 close(fds[i].fd);
                         }
                         fd+=loopmax;
                 }
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.