musl - Re: holywar: malloc() vs. OOM

Message-ID: <20110724182533.GB6429@albatros>
Date: Sun, 24 Jul 2011 22:25:33 +0400
From: Vasiliy Kulikov <segoon@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: holywar: malloc() vs. OOM

Rich,

On Sun, Jul 24, 2011 at 08:40 -0400, Rich Felker wrote:
> On Sun, Jul 24, 2011 at 02:33:25PM +0400, Vasiliy Kulikov wrote:
> > Rich,
> > 
> > This is more a question about your malloc() failure policy for musl than
> > an actual proposal.
> > 
> > [...]
> > 
> > In theory, these are bugs of applications and not of libc, and they
> > should be fully handled in programs, not in libc.  Period.
> > 
> > But looking at the problem from the pragmatic point of view we'll see
> > that libc is actually the easiest place where the problem may be 
> > workarounded (not fixed, surely).  The workaround would be simply
> > raising SIGKILL if malloc() fails (either because of brk() or mmap()).
> > For the rare programs craving to handle OOM such code should be used:
> 
> This is absolutely wrong and non-conformant. It will also ruin all
> robust programs and result in massive data loss, deadlock with shared
> locks due to failure to release locks before termination, and all
> sorts of ills.

Oh, I forgot one major detail: by default the kernel has memory
overcommit enabled (sysctl vm.overcommit_memory=0).  This means that even
a root-owned program may be killed by the OOM killer in case of a
system-wide OOM :-)  There are procfs adjustments for such processes, but
history shows that the OOM killer's logic often behaves unexpectedly (if
it is not outright broken).  It was also rewritten almost from scratch in
the latest kernels, so I'd expect new bugs in it.

With overcommit disabled, graceful OOM handling should be possible, but
I'm not sure it is _guaranteed_ that memory allocated by brk() and
mmap() will really be available later.
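
As a rough sketch, a program that cares could check which overcommit
mode it is running under before trusting malloc()'s return value.  This
assumes Linux with procfs mounted; the function name overcommit_mode()
is made up for illustration.  The values are 0 (heuristic overcommit,
the default), 1 (always overcommit) and 2 (strict accounting).

```c
#include <stdio.h>

/* Hypothetical helper: returns the value of vm.overcommit_memory
 * (0, 1 or 2), or -1 if it cannot be read (not Linux, no procfs,
 * unexpected file contents). */
static int overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Even when this returns 2, it only says that the kernel accounts for
commitments strictly; it does not tell a process anything about what
the OOM killer will do in the other modes.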

So, yes, if a program guarantees that it handles OOM gracefully *for
sure*, then the workaround is indeed a breakage.  But I'm sure such
programs are extremely rare.  BTW, do you know of any such programs
besides DBUS? :)


> The only common situation I can think of where it
> might happen to initially access a high offset first is when calling
> glibc's memcpy which sometimes chooses to copy backwards. musl's
> memcpy does not take this liberty, even if it might be faster in some
> cases, for that very reason - it's dangerous to access high offsets
> first if a program was not careful about checking the return value of
> malloc.

Also, programs and libraries might (re)implement such functions
themselves for a performance gain.
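
For illustration only, here is a minimal sketch of a copy routine that
is guaranteed to touch the lowest address first, so that a NULL
destination faults on the very first byte instead of on some high
offset.  The name copy_forward() is hypothetical, and this is not
musl's actual memcpy.  The volatile qualifiers are an assumption on my
part: they keep the compiler from replacing the loop with a call to a
library memcpy that may copy in a different order.

```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes strictly from low to high
 * addresses, so the first byte written is always dst[0]. */
static void *copy_forward(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    size_t i;

    for (i = 0; i < n; i++)
        d[i] = s[i];    /* ascending order, one byte at a time */
    return dst;
}
```

A byte-at-a-time loop like this is of course slower than a tuned
memcpy, which is exactly the trade-off a reimplementation for speed
would not make.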


> A better solution might be to have a gcc option to generate a read
> from the base address the first time a function performs arithmetic on
> a pointer it has not already checked. This is valid because the C
> language does not allow pointer arithmetic to cross object boundaries,
> and this approach could be made 100% correct rather than being a
> heuristic that breaks correct applications.

Good idea.  It would be interesting to see actual numbers for the
slowdown.  However, most of the time it would be a slowdown for no
actual gain.
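
To make the cost concrete, this is roughly what such an option would
have to emit, written out by hand.  No such gcc option exists as far as
I know; TOUCH_BASE() and set_last() are hypothetical names used only
for this sketch.

```c
#include <stddef.h>

/* Force a one-byte read from the base address of the object that p
 * points to.  The volatile cast keeps the compiler from dropping the
 * otherwise unused load. */
#define TOUCH_BASE(p) ((void)*(const volatile unsigned char *)(p))

/* Store c into the last byte of an n-byte buffer (n must be >= 1).
 * The base is read first, so a NULL buf faults at offset 0 rather
 * than at offset n - 1. */
static void set_last(char *buf, size_t n, char c)
{
    TOUCH_BASE(buf);    /* faults here if buf is NULL */
    buf[n - 1] = c;     /* the high-offset access comes only after */
}
```

The extra load is paid on every call, including all the calls where the
pointer was valid anyway, which is the slowdown in question.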

Thanks,

-- 
Vasiliy