Date: Sun, 24 Jul 2011 14:33:25 +0400
From: Vasiliy Kulikov <segoon@...nwall.com>
To: musl <musl@...ts.openwall.com>
Subject: holywar: malloc() vs. OOM

Rich,

This is more a question about your malloc() failure policy for musl
than an actual proposal.

When brk() or mmap() fails, libc usually returns NULL to the program.
If the program wants to handle OOM gracefully, it may do so. If not, it
will likely trigger SIGSEGV and be killed without any problem.

However, there are potential issues with this behaviour:

1) If the program doesn't handle OOM at all, it can lead to security
   problems.

   a) If the NULL page is not mmap'ed (the case for all non-root apps
      and most root apps), the page starting at vm.mmap_min_addr may
      still be present in the process' VM. On some distros only one
      page was guarded this way in the past. So, if the allocation is
      bigger than ~4-64 KB and the write begins at the end of the page,
      bad things may happen before SIGSEGV (the worst case is
      privilege escalation). This is a pathological case and I haven't
      seen one myself, but it's possible in theory.

   b) If the NULL page is mmap'ed, the application might not notice
      OOM at all, because the page is mapped and SIGSEGV is not sent.
      (Yes, apps that mmap the NULL page must handle OOM, but see (2).)

2) If the program does handle OOM, it might do it very badly. The OOM
   handling code path is almost never tested and contains bugs. Even
   the kernel, which obviously must handle OOM, doesn't handle it
   properly (I have found bugs in OOM handling code much more often
   than in other error handling code) because this code is not tested.
   The DBUS daemon, which is closely connected with init in modern
   distros, must not fail on OOM by design (otherwise init would fail
   and the whole system would hang/reboot), and it took a long time to
   remove the silent bugs in this code:

   (http://blog.ometer.com/2008/02/04/out-of-memory-handling-d-bus-experience/)

In theory, these are bugs in applications and not in libc, and they
should be fully handled in programs, not in libc. Period.

But looking at the problem from a pragmatic point of view, we'll see
that libc is actually the easiest place where the problem may be
worked around (not fixed, surely). The workaround would be simply
raising SIGKILL if malloc() fails (whether because of brk() or mmap()).
For the rare programs that really want to handle OOM, code like this
should be used:

    #define _OOM_MAY_FAIL_
    #include <stdlib.h>

Then the workaround is disabled.

Probably I overestimate the importance of OOM errors, and (1) in
particular. However, I think it is worth discussing.

Thanks,

-- 
Vasiliy
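
The workaround described above could be approximated at the application level, without touching libc, roughly as follows. This is only a sketch under stated assumptions: oom_kill_malloc() and xmalloc() are hypothetical names that exist in neither musl nor any other libc, and only the _OOM_MAY_FAIL_ opt-out macro is taken from the message. The actual proposal would live inside libc's own malloc() rather than in a wrapper.

```c
#include <signal.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper (not a musl API): kill the process as soon as an
 * allocation fails, instead of handing NULL back to code that may never
 * check it.
 */
static void *oom_kill_malloc(size_t n)
{
    void *p = malloc(n);

    /* malloc(0) may legitimately return NULL, so only treat n != 0 as OOM. */
    if (p == NULL && n != 0) {
        raise(SIGKILL);   /* SIGKILL cannot be caught, blocked or ignored */
        abort();          /* fallback in case raise() itself fails */
    }
    return p;
}

/*
 * Opt-out as in the message: a program that defines _OOM_MAY_FAIL_ before
 * this point gets plain malloc() and must check for NULL itself.
 */
#ifdef _OOM_MAY_FAIL_
#define xmalloc(n) malloc(n)
#else
#define xmalloc(n) oom_kill_malloc(n)
#endif
```

A caller would then write `char *buf = xmalloc(len);` and use buf without a NULL check unless it has opted out via _OOM_MAY_FAIL_.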