musl - Re: mallocng progress and growth chart

Message-ID: <20200517033025.GQ21576@brightrain.aerifal.cx>
Date: Sat, 16 May 2020 23:30:25 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: mallocng progress and growth chart

On Fri, May 15, 2020 at 11:29:01PM -0400, Rich Felker wrote:
> On Fri, May 15, 2020 at 08:29:13PM -0400, Rich Felker wrote:
> > The same thing happens at the next doubling for malloc(8192), and the
> > same mitigation applies. However with that:
> > 
> >  9340:   3x10912  3x10912  3x10912  3x10912  3x9344   7x9344   ...
> > 
> > the coarse size classing is dubious because the size is sufficiently
> > large that a 7->3 count reduction can be used, with the same count the
> > coarse class would have, but with a 28k rather than 32k mapping.
> > 
> > Unfortunately the decision here depends on knowing page size, which
> > isn't constant at the point where it needs to be made. For integration
> > with musl, page size is initially known even if it's variable, so we
> > could possibly make a decision not to use coarse sizing based on that
> > here, but standalone mallocng lacks that knowledge (page size isn't
> > known until after first alloc_meta). This could possibly be reworked.
> > There's a fairly small range of sizes that would benefit (larger ones
> > are better off with individual mmap because page size quickly becomes
> > "finer than" size classes), but the benefit seems fairly significant
> > (not wasting an extra 1.3k each for the first 12 malloc(8192)'s) at
> > the sizes where it helps.
> 
> It might just make sense to always disable coarse size classing
> starting at this size. The absolute amount of over-allocation is
> sufficiently high that it's probably not justified. On archs with
> larger page size, more pages may be mapped (e.g. 7x9344 can't be
> reduced to 3x if page size is over 4k, and can't be reduced to 5x if
> page size is over 16k) but having too much memory mapped is generally
> an expected consequence of ridiculous page sizes.
> 
> Another possibly horrible idea for dealing with exact page sizes at
> low usage: pair groups of short length. Rather than needing to choose
> between coarse classing 3x5440 (16k) or a 5x4672 (5 whole slots, 24k)
> for a malloc(4095), at low usage we could create a pseudo-2x4672
> where the second slot is truncated and contains a nested 1x3264 that's
> only freeable together with the outer group. This relation always
> works: if size class k is just under a power of two (k == 3 mod 4),
> classes k+1 and k-1 add up to just under 2 times class k. (This
> follows from n/5 + 2n/7 == (7n+10n)/35 == 17n/35 <= n/2 == 2*n/4,
> where n/5, 2n/7, and n/4 are the sizes of class k-1, k+1, and k,
> respectively.)
> 
> This gives a strategy that always works for allocating very-low-count
> groups of arbitrary size classes, as long as we're willing to allocate a slot
> for the complementary size at the same time. And in some ways it's
> nicer than coarse classing -- rather than overallocating the requested
> slot in hopes that the grouped slots will be useful for larger
> allocations too, it allocates a smaller complementary pair in hopes
> that the complementary size will be useful.
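
The pairing relation quoted above can be checked numerically. The following
is only an illustrative sketch, not code from mallocng: the helper names
pair_fits_nominal and pair_fits are hypothetical, n is assumed to be a power
of two, and the nominal class sizes n/5, 2n/7 and n/4 are taken from the
quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: check the relation with the nominal class sizes
 * n/5 (class k-1), 2n/7 (class k+1) and n/4 (class k).
 * Assumes n is a power of two, so that 2*(n/4) is exactly n/2. */
static int pair_fits_nominal(size_t n)
{
    assert(n >= 4 && n % 4 == 0);
    size_t below = n / 5;      /* class k-1 */
    size_t above = 2 * n / 7;  /* class k+1 */
    size_t mid   = n / 4;      /* class k   */
    /* 17n/35 <= n/2, so one slot of each neighbor fits in two class-k slots */
    return below + above <= 2 * mid;
}

/* Hypothetical helper: the same check for concrete slot sizes in bytes. */
static int pair_fits(size_t below, size_t above, size_t mid)
{
    return below + above <= 2 * mid;
}
```

For the sizes used in the example above, pair_fits(3264, 4672, 4096) holds
because 3264 + 4672 = 7936, which is below 2 * 4096 = 8192 (two 4k pages).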

Another alternative for avoiding eager commit at low usage, which
works for all but nommu: when adding groups with nontrivial slot count
at low usage, don't activate all the slots right away. Reserve vm
space for 7 slots for a 7x4672, but only unprotect the first 2 pages,
and treat it as a group of just 1 slot until there are no slots free
and one is needed. Then, unprotect another page (or more if needed to
fit another slot, as would be needed at larger sizes) and adjust the
slot count to match. (Conceptually; implementation-wise, the slot
count would be fixed, and there would just be a limit on the number of
slots made available when transformed from "freed" to "available" for
activation.)
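
A minimal sketch of the reserve-then-unprotect idea, assuming a POSIX system
with mmap and mprotect. This is not mallocng code: struct lazy_group and the
functions below are hypothetical names, and in-group headers and metadata
are ignored so that slot 0 starts at the base of the mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

struct lazy_group {            /* hypothetical, for illustration only */
    unsigned char *base;       /* start of the reserved region */
    size_t slot_size;          /* bytes per slot */
    size_t max_slots;          /* slots reserved in address space */
    size_t active_slots;       /* slots backed by unprotected pages */
    size_t committed;          /* bytes unprotected so far */
    size_t page_size;
};

static size_t round_up(size_t x, size_t a) { return (x + a - 1) / a * a; }

/* Reserve address space for all slots, with no access permitted yet. */
static int lazy_group_init(struct lazy_group *g, size_t slot_size,
                           size_t max_slots)
{
    assert(slot_size > 0 && max_slots > 0);
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0) return -1;
    g->page_size = (size_t)ps;
    g->slot_size = slot_size;
    g->max_slots = max_slots;
    g->active_slots = 0;
    g->committed = 0;
    size_t total = round_up(slot_size * max_slots, g->page_size);
    void *p = mmap(0, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    g->base = p;
    return 0;
}

/* Make one more slot usable, unprotecting only the pages it needs.
 * Returns the slot address, or a null pointer if none can be activated. */
static void *lazy_group_activate(struct lazy_group *g)
{
    if (g->active_slots == g->max_slots) return 0;
    size_t need = round_up(g->slot_size * (g->active_slots + 1),
                           g->page_size);
    if (need > g->committed) {
        if (mprotect(g->base + g->committed, need - g->committed,
                     PROT_READ | PROT_WRITE))
            return 0;
        g->committed = need;
    }
    return g->base + g->slot_size * g->active_slots++;
}
```

With 4k pages and a 7x4672 group, the first activation unprotects 2 pages
and the second unprotects 1 more, matching the description above.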

Note that this is what happens anyway with physical memory as clean
anonymous pages are first touched, but (1) doing it without explicit
unprotect over-counts the not-yet-used slots for commit charge
purposes and breaks tightly-memory-constrained environments (global
commit limit or cgroup) and (2) when all slots are initially available
as they are now, repeated free/malloc cycles for the same size will
round-robin all the slots, touching them all.

Here, property (2) is of course desirable for hardening at moderate to
high usage, but at low usage UAF tends to be less of a concern
(because you don't have complex data structures with complex lifetimes
if you hardly have any malloc).

Note also that (2) could be solved without addressing (1) just by
skipping the protection aspect of this idea and only using the
available-slot-limiting part.
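
The available-slot-limiting part on its own could look roughly like the
following. It is a simplified sketch, not mallocng's actual bookkeeping:
struct slot_group, its limit field, and the function names are hypothetical,
and the group is assumed to have at most 32 slots tracked in two bitmasks.

```c
#include <assert.h>
#include <stdint.h>

struct slot_group {   /* hypothetical, for illustration only */
    uint32_t avail;   /* slots ready to be handed out */
    uint32_t freed;   /* slots freed or never used, not yet reactivated */
    int total;        /* fixed slot count of the group (at most 32) */
    int limit;        /* only slots 0..limit-1 may be moved to avail */
};

/* Mask with the low n bits set. */
static uint32_t low_mask(int n)
{
    return n >= 32 ? ~(uint32_t)0 : ((uint32_t)1 << n) - 1;
}

/* Start with every slot "freed" but only slot 0 eligible for activation. */
static void group_init(struct slot_group *g, int total)
{
    assert(total > 0 && total <= 32);
    g->avail = 0;
    g->freed = low_mask(total);
    g->total = total;
    g->limit = 1;
}

/* Hand out a slot index, or -1 if the group is full. The limit is raised
 * by one slot only when nothing within the current limit is free. */
static int take_slot(struct slot_group *g)
{
    if (!g->avail) {
        uint32_t act = g->freed & low_mask(g->limit);
        if (!act && g->limit < g->total)
            act = g->freed & low_mask(++g->limit);
        if (!act) return -1;
        g->avail = act;       /* transform "freed" to "available" */
        g->freed &= ~act;
    }
    int i = 0;
    while (!(g->avail >> i & 1)) i++;   /* lowest available slot */
    g->avail &= ~((uint32_t)1 << i);
    return i;
}

static void free_slot(struct slot_group *g, int i)
{
    assert(i >= 0 && i < g->total);
    g->freed |= (uint32_t)1 << i;
}
```

In this sketch a repeated malloc/free cycle at low usage keeps reusing
slot 0 instead of cycling through all 7 slots; a second slot is only
brought in when a second allocation is live at the same time.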