Date: Fri, 15 May 2020 23:29:01 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: mallocng progress and growth chart

On Fri, May 15, 2020 at 08:29:13PM -0400, Rich Felker wrote:
> The same thing happens at the next doubling for malloc(8192), and the
> same mitigation applies. However with that:
>
> 9340: 3x10912 3x10912 3x10912 3x10912 3x9344 7x9344 ...
>
> the coarse size classing is dubious because the size is sufficiently
> large that a 7->3 count reduction can be used, with the same count the
> coarse class would have, but with a 28k rather than 32k mapping.
>
> Unfortunately the decision here depends on knowing page size, which
> isn't constant at the point where it needs to be made. For integration
> with musl, page size is initially known even if it's variable, so we
> could possibly make a decision not to use coarse sizing based on that
> here, but standalone mallocng lacks that knowledge (page size isn't
> known until after first alloc_meta). This might could be reworked.
> There's a fairly small range of sizes that would benefit (larger ones
> are better off with individual mmap because page size quickly becomes
> "finer than" size classes), but the benefit seems fairly significant
> (not wasting an extra 1.3k each for the first 12 malloc(8192)'s) at
> the sizes where it helps.

It might just make sense to always disable coarse size classing
starting at this size. The absolute amount of over-allocation is
sufficiently high that it's probably not justified. On archs with
larger page size, more pages may be mapped (e.g. 7x9344 can't be
reduced to 3x if page size is over 4k, and can't be reduced to 5x if
page size is over 16k) but having too much memory mapped is generally
an expected consequence of ridiculous page sizes.

Another possibly horrible idea for dealing with exact page sizes at
low usage: pair groups of short length.
Rather than needing to choose between a coarse-classed 3x5440 (16k)
or a 5x4672 (5 whole slots, 24k) for a malloc(4095), at low usage we
could create a pseudo-2x4672 where the second slot is truncated and
contains a nested 1x3264 that's only freeable together with the outer
group.

This relation always works: if size class k is just under a power of
two (k == 3 mod 4), classes k+1 and k-1 add up to just under 2 times
class k. (This follows from

    n/5 + 2n/7 == (7n+10n)/35 == 17n/35 <= n/2 == 2*n/4,

where n/5, 2n/7, and n/4 are the sizes of class k-1, k+1, and k,
respectively.)

This gives a strategy that always works for allocating very low
counts of arbitrary size classes, as long as we're willing to allocate
a slot for the complementary size at the same time. And in some ways
it's nicer than coarse classing -- rather than overallocating the
requested slot in hopes that the grouped slots will be useful for
larger allocations too, it allocates a smaller complementary pair in
hopes that the complementary size will be useful.
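
As a rough check of the arithmetic above, here is a small Python
sketch. It is not mallocng code. It assumes a 4k page, ignores any
per-group header overhead, and picks n = 16384 purely for illustration
so that n/4 lands at the power of two that class k sits just under.
The names mapped_bytes and pair_total are made up for this example.

```python
from fractions import Fraction

PAGE = 4096  # assumed page size; larger pages change these numbers


def mapped_bytes(total, page=PAGE):
    """Round a byte count up to a whole number of pages."""
    return -(-total // page) * page


def pair_total(n):
    """Exact size of class k-1 plus class k+1, i.e. n/5 + 2n/7."""
    return Fraction(n, 5) + Fraction(2 * n, 7)


# The three options discussed for malloc(4095), using the slot sizes
# quoted in the message (group header overhead is ignored here).
options = {
    "coarse 3x5440": 3 * 5440,
    "exact 5x4672": 5 * 4672,
    "paired 4672+3264": 4672 + 3264,
}

for name, used in options.items():
    m = mapped_bytes(used)
    print(f"{name}: {used} bytes in slots, {m} mapped, {m - used} slack")

# The general relation: n/5 + 2n/7 == 17n/35 <= n/2 == 2*(n/4).
n = 16384  # illustrative value only
print(pair_total(n) == Fraction(17 * n, 35), pair_total(n) <= Fraction(n, 2))
```

Under these assumptions the paired layout fits in two 4k pages, against
four pages for the coarse group and six for the exact 5-slot group.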