Date: Sat, 4 Apr 2020 22:20:23 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: New malloc tuning for low usage

On Sat, Apr 04, 2020 at 02:19:48PM -0400, Rich Felker wrote:
> On Fri, Apr 03, 2020 at 10:55:54PM -0400, Rich Felker wrote:
> > In working on this, I noticed that it looks like the coarse size class
> > threshold (6) in top-level malloc() is too low. At that threshold, the
> > first fine-grained-class group allocation will be roughly a 100%
> > increase in memory usage by the class; I'd rather keep the relative
> > increase bounded by 50% or less. It should probably be something more
> > like 10 or 12 to achieve this. With 12, repeated allocations of 16k
> > first produce 7 individual 20k mmaps, then a 3-slot class-37
> > (21824-byte slots) group, then a 7-slot class-36 (18704-byte slots)
> > group.
> >
> > One thing that's not clear to me is whether it's useful at all to
> > produce the 3-slot class-37 group rather than just going on making
> > more individual mmaps until it's time to switch to the larger group.
> > It's easy to tune things to do the latter, and seems to offer more
> > flexibility in how memory is used. It also allows slightly more
> > fragmentation, but the number of such objects is highly bounded to
> > begin with because we use increasingly larger groups as usage goes up,
> > so the contribution should be asymptotically irrelevant.
>
> The answer is that it depends on where the sizes fall. At 16k,
> rounding up to page size produces 20k usage (5 pages) but the 3-slot
> class-37 group uses 5+1/3 pages, so individual mmaps are preferable.
> However if we requested 20k, individual mmaps would be 24k (6 pages)
> while the 3-slot group would still just use 5+1/3 page, and would be
> preferable to switch to. The condition seems to be just whether the
> rounded-up-to-whole-pages request size is larger than the slot size,
> and we should prefer individual mmaps if (1) it's smaller than the
> slot size, or (2) using a multi-slot group would be a relative usage
> increase in the class of more than 50% (or whatever threshold it ends
> up being tuned to).
>
> I'll see if I can put together a quick implementation of this and see
> how it works.

This seems to be working very well with the condition:

if (sc >= 35 && cnt<=3 && (size*cnt > usage/2 || ((req+20+pagesize-1) & -pagesize) <= size))
    ^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    at least    wanted to make a smaller         requested size
    ~16k        group but hit lower cnt          rounded up to
                limit; see loop above            page <= slot size

at the end of the else clause for if (sc < 8) in alloc_group. Here req
is a new argument to expose the size of the actual request malloc
made, so that for single-slot groups (mmap serviced allocations) we
can allocate just the minimum needed rather than the nominal slot
size.

Rich
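
For readers who want to check the arithmetic, here is a minimal
standalone sketch of the condition from the message. It is not the
actual alloc_group code: the helper names round_up_to_page and
prefer_single_mmap are hypothetical, the parameters simply mirror the
variables in the quoted expression, and the constant 20 is copied
as-is from that expression (the message does not say what it covers).
A 4096-byte page size and usage measured in bytes are assumed in the
examples.

```c
#include <stddef.h>

/* Round n up to a multiple of pagesize.
 * Assumes pagesize is a power of two, as in the quoted expression. */
static size_t round_up_to_page(size_t n, size_t pagesize)
{
	return (n + pagesize - 1) & -pagesize;
}

/* Hypothetical helper: returns nonzero when the quoted condition holds,
 * i.e. when an individual mmap would be preferred over a multi-slot group.
 *   sc       size class index
 *   cnt      slot count the group would have
 *   size     slot size of the class, in bytes
 *   usage    current usage of the class (assumed to be in bytes)
 *   req      size of the actual malloc request
 *   pagesize system page size (power of two)
 */
static int prefer_single_mmap(int sc, int cnt, size_t size, size_t usage,
                              size_t req, size_t pagesize)
{
	/* Only classes of at least ~16k with a small slot count qualify. */
	if (sc < 35 || cnt > 3)
		return 0;
	/* The group would grow the class's usage by more than 50%. */
	if (size * (size_t)cnt > usage / 2)
		return 1;
	/* The page-rounded request fits within one slot's worth of memory.
	 * The 20 is taken verbatim from the message. */
	return round_up_to_page(req + 20, pagesize) <= size;
}
```

With class-37 slots of 21824 bytes and high existing usage, a 16k
request rounds to 20480 bytes, which is below the slot size, so the
individual mmap is chosen. A 20k request rounds to 24576 bytes, which
exceeds the slot size, so the 3-slot group wins unless the 50% usage
test fires. This matches the 5-page versus 6-page reasoning in the
quoted text.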