musl - Re: mallocng progress and growth chart

Message-ID: <cde945ea-54ea-f47a-3e74-313eec10d844@wwcom.ch>
Date: Mon, 25 May 2020 20:13:02 +0200
From: Pirmin Walthert <pirmin.walthert@...om.ch>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: mallocng progress and growth chart

Am 25.05.20 um 19:54 schrieb Rich Felker:

> On Mon, May 25, 2020 at 05:45:33PM +0200, Pirmin Walthert wrote:
>> Am 18.05.20 um 20:53 schrieb Rich Felker:
>>
>>> On Sat, May 16, 2020 at 11:30:25PM -0400, Rich Felker wrote:
>>>> Another alternative for avoiding eager commit at low usage, which
>>>> works for all but nommu: when adding groups with nontrivial slot count
>>>> at low usage, don't activate all the slots right away. Reserve vm
>>>> space for 7 slots for a 7x4672, but only unprotect the first 2 pages,
>>>> and treat it as a group of just 1 slot until there are no slots free
>>>> and one is needed. Then, unprotect another page (or more if needed to
>>>> fit another slot, as would be needed at larger sizes) and adjust the
>>>> slot count to match. (Conceptually; implementation-wise, the slot
>>>> count would be fixed, and there would just be a limit on the number of
>>>> slots made available when transformed from "freed" to "available" for
>>>> activation.)
>>>>
>>>> Note that this is what happens anyway with physical memory as clean
>>>> anonymous pages are first touched, but (1) doing it without explicit
>>>> unprotect over-counts the not-yet-used slots for commit charge
>>>> purposes and breaks tightly-memory-constrained environments (global
>>>> commit limit or cgroup) and (2) when all slots are initially available
>>>> as they are now, repeated free/malloc cycles for the same size will
>>>> round-robin all the slots, touching them all.
>>>>
>>>> Here, property (2) is of course desirable for hardening at moderate to
>>>> high usage, but at low usage UAF tends to be less of a concern
>>>> (because you don't have complex data structures with complex lifetimes
>>>> if you hardly have any malloc).
>>>>
>>>> Note also that (2) could be solved without addressing (1) just by
>>>> skipping the protection aspect of this idea and only using the
>>>> available-slot-limiting part.
>>> One abstract way of thinking about the above is that it's just a
>>> per-size-class bump allocator, pre-reserving enough virtual address
>>> space to end sufficiently close to a page boundary that there's no
>>> significant memory waste. This is actually fairly elegant, and might
>>> obsolete some of the other measures taken to avoid overly eager
>>> allocation. So this might be a worthwhile direction to pursue.
>> Dear Rich,
>>
>> Currently we use mallocng in production for most applications in our
>> "embedded like" virtualised system setups, it even helped to find
>> some bugs (for example in asterisk) as mallocng was less forgiving
>> than the old malloc implementation. So if you're interested in real
>> world feedback: everything seems to be running quite smoothly so
>> far, thanks for this great work.
>>
>> Currently we use the git version of April 24th, so the version
>> before you merged the huge optimization changes. As you mentioned in
>> your "brainstorming mails", if I got them right, that you might
>> rethink a few of these changes, I'd like to ask: do you think it
>> would be better to use the current git-master version rather than
>> the version of April 24th (we are not THAT memory constrained, so
>> stability is the most important thing) or do you think it would be
>> better to stick on the old version and wait for the next changes to
>> be merged?
> Thanks for the feedback!
>
> Which are the "huge optimization changes" you're wondering about?
> Indeed there's a large series of commits after the version you're
> using but I think you're possibly misattributing them.
>
> A number of the commits are bug fixes -- mostly not for hard bugs, but
> for unwanted and unintended behaviors:
>
> a709dde fix unexpected allocation of 7x144 group in non-power-of-two slot
> dda5a88 fix exact size tracking in memalign
> 915a914 adjust several size classes to fix nested groups of non-power-of-2 size
> 7acd61e allow in-place realloc when ideal size class is off-by-one
> caca917 add support for aligned_alloc alignments 1M and over
>
> There were also quite a few around an idea that didn't go well and was
> mostly reverted, but with major improvements to the original behavior:
>
> 5bff93c overhaul bounce counter to work with map sizes instead of size classes
> 71262cd tune bounce counter to avoid triggering early
> 9601aaa prevent overflow of unmap counter part of bounce counter
> aca1f32 don't let the mmap cache limit grow unboundedly or overflow
> 6fbee31 second partial overhaul of bounce counter system
> 150de6e revert from map cache to old okay_to_free scheme, but improved
> 1e972da initial conversion of bounce counting to use sequence numbers, decay
> e3eecb1 factor bounce/sequence counter logic into meta.h
> 6693738 account seq for individually-mmapped allocations above hard threshold
> 4443f64 fix complete regression (malloc always fails) on variable-pagesize archs
>
> If you don't care about low usage, that whole change series is fairly
> unimportant, but should be harmless. It just changes decisions about
> choices where either choice produces a valid state for the allocator
> but there are tradeoffs between memory usage and performance. The new
> behavior should be better, though.
>
> A few commits were reordering the dependency between memalign and the
> standard memalign-variant functions, which is a minor namespace
> detail:
>
> da4c88e rename aligned_alloc.c
> 04407f7 reverse dependency order of memalign and aligned_alloc
> 74e6657 rename aligned_alloc source file back to its proper name
> c990cb1 rename memalign source file back to its proper name
>
> A couple were hardening:
>
> 5bf4e92 clear group header pointer to meta when freeing groups
> bd04c75 in get_meta, check offset against maplen (minor hardening)
> 77cea57 add support for allocating meta areas via legacy brk
>
> And pretty much all the rest of the changes are tuning behavior for
> "optimization" of some sort or another, which may be what you were
> referring to:
>
> 26143c4 limit slot count growth to 25% instead of 50% of current usage in class
> a9187f0 remove unnecessary optimization tuning flags from Makefile CFLAGS
> 045cc04 move coarse size classing logic to malloc fast path
> 8348a82 eliminate med_twos_tab
> e619034 allow slot count 1 for size classes 3 mod 4 as natural/first-class
> c9d54f4 activate coarse size classing for small classes down to 4 (but not 6)
> 44092d8 improve individual-mmap decision
> d355eaf remove slot count reduction to 1 for size classes 1 mod 3
> c555ebe fix off-by-one in logic to use single-slot groups
> 9d5ec34 switch from MADV_DONTNEED to MADV_FREE for large free slots
> 584c7aa avoid over-use of reduced-count groups due to coarse size classing
> f9bfb0a increase threshold for 3->2 slot reduction to 16 pages
> 20da09e disable coarse size classing for large classes (over 8k)
>
> I don't think any of these changes are potentially obsoleted by
> further ideas in the above thread. I am working on delaying activation
> of slots until they're actually needed, so that we don't dirty pages
> we could avoid touching, but I proposed this as an alternative to
> other more complex tricks that I didn't really like, which have not
> been implemented and probably won't be now.
>
> So, in summary, I don't see any good reason not to go with latest.
>
> Rich

Many thanks for your detailed answer. I'll give it a try then!

Pirmin
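
For readers following the quoted idea of delayed slot activation (reserve
address space for the whole group, unprotect pages only as slots are
needed), here is a minimal sketch in C. It is not mallocng code: the
struct lazy_group type and the lazy_group_init / lazy_group_activate
functions are hypothetical names made up for illustration, and the sketch
ignores group headers, free lists and locking. It only shows the
"per-size-class bump allocator" view using mmap with PROT_NONE and
mprotect.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc and musl */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical illustration type; not mallocng's group/meta structures. */
struct lazy_group {
    unsigned char *base; /* start of the reserved mapping */
    size_t slot_size;    /* bytes per slot, e.g. 4672 */
    size_t slot_count;   /* slots reserved, e.g. 7 */
    size_t active;       /* slots activated so far */
    size_t committed;    /* bytes currently readable and writable */
    size_t pagesize;
};

static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Reserve address space for all slots without committing any of it. */
int lazy_group_init(struct lazy_group *g, size_t slot_size, size_t slot_count)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0) return -1;

    size_t len = round_up(slot_size * slot_count, (size_t)ps);
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    g->base = p;
    g->slot_size = slot_size;
    g->slot_count = slot_count;
    g->active = 0;
    g->committed = 0;
    g->pagesize = (size_t)ps;
    return 0;
}

/* Activate the next slot, unprotecting just enough pages to hold it.
 * Returns NULL when all reserved slots are active or mprotect fails. */
void *lazy_group_activate(struct lazy_group *g)
{
    if (g->active == g->slot_count) return NULL;

    size_t need = round_up((g->active + 1) * g->slot_size, g->pagesize);
    if (need > g->committed) {
        if (mprotect(g->base + g->committed, need - g->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        g->committed = need;
    }
    return g->base + g->active++ * g->slot_size;
}
```

With 4096-byte pages and the 7x4672 example from the thread, the first
activation unprotects 2 pages and the second unprotects 1 more, and the
7 slots together end 64 bytes short of an 8-page boundary. Pages that are
still PROT_NONE should not count toward commit charge on Linux, which is
point (1) in the quoted message.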