musl - Re: [RFC] add realloci() and reallocarrayi()

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251103013038.GX1827@brightrain.aerifal.cx>
Date: Sun, 2 Nov 2025 20:30:38 -0500
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: musl@...ts.openwall.com, Alejandro Colomar <alx@...nel.org>,
	"A. Wilcox" <AWilcox@...cox-tech.com>,
	Lénárd Szolnoki <cpp@...ardszolnoki.com>,
	Thorsten Glaser <tg@...bsd.de>,
	Collin Funk <collin.funk1@...il.com>
Subject: Re: [RFC] add realloci() and reallocarrayi()

On Sat, Nov 01, 2025 at 10:41:05PM +0100, Markus Wichmann wrote:
> Am Fri, Oct 31, 2025 at 03:01:58PM +0100 schrieb Alejandro Colomar:
> > Hi,
> > 
> > I have not tested this patch (not even tried building yet).  It's just
> > an initial draft, requesting comments about the overall idea.
> > 
> > So far, I've only implemented this in the mallocng implementation, as
> > the oldmalloc implementation was too weird for me to fully understand.
> > Do we need an old implementation too?
> > 
> 
> Whether we need one or not probably depends on if the proposal is
> accepted by the C standard, which, judging from the other thread, is
> currently up in the air.
> 
> Oldmalloc is a pretty simple malloc implementation. On the high level,
> there are two allocators: One for large allocations (above
> MMAP_THRESHOLD) and one for smaller allocations.
> 
> The large allocations use mmap() directly. They are identified at
> runtime by having even used chunks not have the C_INUSE flag set for
> their size. The psize indicates the offset to the start of the mapped
> page, for alignment reasons. For large allocations, realloci() could
> probably be implemented by calling mremap() *without* MREMAP_MAYMOVE.
> 
> Small allocations use a linked list approach: Each chunk knows its size
> and its predecessor's size. Memory blocks are terminated with zero-sized
> pseudo-chunks that are always in use. Additionally, free chunks are
> added to doubly-linked lists called "bins", where they are classified by
> size. The specifics don't really matter for realloci(). You can probably
> implement it by combining the current chunk with its successors until
> either the size requirement is met or you run into a chunk that is in
> use. At the end, you can also split the chunk to shrink it to fit the
> request exactly, and leave the rest over to future allocations.
> 
> For requests that jump the boundary between those two allocators: I
> don't really see the harm in having a chunk start out as mmapped chunk
> and become remapped to be smaller. Obviously, the effective size cannot
> shrink below 1 page (minus whatever alignment overhead was reserved),
> but just having that page dangle about and eventually be unmapped by
> free(), though maybe inefficient, is probably not harmful.

It is very harmful. It causes catastrophic fragmentation and
exhaustion of vma limit under otherwise very reasonable patterns.
There is no good reason to try to keep such reallocs "in-place". The
existing strategy of malloc-memcpy-free is very intentional.

> The other direction, Rich has in the past spoken out about how there
> should not be allocated managed chunks larger than MMAP_THRESHOLD in
> size. I don't quite see the issue myself (allocated chunks only matter
> in the chunk list in each block, where the size class doesn't matter,
> and freed chunks are joined with their free predecessors and successors
> to possibly form a size above MMAP_THRESHOLD anyway), but perhaps I am
> missing something.

In any case, oldmalloc is basically only there because there's no good
reason to remove it. I'm not particularly interested in adding new
code to it. A perfectly reasonable realloci (or whatever)
implementation for it would just be one that always fails.

Overall, I am so far utterly unconvinced that any of this proposal is
well-motivated. It seems like grasping at straws trying to mitigate
flaws in STL implementations' resizing strategies that can't really be
fixed at this layer (and that would require major revamps on their
size to use anything new from this layer), but that could be largely
or entirely fixed on the STL side just by being smarter about the
strategy for choosing sizes. As I've said multiple times, being able
to do significant enlargement of an object in-place is, asymptotically
in the presence of large malloc load, always going to fail. At best
you can reclaim some slack from the gaps between the allocator's size
granularity scale and the calling application's (or STL's) choice of
growth strategy increments.

Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.