musl - Re: Re: realloci(): A realloc() variant that works in-place

Message-ID: <20251103212838.GY1827@brightrain.aerifal.cx>
Date: Mon, 3 Nov 2025 16:28:38 -0500
From: Rich Felker <dalias@...c.org>
To: Alejandro Colomar <alx@...nel.org>
Cc: Thiago Macieira <thiago@...ieira.org>,
	Florian Weimer <fw@...eb.enyo.de>, libc-alpha@...rceware.org,
	musl@...ts.openwall.com, Arthur O'Dwyer <arthur.j.odwyer@...il.com>,
	Jonathan Wakely <jwakely@...hat.com>
Subject: Re: Re: realloci(): A realloc() variant that works in-place

On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
> Hi Rich,
> 
> On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
> > On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> > > > All this will need fine-tuning once implementations exist.
> > > > 
> > > > > So, why not require the caller to not ask too much?  We could go back to
> > > > > reporting an error if there's not enough memory.
> > > > > 
> > > > > Of course, it would still guarantee no errors when shrinking, but
> > > > > I think we could error out when growing.
> > > > 
> > > > I'd prefer no errors either way. If there isn't memory to grow the underlying 
> > > > space (a brk() system call returns ENOMEM), then realloci() returns as much as 
> > > > it could get but not more.
> > > 
> > > The problem is that this is asking the implementation to speculate.
> > > 
> > > Consider the case that a realloci() implementation knows that the
> > > requested size fails.  Let's put some arbitrary numbers:
> > > 
> > > 	old_size = 10000;
> > > 	requested_size = 30000;
> > > 
> > > It knows the block can grow to somewhere between 10000 (which it
> > > currently has) and 30000 (the system reported ENOMEM), but now it has
> > > the task of allocating as much as it can get.  Should it do a binary
> > > search of the size?  Try 20000, then if it fails try 15000, etc.?
> > > That's speculation, and it would make this function too slow.
> > 
> > I don't see any plausible implementation in which this would involve a
> > binary search. Either you have fixed-size slots in which case you just
> > look at the size of the slot to see what the max obtainable is, or you
> > have a dlmalloc-like situation where you check the size of the
> > adjacent free block (if any) to determine the max obtainable. These
> > are O(1) operations.
> 
> I was thinking of mremap(2) without MREMAP_MAYMOVE.

OK, this whole conversation is mixing up unrelated things:

1. In-place realloc to avoid relatively-expensive memcpy
2. In-place realloc to avoid updating pointers

The case where mremap would be used is utterly irrelevant to (1). And
further, the cost of the mremap operation is so high (syscall
overhead, page table/TLB synchronization) that any cost of updating
pointers because the object moved is dwarfed and thereby irrelevant
too.

So I don't see why anyone should care about this case.
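
For the cases that do matter, finding the max obtainable size is the
constant-time lookup described in the quoted text above. A rough
sketch, assuming a toy allocator with fixed-size slots; the size
classes and the names max_inplace_size and grow_inplace are made up
for illustration and are not taken from musl or any real malloc:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size classes for a toy slot allocator. */
static const size_t slot_sizes[] = { 16, 32, 64, 128, 256, 512, 1024 };
#define NSLOTS (sizeof slot_sizes / sizeof slot_sizes[0])

/* Largest size an object of cur_size could have without moving, i.e.
 * the size of the slot it already occupies. Returns 0 if cur_size is
 * larger than every class. The loop is bounded by the fixed number of
 * classes, so the cost does not depend on the sizes involved. */
static size_t max_inplace_size(size_t cur_size)
{
	for (size_t i = 0; i < NSLOTS; i++)
		if (cur_size <= slot_sizes[i])
			return slot_sizes[i];
	return 0;
}

/* In-place resize request that never fails: returns the size actually
 * granted, which is min(requested, slot size). No searching and no
 * syscalls are involved. */
static size_t grow_inplace(size_t cur_size, size_t requested)
{
	size_t max = max_inplace_size(cur_size);
	if (max == 0)
		return cur_size; /* not slot-backed in this toy model */
	assert(max >= cur_size);
	return requested < max ? requested : max;
}
```

With these made-up classes, a 100-byte object asking for 30000 bytes
is simply told it can have 128. A dlmalloc-like design would look at
the adjacent free chunk's size instead of a slot table, but the check
is equally cheap.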

Moreover, I see (2) as entirely misguided. The whole provenance model
makes it broken to try to rely on pointer values not changing, and no
code should be trying to do that. A new allocator interface should not
be pandering to this very fragile, very likely to be broken by
compiler transformations, utterly backwards practice. Just treat the
old pointer as invalid and always update like you're supposed to,
regardless of whether the value is different.
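
To be concrete about "always update", here is a minimal sketch using
plain realloc(). struct buf and buf_append are hypothetical names for
illustration only:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. */
struct buf {
	char *data;
	size_t len, cap;
};

/* Append n bytes, growing with realloc as needed. Returns 0 on
 * success, -1 on allocation failure (buffer left unchanged). */
static int buf_append(struct buf *b, const void *src, size_t n)
{
	if (n > b->cap - b->len) {
		size_t newcap = b->cap ? b->cap : 16;
		while (newcap - b->len < n) {
			if (newcap > (size_t)-1 / 2)
				return -1; /* doubling would overflow */
			newcap *= 2;
		}
		char *p = realloc(b->data, newcap);
		if (!p)
			return -1;
		/* Always store the result. Do not compare p with the
		 * old b->data to decide whether to update: after a
		 * successful realloc the old pointer is invalid, even
		 * if the new value happens to look the same. */
		b->data = p;
		b->cap = newcap;
	}
	if (n)
		memcpy(b->data + b->len, src, n);
	b->len += n;
	return 0;
}
```

Any other pointers into the buffer should be kept as offsets, or
recomputed from b->data after each call that may grow it.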

Rich