Message-ID: <thbvcyc2dmafx76g426h6sfyuqhgrq4d2wtgxtb3v6lawf2nbm@frb3rn3q66lk>
Date: Sat, 1 Nov 2025 01:47:33 +0100
From: Alejandro Colomar <alx@...nel.org>
To: Thiago Macieira <thiago@...ieira.org>
Cc: Paul Eggert <eggert@...ucla.edu>, libc-alpha@...rceware.org, 
	musl@...ts.openwall.com, "A. Wilcox" <AWilcox@...cox-tech.com>, 
	Lénárd Szolnoki <cpp@...ardszolnoki.com>, Collin Funk <collin.funk1@...il.com>, 
	Arthur O'Dwyer <arthur.j.odwyer@...il.com>, Jonathan Wakely <jwakely@...hat.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: Re: realloci(): A realloc() variant that works in-place

Hi Thiago,

On Fri, Oct 31, 2025 at 04:48:36PM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 15:09:46 Pacific Daylight Time Alejandro Colomar 
> wrote:
> > > I'd add: if the new size is smaller than the old size, the bytes in that
> > > storage are undefined, even if this function returned -1. That will allow
> > > an implementation to MADV_DONTNEED the space, even if it can't officially
> > > change the size of the allocation.
> > 
> > I'm not entirely sure.  What would be the new size?  Would it still be
> > the old one?  So, the higher contents are undefined but you're still
> > able to write to them?  It sounds weird.
> 
> I think this needs some discussion.

Sure.

> I'm thinking of allocators like jemalloc that cannot reuse the space freed by 
> shrinkage. Imagine shrinking a block of 256 kB to 64 B (e.g., 8192 
> std::strings to 2).
> 
> What does realloci() return?

If it can't reuse the space, I think the most sensible thing to return
would be the original large size.  That would give the user the most
information.

> It could return -1, indicating no shrinking happened.

I wouldn't do that.  I would reserve -1 for indicating a hard error,
such as not being able to grow.

Keep in mind that users might fall back to realloc(3) as soon as they
see a -1.  They may not remember the original size, so they may not know
they're trying to shrink.
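
A minimal sketch of that fallback pattern, under loud assumptions: realloci()
does not exist in any libc, so toy_malloc(), toy_free(), toy_realloc() and
toy_realloci() below are hypothetical stand-ins built on malloc(3), with a
header that records a usable size rounded up to a made-up 64-byte size class.
Only the caller-side logic in grow() is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

enum { TOY_CLASS = 64 };	/* hypothetical size-class granularity */

/* Header kept in front of each block; the union keeps user data aligned. */
union toy_hdr { size_t usable; max_align_t align; };

static void *toy_malloc(size_t size)
{
	size_t usable = (size + TOY_CLASS - 1) / TOY_CLASS * TOY_CLASS;
	union toy_hdr *h = malloc(sizeof(*h) + usable);

	if (h == NULL)
		return NULL;
	h->usable = usable;
	return h + 1;
}

static void toy_free(void *p)
{
	if (p != NULL)
		free((union toy_hdr *) p - 1);
}

/* Stand-in for realloci(): never moves the block.  Returns the usable
   size if the request fits in place, or -1 if it would need to grow. */
static ssize_t toy_realloci(void *p, size_t size)
{
	union toy_hdr *h = (union toy_hdr *) p - 1;

	return size <= h->usable ? (ssize_t) h->usable : -1;
}

/* Stand-in for realloc(3): always moves, copying the old contents. */
static void *toy_realloc(void *p, size_t size)
{
	size_t old = ((union toy_hdr *) p - 1)->usable;
	void *q = toy_malloc(size);

	if (q == NULL)
		return NULL;
	memcpy(q, p, old < size ? old : size);
	toy_free(p);
	return q;
}

/* Resize *pp to at least `want` bytes: in place if possible, moving only
   after a -1.  Returns the new capacity, or -1 if even moving failed. */
static ssize_t grow(void **pp, size_t want)
{
	ssize_t cap = toy_realloci(*pp, want);
	void *q;

	if (cap != -1)
		return cap;	/* block did not move */

	q = toy_realloc(*pp, want);
	if (q == NULL)
		return -1;	/* *pp is still valid and unchanged */
	*pp = q;
	return toy_realloci(q, want);
}
```

Because a shrink request never yields -1 in this model, a caller that has
forgotten the old size cannot be pushed into an unnecessary move.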

> In that case, the higher 
> layer is allowed to presume the data it had there is still there, which 
> prevents the allocator from doing madvise(MADV_DONTNEED).
> 
> Or it could return 0, indicating it did shrink and may have done 
> MADV_DONTNEED.

Yep.  Although with the new specification, it'd return the large size.

> But in that case, the higher layer will update its book-keeping 
> of the capacity, causing it to call realloci() again if it needs to grow 
> again. Though this will probably be fast: the allocator will probably just 
> return 0 for any size value that is less than the slab size and -1 for any 
> that is bigger. The drawback of this is that there's a minimum granularity of 
> one page, so the example above of shrinking to 64 B is keeping 4032 bytes 
> "hostage" in overhead.

Yep.  I guess not too bad.  If they want to release it, they're always
free to call realloc(3).  Of course, they'll need to know that this size
is hostage, so returning the actual size is useful.
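
To make the "hostage" arithmetic concrete, here is a small sketch.  It is
a model, not an allocator: model_realloci() is a hypothetical function that
only computes what a page-granular implementation might report, and PAGE,
hostage_bytes() and worth_moving() are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

enum { PAGE = 4096 };	/* assumed page size */

/* Model of the value realloci() might return for a block whose usable
   size is `cur`: it can only give back whole pages, and it cannot grow. */
static ssize_t model_realloci(size_t cur, size_t size)
{
	size_t kept;

	if (size > cur)
		return -1;	/* would need to grow: hard error */

	kept = (size + PAGE - 1) / PAGE * PAGE;
	if (kept == 0)
		kept = PAGE;	/* keep at least one page */
	return (ssize_t) (kept < cur ? kept : cur);
}

/* Bytes still held by the block beyond what the caller asked for. */
static size_t hostage_bytes(ssize_t actual, size_t wanted)
{
	assert(actual >= 0 && (size_t) actual >= wanted);
	return (size_t) actual - wanted;
}

/* Caller policy: fall back to realloc(3) only when the waste is above
   a limit the caller picks. */
static int worth_moving(ssize_t actual, size_t wanted, size_t limit)
{
	return hostage_bytes(actual, wanted) > limit;
}
```

With the numbers from the example above, shrinking 256 kB to 64 B reports
4096 in this model, so 4032 bytes stay hostage; whether that justifies a
realloc(3) is left to the caller.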

> > The rationale is that if a programmer uses realloci(), they're
> > explicitly expressing interest in minimizing realloc(3) calls, because
> > for some reason moving the contents is expensive.  So, it would be nice
> > if realloci() would be generous, by giving more size than asked for, and
> > telling the user the actual size.
> 
> True, but the same rationale applies to the first allocation with malloc() as 
> well.

You could immediately follow malloc(3) by realloci(), if you want this
behavior:

	void     *p;
	ssize_t  size = 1024;

	p = malloc(size);
	if (p == NULL)
		goto fail;

	size = realloci(p, size);
	if (size == -1)
		goto fail;

	// And now we know the actual size.

Here realloci() would be essentially a no-op: it shouldn't fail, and its
cost would be negligible compared to malloc(3).

> There's precedent for this: jemalloc provides nallocx() to calculate the block 
> ahead of time. Most implementations have one way or another of asking how big 
> the block really is, after the allocation.
>
> > I'll revise the specification as:
> > 
> >     Synopsis
> > 		ssize_t realloci(void *p, size_t size);
> 
> By the way, looks like this is the same functionality as jemalloc's xallocx, 
> which
> 
>        The xallocx() function resizes the allocation at ptr in place to be at
>        least size bytes, and returns the real size of the allocation. If extra
>        is non-zero, an attempt is made to resize the allocation to be at least
>        (size + extra) bytes, though inability to allocate the extra byte(s)
>        will not by itself result in failure to resize.

Yup, it sounds like it.  I guess we can take that as prior art.  :)
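
As for asking how big the block really is after the allocation, one
existing way is malloc_usable_size(), a non-standard extension available
in glibc and musl.  A minimal sketch, assuming such a libc;
alloc_with_capacity() is a name made up for this example, and glibc's
manual page presents the function mainly as a debugging and introspection
aid.

```c
#include <assert.h>
#include <malloc.h>	/* malloc_usable_size(): extension, not ISO C or POSIX */
#include <stdlib.h>

/* Allocate at least `want` bytes and return what the allocator reports
   as the usable size of the block, or 0 if the allocation failed. */
static size_t alloc_with_capacity(void **pp, size_t want)
{
	*pp = malloc(want);
	if (*pp == NULL)
		return 0;
	return malloc_usable_size(*pp);
}
```

Unlike realloci() as discussed here, this only queries the size; it never
attempts to resize anything.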


Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
