Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAA2zVHpo7_uFozK1PzK3igmtxCEAhsB4=J8L_aRibfk_Adg1zQ@mail.gmail.com>
Date: Wed, 12 Nov 2025 13:35:18 -0500
From: James Y Knight <jyknight@...gle.com>
To: Rich Felker <dalias@...c.org>
Cc: Thorsten Glaser <tg@...lvis.org>, Alejandro Colomar <alx@...nel.org>, musl@...ts.openwall.com, 
	The 8472 <the8472.rs@...inite-source.de>, Thiago Macieira <thiago@...ieira.org>, 
	Florian Weimer <fw@...eb.enyo.de>, libc-alpha@...rceware.org, 
	"Arthur O'Dwyer" <arthur.j.odwyer@...il.com>, Jonathan Wakely <jwakely@...hat.com>
Subject: Re: Re: realloci(): A realloc() variant that works in-place

On Tue, Nov 11, 2025 at 9:04 PM Rich Felker <dalias@...c.org> wrote:
> > Just to take a concrete example, GNU libstdc++'s std::string allocates
> > the exact size requested, initially. So, appending a char will always
> > initially grow the container, by doubling the size
>
> But in order to utilize a new interface, libstdc++ would have to be
> modified to do so, and there would be all sorts of conditional support
> and namespacing issues to deal with.
>
> If you're stuck modifying libstdc++ anyway, you might as well just
> make it use a better strategy than initially using exact size, then
> doubling. This works everywhere and does not require any new contracts
> or feature detection.

The "better" strategy is pretty dependent on the malloc
implementation's internal implementation details.

Many strings are initialized with particular data and then never
grown, so you want to avoid reserving more memory for the initial
allocation -- but enable using the remainder of the already-reserved
space if there is any.

We can look at llvm libc++'s std::string implementation as an example.
It was initially written to round up its requested allocations to a
multiple of 16, exactly to avoid wasting reserved space. Presumably
the malloc implementation the author used at the time it was written
was rounding internally to multiples of 16 bytes. But of course, other
allocators have different behaviors. E.g. on glibc, rounding to 16 is
nearly _pessimal_, as glibc malloc only allocates buckets of size
24+16*n (so, 24, 40, 56, 72, 88, etc). So on glibc, requesting sizes
that are multiples of 16-byte guarantees 8 wasted bytes in every
allocation. And every other allocator of course has its own unique
behaviors.

So now, libc++ rounds allocation requests to 8 bytes instead. This
avoids pessimizing commonly-used allocators which aren't rounding to
multiples of 16, and still preserves the ability to use some of the
slack space -- with the assumption that allocators are likely to at
least round sizes to a multiple of 8. But, that's still a compromise
which leaves memory on the table for any allocator which has
size-classes which are spaced larger than 8 bytes apart (which is most
of them).

Of course, libc++ could attempt to hardcode the allocation bucket
strategy of each libc, e.g. on glibc, always request 24+16*n
allocations. That would be effective as long as the implementations
match -- which cannot be guaranteed as the behavior of libc malloc
could change, or the user could replace the allocator. C++ allocators
are explicitly user-replaceable, and C malloc is typically
user-replaceable as an implementation extension. So, it's
significantly better to use an API which has the allocator directly
provide the required information than to try to guess a heuristic. So
you could call the nonstandard API, `size_t nallocx(size_t)`, which
returns the size which would be reserved for a given requested size.
That can be useful, but is suboptimal, as described by
http://wg21.link/p0901r11#nallocx

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.