Date: Mon, 19 Sep 2022 22:15:12 -0400
From: Rich Felker <>
To: baiyang <>
Cc: musl <>
Subject: Re: Re: The heap memory performance (malloc/free/realloc) is
 significantly degraded in musl 1.2 (compared to 1.1)

On Tue, Sep 20, 2022 at 09:18:04AM +0800, baiyang wrote:
> > There is no hidden "size actually allocated internally". The size you
> > get is the size you requested. Everything else is allocator data
> > structures *outside of the object* that the caller has no entitlement
> > to peek or poke at, and malloc_usable_size's return value reflects
> > that.
> If I understand correctly, according to the definition of
> size_classes in the mallocng code:
> 1. When I call `void* p = malloc(6600)`, mallocng actually allocates
> more than 8100 bytes of usable space, right?

No, it uses space from a size-class-8176 group (~=slab) to produce an
allocation of size 6600. The *allocation* is the part that belongs to
the caller. Everything else is part of the allocator data structures.
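
To make the distinction concrete, here is a minimal sketch of how a caller
could observe what malloc_usable_size reports for a given request. The
helper name usable_size_for_request is hypothetical and exists only for
illustration. As described in this thread, mallocng reports the requested
size (6600 for the example above); other allocators may report something
larger, so the only portable expectation is "at least the requested size".

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size: non-standard, declared here on musl and glibc */
#include <stdlib.h>

/* Hypothetical helper (not part of any libc): allocate n bytes and
 * return what malloc_usable_size reports for that allocation.
 * Returns 0 if the allocation fails. */
size_t usable_size_for_request(size_t n)
{
    void *p = malloc(n);
    if (!p)
        return 0;

    size_t usable = malloc_usable_size(p);

    /* The reported size is never smaller than the request. Whether it
     * is exactly n or something larger depends on the allocator. */
    assert(usable >= n);

    free(p);
    return usable;
}
```

Calling usable_size_for_request(6600) under different libcs is one way to
see the behavior difference being discussed, without relying on any
allocator internals.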

> 2. According to your previous explanation, calling
> malloc_usable_size(p) at this time returns 6600, right?


> My question is, if malloc_usable_size(p) can directly return 8191
> (or similar actual allocated size, as other libc do) instead of
> 6600, is it possible to make mallocng achieve higher performance
> both in time and space?

No, and the reason you gave for wanting it does not make sense. You
seem to think that if the group stride were 8100, calling realloc might
memcpy up to 8100 bytes. This is not the case. If realloc has to
allocate a new object, the amount copied will be 6600, i.e. exactly
the allocated object size (or the new size, if smaller).
This is the only meaningful number.
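
The copy-length rule can be sketched as follows. This is an illustration
of the rule only, not mallocng's realloc: the names sketch_copy_len and
sketch_realloc_move are hypothetical, and the old allocation size is
passed in explicitly where a real allocator would recover it from its
own metadata.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: bytes to copy when realloc must move an object.
 * It is the smaller of the old allocation size and the new size; the
 * group stride or size class never enters into it. */
size_t sketch_copy_len(size_t old_size, size_t new_size)
{
    return old_size < new_size ? old_size : new_size;
}

/* Hypothetical move-only realloc. Assumes old is non-NULL, was obtained
 * from malloc with size old_size, and new_size > 0. On failure the old
 * object is left untouched and NULL is returned. */
void *sketch_realloc_move(void *old, size_t old_size, size_t new_size)
{
    void *fresh = malloc(new_size);
    if (!fresh)
        return NULL;

    memcpy(fresh, old, sketch_copy_len(old_size, new_size));
    free(old);
    return fresh;
}
```

With the numbers from this thread, growing a 6600-byte allocation copies
6600 bytes regardless of whether the slot it lived in was 8100 or 8176
bytes wide.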

You also seem to be under the impression that the work to determine
that the size was 6600 and not 8100 is where most (or at least a
significant portion of) the time is spent. This is also not the case.
The majority of the metadata processing time is chasing pointers back
to the out-of-band metadata, validating it, validating that it
round-trips back, and validating various other things. Some of these
could in principle be omitted at the cost of loss of hardening.
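
As a rough picture of what "chasing pointers and validating that they
round-trip" means, here is a toy sketch. It is NOT mallocng's actual
data layout or code: the struct names, field names, and constants are
all invented for illustration. It only shows the general shape of the
checks: follow a pointer to separate metadata, confirm the metadata
points back, and confirm the user pointer lands on a valid slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS     4
#define TOY_SLOT_SIZE 64

struct toy_meta;

/* Toy "group": storage for several slots plus a link to its metadata. */
struct toy_group {
    struct toy_meta *meta;
    unsigned char storage[TOY_SLOTS * TOY_SLOT_SIZE];
};

/* Toy metadata record, kept separately from the storage it describes. */
struct toy_meta {
    struct toy_group *mem;   /* back pointer to the group */
    size_t slot_size;
    int nslots;
};

/* Return the slot index p refers to in g, or -1 if any check fails. */
int toy_slot_index(const struct toy_group *g, const void *p)
{
    const struct toy_meta *m = g->meta;
    if (!m)
        return -1;
    if (m->mem != g)                      /* round-trip check */
        return -1;

    uintptr_t base = (uintptr_t)g->storage;
    uintptr_t addr = (uintptr_t)p;
    if (addr < base)
        return -1;

    size_t off = (size_t)(addr - base);
    if (m->slot_size == 0 || off % m->slot_size != 0)
        return -1;                        /* not on a slot boundary */

    size_t idx = off / m->slot_size;
    if (m->nslots < 0 || idx >= (size_t)m->nslots)
        return -1;                        /* past the last slot */
    return (int)idx;
}

/* Exercise the checks; returns 1 if every case behaves as described. */
int toy_demo(void)
{
    static struct toy_group g;
    static struct toy_meta m;
    m.mem = &g;
    m.slot_size = TOY_SLOT_SIZE;
    m.nslots = TOY_SLOTS;
    g.meta = &m;

    int valid      = toy_slot_index(&g, g.storage + 2 * TOY_SLOT_SIZE) == 2;
    int misaligned = toy_slot_index(&g, g.storage + 1) == -1;

    m.mem = NULL;                         /* simulate a corrupted back pointer */
    int corrupted  = toy_slot_index(&g, g.storage) == -1;

    return valid && misaligned && corrupted;
}
```

Each check is cheap on its own, but every one involves a dependent load
through a pointer, which is the kind of cost the paragraph above refers
to. Dropping any of them would make the lookup faster and less robust
against corrupted or forged pointers.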

Figuring out that the allocation is 6600 bytes, once you already know
the size class and out-of-band metadata, is quite trivial and hardly
takes any of the time. (It also has a few validation checks that could
be omitted at the cost of loss of hardening, but these are
proportionally much smaller.)

