Date: Fri, 7 Jul 2017 22:01:21 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: Clever code in malloc()

On Thu, Jul 06, 2017 at 07:10:25PM -0400, Rich Felker wrote:
> The interesting case is when size (the size of an existing free chunk)
> is large and a relatively small new chunk (new_size) is being merged
> into it. This is going to happen A LOT, and you don't want to make a
> syscall each time or free will be unusably slow. What the inequality
> expression above does is check whether the increase is crossing a
> power-of-two boundary, resulting in a sort of exponential backoff.
> 
> To see how it's working, consider the bits of the sum new_size+size.
> There will be a highest bit position at which the sum differs from
> size, and the xor operation will then clear all bits above that point
> (and muddle up everything below). The result can be greater than size
> if and only if the sum introduced at least one new bit place above the
> highest bit set in size. Otherwise, the xor cleared at least one
> leading bit of size without introducing a new higher one, resulting in
> a smaller value.
> 

Thank you for the explanation. But does that not mean that an
application could trigger the madvise() at every free() call if the only
available chunk just exceeds a power of two in size, and the application
keeps allocating and then freeing enough to put the chunk under that
size?

Let's take a statically linked application on a thirty-two bit system
(statically linked, so the dynamic memory reclamation doesn't get in the
way of my example, and thirty-two bits because the header size is
machine word dependent, and I need to choose something). Let's further
assume that brk() works.

The application allocates twenty bytes. adjust_size() will increase this
size by two machine words, then round up to the next four-word boundary.
So this comes out at thirty-two bytes.

Since nothing is allocated so far, a heap expansion is triggered. A page
is allocated. The initial chunk in that page will be four thousand
eighty-eight bytes to make room for the footer. The requested thirty-two
byte chunk will be split off, leaving the first heap page with a
thirty-two byte chunk allocated and a four thousand fifty-six byte chunk
freed.

The application then requests four thousand and forty bytes. After
adjustment this comes out at four thousand and fifty-six, so the last
free chunk is allocated.

The application then requests another ten bytes (thirty-two after
adjustment), which forces a second heap page to be mapped; it then
allocates the rest of that second page and never frees it. Next it
frees the four thousand and fifty-six byte chunk
and the second thirty-two byte chunk. Since adjacent chunks are
coalesced, we now have a free chunk of four thousand and eighty-eight
bytes in size. If the application now frees its initial thirty-two byte
chunk, the chunk size puts the free chunk size over the edge of the four
thousand ninety-six boundary, so the expression triggers.

If the application now continues to allocate and free twenty bytes, the
expression will trigger at every free().

Well, OK, it won't, but only because I ignored the minimum size
requirement of 160k.

The example is admittedly contrived, but many more are conceivable.

> Rich

Ciao,
Markus
