Message-ID: <kwmfqchzngw4z4jjdougougxsyuyf4gnydusfu56ovxxyonr2c@r6qgbf34tukh>
Date: Mon, 30 Jun 2025 04:27:52 +0200
From: Alejandro Colomar <alx@...nel.org>
To: libc-alpha@...rceware.org
Cc: bug-gnulib@....org, musl@...ts.openwall.com,
наб <nabijaczleweli@...ijaczleweli.xyz>, Douglas McIlroy <douglas.mcilroy@...tmouth.edu>,
Paul Eggert <eggert@...ucla.edu>, Robert Seacord <rcseacord@...il.com>,
Elliott Hughes <enh@...gle.com>, Bruno Haible <bruno@...sp.org>,
JeanHeyd Meneide <phdofthehouse@...il.com>, Rich Felker <dalias@...c.org>,
Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>, Joseph Myers <josmyers@...hat.com>,
Florian Weimer <fweimer@...hat.com>, Laurent Bercot <ska-dietlibc@...rnet.org>,
Andreas Schwab <schwab@...e.de>, Thorsten Glaser <tg@...bsd.de>, Eric Blake <eblake@...hat.com>,
Vincent Lefevre <vincent@...c17.net>, Mark Harris <mark.hsj@...il.com>,
Collin Funk <collin.funk1@...il.com>, Wilco Dijkstra <Wilco.Dijkstra@....com>,
DJ Delorie <dj@...hat.com>, Cristian Rodríguez <cristian@...riguez.im>,
Siddhesh Poyarekar <siddhesh@...plt.org>, Sam James <sam@...too.org>, Mark Wielaard <mark@...mp.org>,
"Maciej W. Rozycki" <macro@...hat.com>, Martin Uecker <ma.uecker@...il.com>,
Christopher Bazley <chris.bazley.wg14@...il.com>, eskil@...ession.se,
Daniel Krügler <daniel.kruegler@...glemail.com>, Kees Cook <keescook@...omium.org>,
Valdis Klētnieks <valdis.kletnieks@...edu>
Subject: Re: alx-0029r6 - Restore the traditional realloc(3) specification
Hi all,
This paper is now submitted to the C Committee:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3621.txt>
Have a lovely day!
Alex
On Fri, Jun 27, 2025 at 04:01:54PM +0200, Alejandro Colomar wrote:
> Hi!
>
> Here's a new revision of the proposal, addressing some points raised by
> Mark, plus clarifying that the paragraph about when size is zero refers
> to the total size, as Florian was concerned that it might not be
> symmetric.
>
>
> Have a lovely day!
> Alex
>
> ---
> Name
> alx-0029r6 - Restore the traditional realloc(3) specification
>
> Principles
> - Uphold the character of the language
> - Keep the language small and simple
> - Facilitate portability
> - Avoid ambiguities
> - Pay attention to performance
> - Codify existing practice to address evident deficiencies.
> - Do not prefer any implementation over others
> - Ease migration to newer language editions
> - Avoid quiet changes
> - Enable secure programming
>
> Category
> Remove UB.
>
> Author
> Alejandro Colomar <alx@...nel.org>
>
> Cc: <bug-gnulib@....org>
> Cc: <musl@...ts.openwall.com>
> Cc: <libc-alpha@...rceware.org>
> Cc: наб <nabijaczleweli@...ijaczleweli.xyz>
> Cc: Douglas McIlroy <douglas.mcilroy@...tmouth.edu>
> Cc: Paul Eggert <eggert@...ucla.edu>
> Cc: Robert Seacord <rcseacord@...il.com>
> Cc: Elliott Hughes <enh@...gle.com>
> Cc: Bruno Haible <bruno@...sp.org>
> Cc: JeanHeyd Meneide <phdofthehouse@...il.com>
> Cc: Rich Felker <dalias@...c.org>
> Cc: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
> Cc: Joseph Myers <josmyers@...hat.com>
> Cc: Florian Weimer <fweimer@...hat.com>
> Cc: Andreas Schwab <schwab@...e.de>
> Cc: Thorsten Glaser <tg@...bsd.de>
> Cc: Eric Blake <eblake@...hat.com>
> Cc: Vincent Lefevre <vincent@...c17.net>
> Cc: Mark Harris <mark.hsj@...il.com>
> Cc: Collin Funk <collin.funk1@...il.com>
> Cc: Wilco Dijkstra <Wilco.Dijkstra@....com>
> Cc: DJ Delorie <dj@...hat.com>
> Cc: Cristian Rodríguez <cristian@...riguez.im>
> Cc: Siddhesh Poyarekar <siddhesh@...plt.org>
> Cc: Sam James <sam@...too.org>
> Cc: Mark Wielaard <mark@...mp.org>
> Cc: "Maciej W. Rozycki" <macro@...hat.com>
> Cc: Martin Uecker <ma.uecker@...il.com>
> Cc: Christopher Bazley <chris.bazley.wg14@...il.com>
> Cc: <eskil@...ession.se>
> Cc: Daniel Krügler <daniel.kruegler@...glemail.com>
> Cc: Kees Cook <keescook@...omium.org>
> Cc: Valdis Klētnieks <valdis.kletnieks@...edu>
>
> History
> <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0029.git/>
>
> r0 (2025-06-17):
> - Initial draft.
>
> r1 (2025-06-20):
> - Full rewrite after the recent glibc discussion.
>
> r2 (2025-06-21):
> - Remove CC. Add CC.
> - wfix.
> - Drop quote.
> - Add a few more principles
> - Clarify why ENOMEM is used in this proposal, and make it
> optional.
> - Mention exceptional leak in code checking (size != 0).
> - Clarify that part of the description of realloc can be
> editorially removed after this change.
>
> r3 (2025-06-23):
> - Fix diff missing line.
> - Remove ENOMEM from the proposal.
> - Clarify that ENOMEM should be retained by platforms already
> using it.
> - Add mention that LLVM's address sanitizer will catch the leak
> mentioned in r2.
> - Add links to real bugs (including an RCE bug).
>
> r4 (2025-06-24):
> - Use a better link for the Whatsapp RCE.
> - s/Description/Rationale/
> - wfix
> - Mention that glibc <2.1.1 had the BSD behavior.
> - Add footnote that realloc(3) may fail while shrinking.
>
> r5 (2025-06-26):
> - It was glibc 2.1.1 that broke it, not glibc 2.2.
> - wfix
> - Mention in the footnote that the pointer may change.
> - Document why not go the other way around. It was explained
> several times during discussion, but people keep suggesting
> it.
>
> r6 (2025-06-27):
> - Clarify that the paragraph about what happens when the size
> is zero refers to when the total size is zero (for calloc(3)
> that is nmemb*size).
> - s/Unix V7/V7 Unix/
> - tfix.
> - wfix.
>
> See also
> <https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
> <https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
> <https://inbox.sourceware.org/libc-alpha/20241019014002.3684656-1-siddhesh@sourceware.org/T/#u>
> <https://inbox.sourceware.org/libc-alpha/qukfe5yxycbl5v7ooskvqdnm3au3orohbx4babfltegi47iyly@or6dgf7akeqv/T/#u>
> <https://github.com/bminor/glibc/commit/7c2b945e1fd64e0a5a4dbd6ae6592a7314dcd4b5>
> <https://github.com/llvm/llvm-project/issues/113065>
> <https://www.austingroupbugs.net/view.php?id=400>
> <https://www.austingroupbugs.net/view.php?id=526>
> <https://www.austingroupbugs.net/view.php?id=688>
> <https://sourceware.org/bugzilla/show_bug.cgi?id=12547>
> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_400.htm>
> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n868.htm>
> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
> <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
> <https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/realloc.html>
> <https://pubs.opengroup.org/onlinepubs/9699919799.2013edition/functions/realloc.html>
> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120744>
> <https://lore.kernel.org/lkml/20220213182443.4037039-1-keescook@chromium.org/>
> <https://awakened1712.github.io/hacking/hacking-whatsapp-gif-rce/>
> <https://gbhackers.com/whatsapp-double-free-vulnerability/>
>
> Rationale
> The specification of realloc(3) has been problematic since the
> very first standards, even before ISO C. The wording has
> changed significantly, straining to permit implementations
> to return a null pointer when the requested size is zero. This
> originated from the intent of banning zero-sized objects from
> the language in C89, but in retrospect that never worked well,
> as we can see from the fallout.
>
> None of the specifications have been good, and C23 finally gave
> up and made it undefined behavior.
>
> The problem is not only theoretical. Programmers don't know how
> to use realloc(3) correctly, and have written weird code in
> their attempts. This has resulted in a lot of nonsensical code
> in configure scripts[1], and even bugs in actual programs[2].
>
> [1] <https://codesearch.debian.net/search?q=%5Cbrealloc%5B+%5Ct%5D*%5B%28%5D%5B%5E%2C%5D*%2C%5B+%5Ct%5D0%5B%29%5D&literal=0>
> [2] <https://lore.kernel.org/lkml/20220213182443.4037039-1-keescook@chromium.org/>
>
> In some cases, this nonsensical code has resulted in RCEs[3].
>
> [3] <https://awakened1712.github.io/hacking/hacking-whatsapp-gif-rce/>
>
> However, this doesn't need to be like that. The traditional
> implementation of realloc(3), present in V7 Unix, inherited by
> the BSDs, and currently available in a range of systems,
> including musl libc, doesn't have any issues regarding zero-size
> allocations. glibc --which uses an independent implementation
> rather than a Unix derivative-- also had this behavior
> originally; it changed to the current behavior in 1999
> (glibc 2.1.1), only for compatibility with C89. Ironically,
> C99 was released soon after and removed the text that glibc was
> trying to comply with. C99 also introduced some new text that
> was very confusing, and one of its interpretations would make
> the new glibc behavior non-conforming.
>
> Code written for platforms returning a null pointer can be
> migrated to platforms returning non-null, without significant
> issues.
>
> There are two kinds of code that call realloc(p,0). One
> hard-codes the 0, and is used as a replacement of free(p). This
> code ignores the return value, since it's unimportant. This
> code currently produces a leak of 0 bytes plus associated
> metadata on platforms such as musl libc, where it returns a
> non-null pointer. However, assuming that there are programs
> written with the knowledge that they won't ever be run on such
> platforms, we should take care of that, and make sure they don't
> leak. A way of accomplishing this would be to recommend that
> implementations issue a diagnostic when realloc(3) is called
> with a hardcoded zero. This is only an informal recommendation
> made by this proposal, as this is a matter of QoI, and the
> standard shouldn't say anything about it. This would prevent
> this class of minor leaks.
>
> Moreover, in glibc, realloc(p,0) may return non-null, in the
> case where p is NULL, so code must already take that into
> account, and thus code that simply takes realloc(p,0) as a
> synonym of free(p) is already leaky, as free(NULL) is a no-op,
> but realloc(NULL,0) allocates 0 bytes.
>
> The other kind of code is in algorithms that realloc(3) an
> arbitrary size, which might eventually be zero. This gets more
> complex.
>
> Here's the code that should be written for AIX or glibc:
>
> errno = 0;
> new = realloc(old, size);
> if (new == NULL) {
> 	if (errno == ENOMEM)
> 		free(old);
> 	goto fail;
> }
> ...
> free(new);
>
> Failing to check for ENOMEM in these platforms before freeing
> the old pointer would result in a double-free. If the program
> decides to continue using the old pointer instead of freeing it,
> it would result in a use-after-free.
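As a rough sketch of the AIX/glibc pattern above, the errno check can be wrapped in a helper that also clears the caller's pointer, so neither the double-free nor the use-after-free can happen by accident. The name resize_or_release() and its return convention are illustrative assumptions, not part of the proposal, and the sketch assumes a platform that sets errno to ENOMEM on allocation failure.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (illustration only): resize *bufp to size bytes.
 * Returns 0 on success and -1 otherwise.  When it returns -1, the old
 * block has been released exactly once and *bufp is set to NULL.
 */
static int
resize_or_release(void **bufp, size_t size)
{
	void *newp;

	errno = 0;
	newp = realloc(*bufp, size);
	if (newp == NULL) {
		/*
		 * Free only if realloc(3) reported out-of-memory.
		 * Otherwise (size == 0 on AIX or glibc) the old block
		 * has already been freed by realloc(3) itself.
		 */
		if (errno == ENOMEM)
			free(*bufp);
		*bufp = NULL;
		return -1;
	}
	*bufp = newp;
	return 0;
}
```

Clearing *bufp on every failure path is a design choice of this sketch: the caller cannot tell the two null-return cases apart afterwards, but it also cannot touch a dangling pointer.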
>
> In the platforms where realloc(p,0) returns non-null, such as
> the BSDs or musl libc, it is simpler to handle it:
>
> new = realloc(old, size);
> if (new == NULL) {  // errno is ENOMEM
> 	free(old);
> 	goto fail;
> }
> ...
> free(new);
>
> Whenever the result is a null pointer, these platforms are
> reporting an ENOMEM error, and thus it is superfluous to check
> errno there.
>
> Most code is written in this way, even if run on platforms
> returning a null pointer. This is because most programmers are
> just unaware of this problem. Part of the reason is also that
> returning a non-null pointer with zero bytes is the natural
> extension of the behavior, which is what programmers intuitively
> expect from libc; that is, if realloc(p,3) allocates 3 bytes,
> r(p,2) allocates two bytes, and r(p,1) allocates one byte, it is
> natural by induction to expect that r(p,0) will allocate zero
> bytes. Most algorithms naturally extend to 0 just fine, and
> special casing 0 is artificial.
>
> If the realloc(3) specification were changed to require that
> realloc(p,0) returns non-null on success, and that realloc(p,0)
> only fails when out-of-memory (and assuming the implementations
> will continue setting errno to ENOMEM), then code written for
> AIX or glibc would continue working just fine, since the errno
> check would be redundant with the null check. Simply, the
> conditional (errno == ENOMEM) would always be true when
> (new == NULL).
>
> Then, there are non-POSIX platforms that don't set ENOMEM. In
> those platforms, code might do this:
>
> new = realloc(old, size);
> if (new == NULL) {
> 	if (size != 0)
> 		free(old);
> 	goto fail;
> }
> ...
> free(new);
>
> That code would continue working with this proposal, except for
> a very rare corner case, in which it would leak. In the normal
> case, (size != 0) would never be true under (new == NULL),
> because a reallocation of 0 bytes would almost always succeed,
> and thus not return a null pointer under this proposal.
> However, in some cases, the system might not find space even for
> the small metadata needed for a 0-byte allocation. In such
> case, the (size != 0) conditional would prevent deallocating
> 'old', and thus cause a memory leak. This case is exceptional
> enough that it shouldn't stop us from fixing realloc(3).
> Anyway, on an out-of-memory case, the program is likely to
> terminate rather soon, so the issue is even less likely to have
> an impact on any existing programs. Also, LLVM's address
> sanitizer will soon be able to catch such a leak:
> <https://github.com/llvm/llvm-project/issues/113065>
>
> This proposal makes handling of realloc(3) as straightforward as
> one would expect, with only two states: success or error. There
> are no in-between states.
>
> The resulting wording in the standard is also much simpler, as
> it doesn't need to define so many special cases.
>
> For consistency, all the other allocation functions are updated
> to both return a null pointer on error, and use consistent
> wording.
>
> Why not go the other way around?
> Some people keep asking why not go the other way around: why not
> force the BSDs and musl to return a null pointer if size is 0.
> This would result in double-free and use-after-free bugs, which
> can result in RCE vulnerabilities (remote code execution), which
> is clearly unacceptable.
>
> Consider this code, which is the usual code for calling
> realloc(3) in such systems:
>
> new = realloc(old, size);
> if (new == NULL) {
> 	free(old);
> 	goto fail;
> }
> ...
> free(new);
>
> If realloc(p,0) returned a null pointer and freed the old
> block, then the third line, free(old), would be a double-free bug.
>
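The double-free described above can be illustrated without invoking undefined behavior by using a toy model that only counts how often a block is released. Everything here (struct toy_block, toy_free(), toy_realloc_null_on_zero(), usual_pattern()) is hypothetical and exists only for this illustration; it is not a real allocator and not part of the proposal.

```c
#include <stddef.h>

/* Toy model, not a real allocator: tracks a single block so that the
 * number of times it is released can be counted. */
struct toy_block {
	int live;      /* 1 while the block is allocated */
	int releases;  /* how many times the block has been released */
};

static void
toy_free(struct toy_block *b)
{
	if (b == NULL)
		return;
	b->live = 0;
	b->releases++;
}

/* Models the "other way around" behavior: a size of zero frees the
 * block and returns a null pointer. */
static struct toy_block *
toy_realloc_null_on_zero(struct toy_block *b, size_t size)
{
	if (size == 0) {
		toy_free(b);
		return NULL;
	}
	return b;  /* the toy never fails and never moves the block */
}

/* The usual BSD/musl calling pattern.  Returns how many times the
 * block was released; anything above 1 models a double-free. */
static int
usual_pattern(size_t size)
{
	struct toy_block old = { 1, 0 };
	struct toy_block *newp;

	newp = toy_realloc_null_on_zero(&old, size);
	if (newp == NULL) {
		toy_free(&old);  /* second release when size == 0 */
		return old.releases;
	}
	toy_free(newp);
	return old.releases;
}
```

In this model, usual_pattern(16) releases the block once, while usual_pattern(0) releases it twice, which is the double-free that a real allocator would turn into memory corruption.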
> Prior art
> gnulib
> gnulib provides the realloc-posix module, which aims to wrap the
> system realloc(3) and reallocarray(3) functions so that they
> behave in a POSIX-complying manner.
>
> It previously behaved like glibc. After I reported that it was
> non-conforming to POSIX, we discussed the best way forward,
> which we agreed was the same direction that this paper is
> proposing now for C2y. The implementation was changed in
>
> gnulib.git d884e6fc4a60 (2024-11-04; "realloc-posix: realloc (..., 0) now returns nonnull")
>
> There have been no regression reports since then, as we
> expected.
>
> V7 Unix, BSD
> The proposed behavior is the one endorsed by Doug McIlroy, the
> author of the original implementation of realloc(3) in V7 Unix,
> and also present in the BSDs.
>
> glibc <= 2.1
> glibc was implemented originally to return non-null. It was
> only in 1999, and purely to comply with the standards --with no
> requests by users to do so--, that the glibc maintainers decided
> to switch to the current behavior.
>
> Design decisions
> This proposal needs two changes, which can be applied all at once,
> or in separate steps.
>
> The first step would make realloc(p,s) be consistent with
> free(p) and malloc(s), including when p is a null pointer, when
> s is zero, and also when both corner cases happen at the same
> time. This change would already turn the implementations where
> malloc(0) returns non-null into the end goal we have. This
> would require changes to (at least) the following
> implementations: glibc, Bionic, Windows.
>
> The second step would be to require that malloc(0) returns a
> non-null pointer. This would require changes to (at least) the
> following implementations: AIX.
>
> This proposal has merged all steps into a single proposal.
>
> Future directions
> This proposal, by specifying realloc(3) as-if by calling
> free(3) and malloc(3), makes redundant several mentions of
> realloc(3) next to either free(3) or malloc(3) in the standard.
> We could remove them in this proposal, or clean that up in a
> separate (mostly editorial) proposal. Let's keep it for a
> future proposal for now.
>
> Caveats
> n?n:1
> Code written today should be careful, in case it can run on
> older systems that are not fixed to comply with this stricter
> specification. Thus, code written today should call realloc(3)
> similar to this:
>
> realloc(p, n?n:1);
>
> When all existing implementations are fixed to comply with this
> stricter specification, that workaround can be removed.
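The n?n:1 workaround above could be kept in one place with a small wrapper, so that it is easy to remove later. This is only a sketch; the name realloc_nonzero() is made up for this example and does not come from the proposal or from any libc.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper (illustration only): never passes a size of
 * zero to realloc(3), so a null return always means that the
 * allocation failed and that the old block is still valid.
 */
static void *
realloc_nonzero(void *p, size_t n)
{
	return realloc(p, n ? n : 1);
}
```

Callers can then treat NULL uniformly as an error, as in the BSD/musl pattern, at the cost of allocating one byte where zero were requested.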
>
> ENOMEM
> Existing implementations that set errno to ENOMEM must continue
> doing so when the input pointer is not freed. If they didn't,
> code that is currently portable to all POSIX systems
>
> errno = 0;
> new = realloc(old, size);
> if (new == NULL) {
> 	if (errno == ENOMEM)
> 		free(old);
> 	goto fail;
> }
> ...
> free(new);
>
> would leak on error.
>
> Since it is currently impossible to write code that is
> portable to arbitrary C17 systems, this is not an issue in
> ISO C.
>
> - New code written for C2y will only need to check for
> NULL to detect errors.
>
> - Code written for specific C17 and older platforms
> that don't set errno will continue to work for those
> specific platforms.
>
> - Code written for POSIX.1-2024 and older platforms
> will continue working on POSIX C2y platforms,
> assuming that POSIX will continue mandating ENOMEM.
>
> - Code written for POSIX.1-2024 and older will not be
> able to be run on non-POSIX C2y platforms, but that
> could be expected.
>
> The only important thing is that platforms that did set ENOMEM
> should continue setting it, to avoid introducing leaks.
>
> Proposed wording
> Based on N3550.
>
> 7.25.4.1 Memory management functions :: General
> @@ p1
> ...
> -If the size of the space requested is zero,
> +If the total size of the space requested is zero,
> -the behavior is implementation-defined:
> -either
> -a null pointer is returned to indicate the error,
> -or
> the behavior is as if the size were some nonzero value,
> except that the returned pointer shall not be used
> to access an object.
>
> 7.25.4.2 The aligned_alloc function
> @@ Returns, p3
> The <b>aligned_alloc</b> function returns
> -either
> -a null pointer
> -or
> -a pointer to the allocated space.
> +a pointer to the allocated space
> +on success.
> +If
> +the space cannot be allocated,
> +a null pointer is returned.
>
> 7.25.4.3 The calloc function
> @@ Returns, p3
> The <b>calloc</b> function returns
> -either
> a pointer to the allocated space
> +on success.
> -or a null pointer
> -if
> +If
> the space cannot be allocated
> or if the product <tt>nmemb * size</tt>
> -would wraparound <b>size_t</b>.
> +would wraparound <b>size_t</b>,
> +a null pointer is returned.
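For readers less familiar with the wraparound condition in the calloc wording, here is one common way to test whether nmemb * size would wrap around size_t without performing the multiplication. The helper name total_size_wraps() is an assumption made for this sketch; it is not proposed wording and not how any particular libc is claimed to implement calloc(3).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): returns nonzero if the
 * product nmemb * size would wrap around size_t.  A total size of
 * zero (either operand zero) never wraps.
 */
static int
total_size_wraps(size_t nmemb, size_t size)
{
	return size != 0 && nmemb > SIZE_MAX / size;
}
```

Under the proposed wording, a call for which this check is true returns a null pointer, whereas a total size of zero is a valid request that succeeds unless memory is exhausted.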
>
> 7.25.4.7 The malloc function
> @@ Returns, p3
> The <b>malloc</b> function returns
> -either
> -a null pointer
> -or
> -a pointer to the allocated space.
> +a pointer to the allocated space
> +on success.
> +If
> +the space cannot be allocated,
> +a null pointer is returned.
>
> 7.25.4.8 The realloc function
> @@ Description, p2
> The <b>realloc</b> function
> deallocates the old object pointed to by <tt>ptr</tt>
> +as if by a call to <b>free</b>,
> and returns a pointer to a new object
> -that has the size specified by <tt>size</tt>.
> +that has the size specified by <tt>size</tt>
> +as if by a call to <b>malloc</b>.
> The contents of the new object
> shall be the same as that of the old object prior to deallocation,
> up to the lesser of the new and old sizes.
> Any bytes in the new object
> beyond the size of the old object
> have unspecified values.
>
> @@ p3
> If <tt>ptr</tt> is a null pointer,
> the <b>realloc</b> function behaves
> like the <b>malloc</b> function for the specified size.
> Otherwise,
> if <tt>ptr</tt> does not match a pointer
> earlier returned by a memory management function,
> or
> if the space has been deallocated
> by a call to the <b>free</b> or <b>realloc</b> function,
> ## We can probably remove all of the above, because of the
> ## behavior now being defined as-if by calls to malloc(3) and
> ## free(3). But let's do that editorially in a separate change.
> -or
> -if the size is zero,
> ## We're defining the behavior.
> the behavior is undefined.
> If
> -memory for the new object is not allocated,
> +the space cannot be allocated,
> ## Editorial; for consistency with the wording of the other functions.
> the old object is not deallocated
> and its value is unchanged.
> +XXX)
>
> @@ New footnote XXX
> +XXX)
> +While atypical,
> +<b>realloc</b> may fail
> +or return a different pointer
> +for a call that shrinks the block of memory.
>
> @@ Returns, p4
> The <b>realloc</b> function returns
> a pointer to the new object
> (which can have the same value
> -as a pointer to the old object),
> +as a pointer to the old object)
> +on success.
> -or
> +If
> +space cannot be allocated,
> a null pointer
> -if the new object has not been allocated.
> +is returned.
>
> --
> <https://www.alejandro-colomar.es/>
--
<https://www.alejandro-colomar.es/>