Message-ID: <mpasvjei3x3b5tmvtn2yik6vh2gusziod2urwnsld2z3jzog3p@scz3r26u7czm>
Date: Fri, 20 Jun 2025 23:44:14 +0200
From: Alejandro Colomar <alx@...nel.org>
To: libc-alpha@...rceware.org
Cc: bug-gnulib@....org, musl@...ts.openwall.com, 
	наб <nabijaczleweli@...ijaczleweli.xyz>, Douglas McIlroy <douglas.mcilroy@...tmouth.edu>, 
	Paul Eggert <eggert@...ucla.edu>, Robert Seacord <rcseacord@...il.com>, 
	Elliott Hughes <enh@...gle.com>, Bruno Haible <bruno@...sp.org>, 
	JeanHeyd Meneide <phdofthehouse@...il.com>, Rich Felker <dalias@...c.org>, 
	Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>, Joseph Myers <josmyers@...hat.com>, 
	Florian Weimer <fweimer@...hat.com>, Andreas Schwab <schwab@...e.de>, Thorsten Glaser <tg@...bsd.de>, 
	Eric Blake <eblake@...hat.com>, Vincent Lefevre <vincent@...c17.net>, 
	Mark Harris <mark.hsj@...il.com>, Collin Funk <collin.funk1@...il.com>, 
	Wilco Dijkstra <Wilco.Dijkstra@....com>, DJ Delorie <dj@...hat.com>, 
	Cristian Rodríguez <cristian@...riguez.im>, Siddhesh Poyarekar <siddhesh@...plt.org>, 
	Sam James <sam@...too.org>, Mark Wielaard <mark@...mp.org>, 
	"Maciej W. Rozycki" <macro@...hat.com>, Martin Uecker <ma.uecker@...il.com>, 
	Christopher Bazley <chris.bazley.wg14@...il.com>, eskil@...ession.se
Subject: Re: alx-0029r1 - Restore the traditional realloc(3) specification

[CC -= Laurent, since it bounces]

On Fri, Jun 20, 2025 at 11:26:55PM +0200, Alejandro Colomar wrote:
> Hi!
> 
> After the useful discussion with Eric and Paul, I've rewritten a draft
> of a proposal I had for realloc(3) for C2y.  Here it is (see below).
> 
> I'll present it here before presenting it to the C Committee (although
> several members are CCd).
> 
> This time, I opted for an all-in-one change that puts us in the end
> goal, since some people were concerned that step-by-step might be less
> feasible.  Also, the wording is more consistent doing this at once, and
> people know what to expect from the beginning.
> 
> 
> Have a lovely day!
> Alex
> 
> ---
> Name
> 	alx-0029r1 - Restore the traditional realloc(3) specification
> 
> Principles
> 	-  Uphold the character of the language
> 	-  Keep the language small and simple
> 	-  Facilitate portability
> 	-  Avoid ambiguities
> 	-  Pay attention to performance
> 	-  Codify existing practice to address evident deficiencies.
> 	-  Avoid quiet changes
> 	-  Enable secure programming
> 
> Category
> 	Remove UB.
> 
> Author
> 	Alejandro Colomar <alx@...nel.org>
> 
> 	Cc: <bug-gnulib@....org>
> 	Cc: <musl@...ts.openwall.com>
> 	Cc: <libc-alpha@...rceware.org>
> 	Cc: наб <nabijaczleweli@...ijaczleweli.xyz>
> 	Cc: Douglas McIlroy <douglas.mcilroy@...tmouth.edu>
> 	Cc: Paul Eggert <eggert@...ucla.edu>
> 	Cc: Robert Seacord <rcseacord@...il.com>
> 	Cc: Elliott Hughes <enh@...gle.com>
> 	Cc: Bruno Haible <bruno@...sp.org>
> 	Cc: JeanHeyd Meneide <phdofthehouse@...il.com>
> 	Cc: Rich Felker <dalias@...c.org>
> 	Cc: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
> 	Cc: Joseph Myers <josmyers@...hat.com>
> 	Cc: Florian Weimer <fweimer@...hat.com>
> 	Cc: Laurent Bercot <ska-dietlibc@...rnet.org>
> 	Cc: Andreas Schwab <schwab@...e.de>
> 	Cc: Thorsten Glaser <tg@...bsd.de>
> 	Cc: Eric Blake <eblake@...hat.com>
> 	Cc: Vincent Lefevre <vincent@...c17.net>
> 	Cc: Mark Harris <mark.hsj@...il.com>
> 	Cc: Collin Funk <collin.funk1@...il.com>
> 	Cc: Wilco Dijkstra <Wilco.Dijkstra@....com>
> 	Cc: DJ Delorie <dj@...hat.com>
> 	Cc: Cristian Rodríguez <cristian@...riguez.im>
> 	Cc: Siddhesh Poyarekar <siddhesh@...plt.org>
> 	Cc: Sam James <sam@...too.org>
> 	Cc: Mark Wielaard <mark@...mp.org>
> 	Cc: "Maciej W. Rozycki" <macro@...hat.com>
> 	Cc: Martin Uecker <ma.uecker@...il.com>
> 	Cc: Christopher Bazley <chris.bazley.wg14@...il.com>
> 	Cc: <eskil@...ession.se>
> 
> History
> 	<https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0029.git/>
> 
> 	r0 (2025-06-17):
> 	-  Initial draft.
> 
> 	r1 (2025-06-20):
> 	-  Full rewrite after the recent glibc discussion.
> 
> See also
> 	<https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
> 	<https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
> 	<https://inbox.sourceware.org/libc-alpha/20241019014002.3684656-1-siddhesh@sourceware.org/T/#u>
> 	<https://inbox.sourceware.org/libc-alpha/qukfe5yxycbl5v7ooskvqdnm3au3orohbx4babfltegi47iyly@or6dgf7akeqv/T/#u>
> 	<https://github.com/bminor/glibc/commit/7c2b945e1fd64e0a5a4dbd6ae6592a7314dcd4b5>
> 	<https://www.austingroupbugs.net/view.php?id=400>
> 	<https://www.austingroupbugs.net/view.php?id=526>
> 	<https://www.austingroupbugs.net/view.php?id=688>
> 	<https://sourceware.org/bugzilla/show_bug.cgi?id=12547>
> 	<https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_400.htm>
> 	<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n868.htm>
> 	<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
> 	<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
> 	<https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/realloc.html>
> 	<https://pubs.opengroup.org/onlinepubs/9699919799.2013edition/functions/realloc.html>
> 
> Description
> 	Let's start by quoting the author of realloc(3).
> 
> 	On 2024-10-18 05:30, Douglas McIlroy wrote:
> 	> The discussion has taken a turn that's astonishing to one who
> 	> doesn't know the inside details of real compilers.
> 	>
> 	> Regardless of the behavior of malloc(0), one expects this
> 	> theorem to hold:
> 	>
> 	>	Given that p = malloc(n) is not NULL,
> 	>	that 0<=m<=n,
> 	>	and that malloc(m) could in some circumstance
> 	>	return a non-null pointer,
> 	>	then realloc(p,m) will return a non-null pointer.
> 	>
> 	> REALLOC_ZERO_BYTES_FREES flies in the face of this rational
> 	> expectation about dynamic storage allocation.  A diabolical
> 	> invention.
> 	>
> 	> Doug
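
A minimal sketch of how one might probe this theorem on a given platform.
The helper name shrink_returns_nonnull() is hypothetical.  It assumes the
POSIX convention that a genuine allocation failure sets errno to ENOMEM,
and note that calling it with m == 0 is undefined behavior under C23, so
that case is only meaningful on implementations that define it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical probe: allocate n bytes, shrink to m bytes (m <= n),
 * and report whether realloc(3) returned a non-null pointer.
 * Returns false as well if the initial malloc(3) fails. */
bool
shrink_returns_nonnull(size_t n, size_t m)
{
	void *p, *q;

	p = malloc(n);
	if (p == NULL)
		return false;	/* premise of the theorem not met */

	errno = 0;
	q = realloc(p, m);
	if (q == NULL) {
		/* Assumption: only ENOMEM means p is still live.  On
		 * glibc or AIX, a null result for m == 0 means p was
		 * already freed. */
		if (errno == ENOMEM)
			free(p);
		return false;
	}
	free(q);
	return true;
}
```

On platforms with the traditional behavior, this should report true for
every m in the range, including 0; on platforms where realloc(p,0) frees
and returns null, it reports false for m == 0.
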
> 
> 	The specification of realloc(3) has been problematic since the
> 	very first standards, even before ISO C.  The wording has
> 	changed significantly, in strained attempts to let
> 	implementations return a null pointer when the requested size is
> 	zero.  This originated from the intent of banning zero-sized
> 	objects from the language in C89, but in retrospect that never
> 	worked well, as we can see from the fallout.
> 
> 	None of the specifications have been good, and C23 finally gave
> 	up and made it undefined behavior.
> 
> 	However, this doesn't need to be like that.  The traditional
> 	implementation of realloc(3), present in Unix V7, inherited by
> 	the BSDs, and currently available in a range of systems, including
> 	musl libc, doesn't have any issues.
> 
> 	Code written for platforms returning a null can be migrated to
> 	platforms returning non-null, without significant issues.
> 
> 	There are two kinds of code that call realloc(p,0).  One
> 	hard-codes the 0, and is used as a replacement of free(p).  This
> 	code ignores the return value, since it's unimportant.  This
> 	code currently produces a leak of 0 bytes plus associated
> 	metadata on platforms such as musl libc, where it returns a
> 	non-null pointer.  However, assuming that there are programs
> 	written with the knowledge that they won't ever be run on such
> 	platforms, we should take care of that, and make sure they don't
> 	leak.  A way of accomplishing this would be to recommend that
> 	implementations issue a diagnostic when realloc(3) is called
> 	with a hardcoded zero.  This is only an informal recommendation
> 	made by this proposal, as this is a matter of QoI, and the
> 	standard shouldn't say anything about it.  This would prevent
> 	this class of minor leaks.
> 
> 	Moreover, in glibc, realloc(p,0) may return non-null, in the
> 	case where p is NULL, so code must already take that into
> 	account, and thus code that simply takes realloc(p,0) as a
> 	synonym of free(p) is already leaky, as free(NULL) is a no-op,
> 	but realloc(NULL,0) allocates 0 bytes.
> 
> 	The other kind of code is in algorithms that realloc(3) an
> 	arbitrary size, which might eventually be zero.  This gets more
> 	complex.
> 
> 	Here's the code that should be written for AIX or glibc:
> 
> 		errno = 0;
> 		new = realloc(old, size);
> 		if (new == NULL) {
> 			if (errno == ENOMEM)
> 				free(old);
> 			goto fail;
> 		}
> 		...
> 		free(new);
> 
> 	Failing to check for ENOMEM on these platforms before freeing
> 	the old pointer would result in a double-free.  If the program
> 	decides to continue using the old pointer instead of freeing it,
> 	it would result in a use-after-free.
> 
> 	On the platforms where realloc(p,0) returns non-null, such as
> 	the BSDs or musl libc, it is simpler to handle it:
> 
> 		new = realloc(old, size);
> 		if (new == NULL) {  // errno is ENOMEM
> 			free(old);
> 			goto fail;
> 		}
> 		...
> 		free(new);
> 
> 	Whenever the result is a null pointer, these platforms are
> 	reporting an ENOMEM error, and thus it is superfluous to check
> 	errno there.
> 
> 	Most code is written in this way, even if run on platforms
> 	returning a null pointer.  This is because most programmers are
> 	just unaware of this problem.
> 
> 	If the realloc(3) specification were changed to require that
> 	realloc(p,0) returns non-null on success, and that realloc(p,0)
> 	only fails when out-of-memory, and to require that it sets
> 	errno to ENOMEM, then code written for AIX or glibc would
> 	continue working just fine, since the errno check would be
> 	redundant with the null check.  Simply, the conditional
> 	(errno == ENOMEM) would always be true when (new == NULL).
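
A minimal sketch of a helper that follows the AIX/glibc pattern and so
stays correct on both families of platforms today.  The name
resize_or_release() and its interface are hypothetical.  It assumes, as
POSIX requires, that a real allocation failure sets errno to ENOMEM.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: resize *pp to size bytes.
 * Returns 0 on success, with *pp updated.  Returns -1 when realloc(3)
 * yields a null pointer; the old block is then released exactly once
 * and *pp is set to NULL. */
int
resize_or_release(void **pp, size_t size)
{
	void *new;

	errno = 0;
	new = realloc(*pp, size);
	if (new == NULL) {
		/* With ENOMEM the old block is still live and must be
		 * freed here.  Without it (size == 0 on glibc or AIX),
		 * realloc(3) already freed it. */
		if (errno == ENOMEM)
			free(*pp);
		*pp = NULL;
		return -1;
	}
	*pp = new;
	return 0;
}
```

Under the proposed specification, the errno test would always be true
inside the null branch, so the same helper would keep working unchanged.
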
> 
> 	This makes handling of realloc(3) as straightforward as one
> 	would expect, with only two states: success or error.
> 
> 	The resulting wording in the standard is also much simpler, as
> 	it doesn't need to define so many special cases.
> 
> 	For consistency, all the other allocation functions are updated
> 	to both return a null pointer and set errno to ENOMEM on failure.
> 
> Prior art
>     gnulib
> 	gnulib provides the realloc-posix module, which aims to wrap the
> 	system realloc(3) and reallocarray(3) functions so that they
> 	behave in a POSIX-complying manner.
> 
> 	It previously behaved like glibc.  After I reported that it was
> 	non-conforming to POSIX, we discussed the best way forward,
> 	which we agreed was the same direction that this paper is
> 	proposing now for C2y.  The implementation was changed in
> 
> 		gnulib.git d884e6fc4a60 (2024-11-04; "realloc-posix: realloc (..., 0) now returns nonnull")
> 
> 	There have been no regression reports since then, as we
> 	expected.
> 
>     Unix V7
> 	The proposed behavior is the one endorsed by Doug McIlroy, the
> 	author of the original implementation of realloc(3) in Unix V7,
> 	and also present in the BSDs.
> 
> Design decisions
> 	This proposal requires three changes, which can be applied all
> 	at once or in three separate steps.
> 
> 	The first step would make realloc(p,s) be consistent with
> 	free(p) and malloc(s), including when p is a null pointer, when
> 	s is zero, and also when both corner cases happen at the same
> 	time.  This change would already turn the implementations where
> 	malloc(0) returns non-null into the end goal we have.
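
A minimal model of these semantics, written on top of malloc(3) and
free(3).  The name realloc_model() is hypothetical, and so is the
old_size parameter: a real allocator knows the old size itself, while
this sketch has to be told.  Requesting at least one byte stands in for
the second step (malloc(0) returning non-null).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: behaves like free(old) plus malloc(new_size),
 * preserving the contents up to the lesser of the two sizes. */
void *
realloc_model(void *old, size_t old_size, size_t new_size)
{
	void *new;

	/* Ask for at least one byte, so a null result can only mean
	 * that the allocation failed. */
	new = malloc(new_size ? new_size : 1);
	if (new == NULL)
		return NULL;	/* the old object is left untouched */

	if (old != NULL) {
		memcpy(new, old, old_size < new_size ? old_size : new_size);
		free(old);
	}
	return new;
}
```

All four corner cases (null or non-null pointer, zero or nonzero size)
go through the same path, and the only null result is an allocation
failure.
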
> 
> 	The first step would require changes to (at least) the following
> 	implementations: glibc, Bionic, Windows.
> 
> 	The second step would be to require that malloc(0) returns a
> 	non-null pointer.
> 
> 	The second step would require changes to (at least) the
> 	following implementations: AIX.
> 
> 	The third step would be to require that on error, errno is set
> 	to ENOMEM.
> 
> 	This proposal has merged all steps into a single proposal.
> 
> 	This proposal also needs to add ENOMEM to the standard, since it
> 	hasn't been standardized yet.
> 
> Future directions
> 	This proposal, by specifying realloc(3) as if by calling
> 	free(3) and malloc(3), makes several mentions of realloc(3)
> 	next to either free(3) or malloc(3) in the standard redundant.
> 	We could remove them in this proposal, or clean that up in a
> 	separate (mostly editorial) proposal.  Let's keep it for a
> 	future proposal for now.
> 
> Caveats
> 	Code written today should be careful, in case it can run on
> 	older systems that are not fixed to comply with this stricter
> 	specification.  Thus, code written today should call realloc(3)
> 	like this:
> 
> 		realloc(p, n?n:1);
> 
> 	When all existing implementations are fixed to comply with this
> 	stricter specification, that workaround can be removed.
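
The workaround can be kept in one place with a small wrapper.  This is
only a sketch; the name realloc_nonzero() is hypothetical and not part
of any libc.

```c
#include <stdlib.h>

/* Hypothetical wrapper: never asks realloc(3) for zero bytes, so a
 * null result can only mean that the allocation failed and that the
 * old block is still live. */
void *
realloc_nonzero(void *p, size_t n)
{
	return realloc(p, n ? n : 1);
}
```

Once every targeted implementation follows the stricter specification,
callers can switch back to plain realloc(3) by replacing the wrapper
body alone.
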
> 
> Proposed wording
> 	Based on N3550.
> 
>     7.5  Errors <errno.h>
> 	## Add ENOMEM in p2.
> 
>     7.25.4.1  Memory management functions :: General
> 	@@ p1
> 	...
> 	 If the size of the space requested is zero,
> 	-the behavior is implementation-defined:
> 	-either
> 	-a null pointer is returned to indicate the error,
> 	-or
> 	 the behavior is as if the size were some nonzero value,
> 	 except that the returned pointer shall not be used
> 	 to access an object.
> 
>     7.25.4.2  The aligned_alloc function
> 	@@ Returns, p3
> 	 The <b>aligned_alloc</b> function returns
> 	-either
> 	-a null pointer
> 	-or
> 	-a pointer to the allocated space.
> 	+a pointer to the allocated space
> 	+on success.
> 	+If
> 	+the space cannot be allocated,
> 	+a null pointer is returned,
> 	+and the value of the macro <b>ENOMEM</b>
> 	+is stored in <b>errno</b>.
> 
>     7.25.4.3  The calloc function
> 	@@ Returns, p3
> 	 The <b>calloc</b> function returns
> 	-either
> 	 a pointer to the allocated space
> 	+on success.
> 	-or a null pointer
> 	-if
> 	+If
> 	 the space cannot be allocated
> 	 or if the product <tt>nmemb * size</tt>
> 	-would wraparound <b>size_t</b>.
> 	+would wraparound <b>size_t</b>,
> 	+a null pointer is returned,
> 	+and the value of the macro <b>ENOMEM</b>
> 	+is stored in <b>errno</b>.
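
A minimal sketch of what the proposed calloc wording asks for, layered
over the system calloc(3).  The name calloc_model() is hypothetical.
The wraparound test uses SIZE_MAX division, and errno is set explicitly
because ISO C does not currently require calloc(3) to set it.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: returns zeroed space for nmemb * size bytes.
 * On failure or wraparound, returns NULL and stores ENOMEM in errno. */
void *
calloc_model(size_t nmemb, size_t size)
{
	size_t total;
	void *p;

	/* nmemb * size would wrap around size_t. */
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}

	/* Ask for at least one byte, so a zero-sized request still
	 * yields a non-null pointer. */
	total = nmemb * size;
	p = calloc(1, total ? total : 1);
	if (p == NULL)
		errno = ENOMEM;
	return p;
}
```
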
> 
>     7.25.4.7  The malloc function
> 	@@ Returns, p3
> 	 The <b>malloc</b> function returns
> 	-either
> 	-a null pointer
> 	-or
> 	-a pointer to the allocated space.
> 	+a pointer to the allocated space
> 	+on success.
> 	+If
> 	+the space cannot be allocated,
> 	+a null pointer is returned,
> 	+and the value of the macro <b>ENOMEM</b>
> 	+is stored in <b>errno</b>.
> 
>     7.25.4.8  The realloc function
> 	@@ Description, p2
> 	 The <b>realloc</b> function
> 	 deallocates the old object pointed to by <tt>ptr</tt>
> 	+as if by a call to <b>free</b>,
> 	 and returns a pointer to a new object
> 	-that has the size specified by <tt>size</tt>.
> 	+that has the size specified by <tt>size</tt>
> 	+as if by a call to <b>malloc</b>.
> 	 The contents of the new object
> 	 shall be the same as that of the old object prior to deallocation,
> 	 up to the lesser of the new and old sizes.
> 	 Any bytes in the new object
> 	 beyond the size of the old object
> 	 have unspecified values.
> 
> 	@@ p3
> 	 If <tt>ptr</tt> is a null pointer,
> 	 the <b>realloc</b> function behaves
> 	 like the <b>malloc</b> function for the specified size.
> 	 Otherwise,
> 	 if <tt>ptr</tt> does not match a pointer
> 	 earlier returned by a memory management function,
> 	 or
> 	 if the space has been deallocated
> 	 by a call to the <b>free</b> or <b>realloc</b> function,
> 	-or
> 	-if the size is zero,
> 	## We're defining the behavior.
> 	 the behavior is undefined.
> 	 If
> 	-memory for the new object is not allocated,
> 	+the space cannot be allocated,
> 	## Editorial; for consistency with the wording of the other functions.
> 	 the old object is not deallocated
> 	 and its value is unchanged.
> 
> 	@@ Returns, p4
> 	 The <b>realloc</b> function returns
> 	 a pointer to the new object
> 	 (which can have the same value
> 	-as a pointer to the old object),
> 	+as a pointer to the old object)
> 	+on success.
> 	-or
> 	+If
> 	+space cannot be allocated,
> 	 a null pointer
> 	+is returned
> 	+and the value of the macro <b>ENOMEM</b>
> 	+is stored in <b>errno</b>.
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>
