Message-ID: <v3gaetujc5dagncnsr2phbws2py6t2ylqm6j6mjyqnpkpmg3pe@oxeqml2vcqu3>
Date: Sat, 21 Jun 2025 03:59:24 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Christopher Bazley <chris.bazley.wg14@...il.com>
Cc: libc-alpha@...rceware.org, bug-gnulib@....org, musl@...ts.openwall.com, 
	наб <nabijaczleweli@...ijaczleweli.xyz>, Douglas McIlroy <douglas.mcilroy@...tmouth.edu>, 
	Paul Eggert <eggert@...ucla.edu>, Robert Seacord <rcseacord@...il.com>, 
	Elliott Hughes <enh@...gle.com>, Bruno Haible <bruno@...sp.org>, 
	JeanHeyd Meneide <phdofthehouse@...il.com>, Rich Felker <dalias@...c.org>, 
	Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>, Joseph Myers <josmyers@...hat.com>, 
	Florian Weimer <fweimer@...hat.com>, Laurent Bercot <ska-dietlibc@...rnet.org>, 
	Andreas Schwab <schwab@...e.de>, Eric Blake <eblake@...hat.com>, 
	Vincent Lefevre <vincent@...c17.net>, Mark Harris <mark.hsj@...il.com>, 
	Collin Funk <collin.funk1@...il.com>, Wilco Dijkstra <Wilco.Dijkstra@....com>, 
	DJ Delorie <dj@...hat.com>, Cristian Rodríguez <cristian@...riguez.im>, 
	Siddhesh Poyarekar <siddhesh@...plt.org>, Sam James <sam@...too.org>, Mark Wielaard <mark@...mp.org>, 
	"Maciej W. Rozycki" <macro@...hat.com>, Martin Uecker <ma.uecker@...il.com>, eskil@...ession.se
Subject: Re: alx-0029r1 - Restore the traditional realloc(3) specification

Hi Chris,

On Fri, Jun 20, 2025 at 11:31:45PM +0100, Christopher Bazley wrote:
> Hi Alex,
> 
> On Fri, Jun 20, 2025 at 10:26 PM Alejandro Colomar <alx@...nel.org> wrote:
> >         There are two kinds of code that call realloc(p,0).  One
> >         hard-codes the 0, and is used as a replacement of free(p).  This
> >         code ignores the return value, since it's unimportant.  This
> >         code currently produces a leak of 0 bytes plus associated
> 
> I have a feeling that I wrote something like this in one of my emails
> but I have since realised that the as-if rule allows "deallocates the
> old object pointed to by ptr and returns a pointer to a new object
> that has the size specified by size" to be a no-op: an implementation
> of realloc could return a pointer to the same heap block whenever the
> new requested size is less than or equal to the current size.
> Effectively the amount of memory leaked would then be bounded only by
> the maximum size of the allocation.

Yes, that could be a valid implementation.  Although the implementation
would be a bit stupid if it didn't allow that memory to be reclaimed by
another call to malloc(3).  Especially with a call with size 0, where
none of the old contents are alive anymore, and so any pointer would
work.

> >         metadata on platforms such as musl libc, where it returns a
> >         non-null pointer.  However, assuming that there are programs
> >         written with the knowledge that they won't ever be run on such
> >         platforms, we should take care of that, and make sure they don't
> >         leak.  A way of accomplishing this would be to recommend
> >         implementations to issue a diagnostic when realloc(3) is called
> >         with a hardcoded zero.  This is only an informal recommendation
> >         made by this proposal, as this is a matter of QoI, and the
> >         standard shouldn't say anything about it.  This would prevent
> >         this class of minor leaks.
> >
> >         Moreover, in glibc, realloc(p,0) may return non-null, in the
> >         case where p is NULL, so code must already take that into
> >         account, and thus code that simply takes realloc(p,0) as a
> >         synonym of free(p) is already leaky, as free(NULL) is a no-op,
> >         but realloc(NULL,0) allocates 0 bytes.
> 
> This behaviour does not sound good, but I think you are assuming
> something about the usage of realloc(p,0): might it be called if and
> only if p != NULL?

Yeah, one could call

	if (p != NULL)
		realloc(p, 0);

I somehow expect people to not be that insane.  :)

> 
> >         The other kind of code is in algorithms that realloc(3) an
> >         arbitrary size, which might eventually be zero.  This gets more
> >         complex.
> >
> >         Here's the code that should be written for AIX or glibc:
> >
> >                 errno = 0;
> >                 new = realloc(old, size);
> >                 if (new == NULL) {
> >                         if (errno == ENOMEM)
> >                                 free(old);
> >                         goto fail;
> >                 }
> >                 ...
> >                 free(new);
> >
> >         Failing to check for ENOMEM in these platforms before freeing
> >         the old pointer would result in a double-free.  If the program
> >         decides to continue using the old pointer instead of freeing it,
> >         it would result in a use-after-free.
> 
> The above code looks suspect to me anyway because it does 'goto fail'
> in a scenario where errno != ENOMEM, which presumably includes errno
> == 0, which should not be considered a failure.

Let's quote C17 (not C23, because there realloc(p,0) is UB):

<https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf#subsection.7.22.3>
C17::7.22.3p1:

	If the size of the space requested is zero,
	the behavior is implementation-defined:
	either a null pointer is returned to indicate an error,
	or the behavior is as if the size were some nonzero value,
	except that the returned pointer shall not be used to access an object.

It clearly says that it is considered an error.  It's a partial error,
because the object is successfully deallocated, but the new memory is
not allocated.

POSIX.1-2024 is slightly different: it says the function may consider it
an error, by reporting EINVAL.  However, the implementation is free to
not set errno at all.

In any case, as a programmer, I'd consider it an error.  I don't want to
treat a NULL pointer as a valid pointer, because something's going to
break further down when I pass this pointer around and compare it to
NULL.  So I'd rather fail early.  It's about null pointer hygiene;
you know what I'm talking about.  ;)

> 
> Anyway, isn't there a simpler example that illustrates your point
> without relying on errno?
> 
> What about this:
> 
>                  new = realloc(old, size);
>                  if (new == NULL) {
>                          if (size != 0) {
>                                  free(old);
>                                  goto fail;
>                          }
>                  }
>                  ...
>                  free(new);

This is pedantically not valid.  An implementation might fail with
ENOMEM on size==0 (let's say you're on an implementation that returns
non-null, and it wasn't able to find enough space for the metadata
needed for the allocation; in such a case, the old pointer must not be
freed, as in any other ENOMEM situation).  It would be weird, but not
impossible.  In such a case, your code would leak 'old'.

So, the only way to call realloc(p,s) portably across POSIX systems
today is by checking for ENOMEM.
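
As a minimal sketch, that pattern could be wrapped in a helper like the
one below.  The name resize_or_free() and its interface are hypothetical
(they are not part of the proposal), and it assumes a POSIX-like
platform that sets errno to ENOMEM whenever realloc(3) fails for lack of
memory.  It frees the old block only when ENOMEM says that it is still
allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper, for illustration only.
 * Resizes *pp to 'size' bytes.  Returns 0 on success, -1 on failure.
 * After a failure, *pp is NULL and nothing is left to free.
 */
static int
resize_or_free(void **pp, size_t size)
{
	void  *new;

	assert(pp != NULL);

	errno = 0;
	new = realloc(*pp, size);
	if (new == NULL) {
		/* ENOMEM: the old block is still allocated; release it.
		 * Otherwise (e.g., size == 0 on glibc or AIX), the old
		 * block is assumed to be gone already, so freeing it
		 * again would be a double-free. */
		if (errno == ENOMEM)
			free(*pp);
		*pp = NULL;
		return -1;
	}
	*pp = new;
	return 0;
}
```

Setting the caller's pointer to NULL on every failure path is a design
choice of this sketch: it prevents both the double-free and the
use-after-free described above, at the cost of always discarding the old
contents.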

> If you are absolutely sure that ENOMEM is required for this case then
> the argument that such implementations are compliant with the ISO C
> standard is weaker than I initially realised.

I think you need it.

> 
> >         In the platforms where realloc(p,0) returns non-null, such as
> >         the BSDs or musl libc, it is simpler to handle it:
> >
> >                 new = realloc(old, size);
> >                 if (new == NULL) {  // errno is ENOMEM
> >                         free(old);
> >                         goto fail;
> >                 }
> >                 ...
> >                 free(new);
> >
> >         Whenever the result is a null pointer, these platforms are
> >         reporting an ENOMEM error, and thus it is superfluous to check
> >         errno there.
> >
> >         Most code is written in this way, even if run on platforms
> >         returning a null pointer.  This is because most programmers are
> >         just unaware of this problem.
> 
> Also perhaps because ENOMEM isn't part of ISO standard C therefore it
> is not necessarily defined. (e.g., it is not defined in the headers
> for the Norcroft C compiler for RISC OS.)

They should define it now.  As for old code, if they didn't have ENOMEM
available, they didn't have a portable way to call realloc(3).  They
could use your version, but it could leak on platforms that don't return
NULL.  Or they could just use the realloc(p,n?n:1) trick, which is
simpler, and works like a charm.
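
A sketch of that trick, with hypothetical names (realloc_nonzero() and
resize() are only for illustration): since the size passed to realloc(3)
is never zero, a null return can only mean an allocation failure, in
which case the old block is still valid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: never passes a size of 0 to realloc(3). */
static void *
realloc_nonzero(void *p, size_t n)
{
	return realloc(p, n ? n : 1);
}

/*
 * Hypothetical caller: resizes *pp to n bytes.
 * Returns 0 on success, -1 on failure.
 * On failure, *pp is left untouched and remains valid.
 */
static int
resize(void **pp, size_t n)
{
	void  *new;

	assert(pp != NULL);

	new = realloc_nonzero(*pp, n);
	if (new == NULL)
		return -1;  /* allocation failed; old block still allocated */
	*pp = new;
	return 0;
}
```

No errno check is needed here, which is why this works even where
ENOMEM is not available.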

> It's interesting to think about why ENOMEM might not be part of the
> ISO standard. I suspect the reason is that it was not believed to be
> necessary for completeness of any of the standard library interfaces.
> Your email suggests otherwise.

Indeed.  Actually, it wouldn't be necessary if standards and
implementations hadn't broken realloc(3) so much.  In an ideal world,
NULL would be enough to know that realloc(3) failed.  But we're not in
that world.  Current code needs to check ENOMEM.  And thus, the new
specification should use ENOMEM, to support that code.  We need an
implementation that is fully backwards-compatible.  Otherwise, I'm
worried that we won't convince the implementations to risk introducing
silent bugs in code that works today.

> >         If the realloc(3) specification was changed to require that
> 
> "Were", not "was". Sorry for being pedantic.

Thanks!  I'm not a native English speaker; these corrections help.  :)

> >         realloc(p,0) returns non-null on success, and that realloc(p,0)
> >         only fails when out-of-memory, and to require that it sets
> >         errno to ENOMEM, then code written for AIX or glibc would
> 
> You can't require that errno be set to a value that does not exist in
> the standard. I see you are planning to add it, but I'm not yet
> convinced that is necessary.

See above.  I had previously thought that it wasn't necessary, but after
spending some time thinking about it today, I'm pretty sure we need it.

> >         continue working just fine, since the errno check would be
> >         redundant with the null check.  Simply, the conditional
> >         (errno == ENOMEM) would always be true when (new == NULL).
> >
> >         This makes handling of realloc(3) as straightforward as one
> >         would expect, with only two states: success or error.
> >
> >         The resulting wording in the standard is also much simpler, as
> >         it doesn't need to define so many special cases.
> >
> >         For consistency, all the other allocation functions are updated
> >         to both return an .
> 
> Missing text?

Yep.  I have it fixed for the next revision:

	diff --git i/alx-0029.txt w/alx-0029.txt
	index a1a96c4..f8835cf 100644
	--- i/alx-0029.txt
	+++ w/alx-0029.txt
	@@ -31,7 +31,6 @@ Author
		Cc: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
		Cc: Joseph Myers <josmyers@...hat.com>
		Cc: Florian Weimer <fweimer@...hat.com>
	-       Cc: Laurent Bercot <ska-dietlibc@...rnet.org>
		Cc: Andreas Schwab <schwab@...e.de>
		Cc: Thorsten Glaser <tg@...bsd.de>
		Cc: Eric Blake <eblake@...hat.com>
	@@ -58,6 +57,10 @@ History
		r1 (2025-06-20):
		-  Full rewrite after the recent glibc discussion.
	 
	+       r2 ():
	+       -  Remove CC.
	+       -  wfix.
	+
	 See also
		<https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
		<https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
	@@ -192,7 +195,7 @@ Description
		it doesn't need to define so many special cases.
	 
		For consistency, all the other allocation functions are updated
	-       to both return an .
	+       to both return a null pointer and set errno to ENOMEM.
	 
	 Prior art
	     gnulib

> 
> > Prior art
> >     gnulib
> >         gnulib provides the realloc-posix module, which aims to wrap the
> >         system realloc(3) and reallocarray(3) functions so that they
> >         behave in a POSIX-complying manner.
> >
> >         It previously behaved like glibc.  After I reported that it was
> >         non-conforming to POSIX, we discussed the best way forward,
> >         which we agreed was the same direction that this paper is
> >         proposing now for C2y.  The implementation was changed in
> >
> >                 gnulib.git d884e6fc4a60 (2024-11-04; "realloc-posix: realloc (..., 0) now returns nonnull")
> >
> >         There have been no regression reports since then, as we
> >         expected.
> >
> >     Unix V7
> >         The proposed behavior is the one endorsed by Doug McIlroy, the
> >         author of the original implementation of realloc(3) in Unix V7,
> >         and also present in the BSDs.
> >
> > Design decisions
> >         This change needs three changes, which can be applied both at
> >         once, or in two separate steps.
> >
> >         The first step would make realloc(p,s) be consistent with
> >         free(p) and malloc(s), including when p is a null pointer, when
> >         s is zero, and also when both corner cases happen at the same
> >         time.  This change would already turn the implementations where
> >         malloc(0) returns non-null into the end goal we have.
> >
> >         The first step would require changes to (at least) the following
> >         implementations: glibc, Bionic, Windows.
> >
> >         The second step would be to require that malloc(0) returns a
> >         non-null pointer.
> >
> >         The second step would require changes to (at least) the
> >         following implementations: AIX.
> >
> >         The third step would be to require that on error, errno is set
> >         to ENOMEM.
> >
> >         This proposal has merged all steps into a single proposal.
> >
> >         This proposal also needs to add ENOMEM to the standard, since it
> >         hasn't been standardized yet.
> 
> I think this change would be better served by a separate proposal
> unless you believe both:
> - that ENOMEM serves a special purpose for realloc, and
> - that WG14 can only be persuaded to accept ENOMEM on account of that
> special purpose.

For backwards compatibility, I think we need to add ENOMEM.  Otherwise,
code that is perfect today might have small issues with the new
specification.

And of course, if we have issues with the new specification, it won't
even be accepted or implemented.

> I have doubts about both points.
> 
> > Future directions
> >         This proposal, by specifying realloc(3) as-if by calling
> >         free(3) and malloc(3), makes it redundant several mentions of
> 
> Extra 'it'.

Thanks!

> 
> Overall, I wholeheartedly support your direction.

Thanks!  :-)


Have a lovely day!
Alex

> 
> Chris

-- 
<https://www.alejandro-colomar.es/>
