musl - Re: Wrong formatting of doubles?

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250701163758.GD1827@brightrain.aerifal.cx>
Date: Tue, 1 Jul 2025 12:37:58 -0400
From: Rich Felker <dalias@...c.org>
To: Michal Terepeta <michalt@...gle.com>
Cc: musl@...ts.openwall.com, t5y-external <t5y-external@...gle.com>
Subject: Re: Wrong formatting of doubles?

On Tue, Jul 01, 2025 at 12:19:57PM -0400, Rich Felker wrote:
> On Tue, Jul 01, 2025 at 10:22:25AM -0400, Rich Felker wrote:
> > On Tue, Jul 01, 2025 at 07:55:20AM +0200, Michal Terepeta wrote:
> > > Hi,
> > > 
> > > We're working on a RISC-V system and using musl as our libc. A recent change
> > > in musl seems to have caused wrong formatting (printf) of double values:
> > > https://git.musl-libc.org/cgit/musl/commit/src/stdio/vfprintf.c?id=572a2e2eb91f00f2f25d301cfb50f435e7ae16b3
> > > 
> > > Using a simple binary to print max double [1] with the current code I get:
> > > 
> > > ```
> > > The maximum value of a double (printf %e): 1.348676e+308
> > > The maximum value of a double (printf %E): 1.348676E+308
> > > The maximum value of a double (printf %g): 3.13486e+308
> > > ```
> > > 
> > > With the patch [2] that reverts most of the above change, I get the expected
> > > output:
> > > 
> > > ```
> > > The maximum value of a double (printf %e): 1.797693e+308
> > > The maximum value of a double (printf %E): 1.797693E+308
> > > The maximum value of a double (printf %g): 1.79769e+308
> > > ```
> > > 
> > > It'd be great if someone could take a look if they can also repro and perhaps
> > > revert the change?
> > > 
> > > Many thanks!
> > > 
> > > Michal
> > > 
> > > 
> > > 
> > > [1] Repro program:
> > > 
> > > ```
> > > #include <float.h>
> > > #include <stdio.h>
> > > 
> > > int main() {
> > >   printf("The maximum value of a double (printf %%e): %e\n", DBL_MAX);
> > >   printf("The maximum value of a double (printf %%E): %E\n", DBL_MAX);
> > >   printf("The maximum value of a double (printf %%g): %g\n", DBL_MAX);
> > >   return 0;
> > > }
> > > 
> > > ```
> > > 
> > > 
> > > [2] Patch to test:
> > > 
> > > ```
> > > diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> > > index 010041ca6c..01004158bf 100644
> > > --- a/src/stdio/vfprintf.c
> > > +++ b/src/stdio/vfprintf.c
> > > @@ -180,12 +180,8 @@
> > > 
> > >  static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> > >  {
> > > -       int bufsize = (ps==BIGLPRE)
> > > -               ? (LDBL_MANT_DIG+28)/29 + 1 +          // mantissa expansion
> > > -                 (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9  // exponent expansion
> > > -               : (DBL_MANT_DIG+28)/29 + 1 +
> > > -                 (DBL_MAX_EXP+DBL_MANT_DIG+28+8)/9;
> > > -       uint32_t big[bufsize];
> > > +       uint32_t big[(LDBL_MANT_DIG+28)/29 + 1          // mantissa expansion
> > > +               + (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9]; // exponent expansion
> > >         uint32_t *a, *d, *r, *z;
> > >         int e2=0, e, i, j, l;
> > >         char buf[9+LDBL_MANT_DIG/4], *s;
> > > ```
> > 
> > Could you try the attached patch?
> > 
> 
> > diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> > index 76733997..2d2f4f3e 100644
> > --- a/src/stdio/vfprintf.c
> > +++ b/src/stdio/vfprintf.c
> > @@ -180,11 +180,10 @@ typedef char compiler_defines_long_double_incorrectly[9-(int)sizeof(long double)
> >  
> >  static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> >  {
> > -	int bufsize = (ps==BIGLPRE)
> > -		? (LDBL_MANT_DIG+28)/29 + 1 +          // mantissa expansion
> > -		  (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9  // exponent expansion
> > -		: (DBL_MANT_DIG+28)/29 + 1 +
> > -		  (DBL_MAX_EXP+DBL_MANT_DIG+28+8)/9;
> > +	int max_mant_dig = (ps==BIGLPRE) ? LDBL_MANT_DIG : DBL_MANT_DIG;
> > +	int max_exp = (ps==BIGLPRE) ? LDBL_MAX_EXP : DBL_MAX_EXP;
> > +	int bufsize = (max_mant_dig+28)/29 + 1          // mantissa expansion
> > +	            + (max_exp+max_mant_dig+28+8)/9;    // exponent expansion
> >  	uint32_t big[bufsize];
> >  	uint32_t *a, *d, *r, *z;
> >  	int e2=0, e, i, j, l;
> > @@ -266,7 +265,7 @@ static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> >  	if (y) y *= 0x1p28, e2-=28;
> >  
> >  	if (e2<0) a=r=z=big;
> > -	else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
> > +	else a=r=z=big+sizeof(big)/sizeof(*big) - max_mant_dig - 1;
> >  
> >  	do {
> >  		*z = y;
> 
> To clarify, the root cause here is that the subtraction later down the
> function to make room for the mantissa, fixed in the last quoted hunk
> above, was still using a hardcoded LDBL_MANT_DIG rather than adapting
> to the size of the type being formatted.
> 
> I was able to reproduce the buf on aarch64, albeit with all-zeros
> output instead of the prefixed '3'. Hitting it seems to depend on
> LDBL_MANT_DIG-DBL_MANT_DIG being high enough, which only happens on
> archs with quad ld (not ld80).
> 
> For what it's worth, the math does not look right even with the buffer
> sized for LDBL_MANT_DIG, and it's probably only by luck that it
> worked. We only reserved one b1b (base-1000000000) slot per 29 bits of
> mantissa, but adjusted the end pointer to take away a whole b1b slot
> for each bit of the mantissa, potentially leaving too little room for
> the exponent expansion. In practice this worked because positive
> exponent expansions use a lot less (roughly 1/3) the estimated space;
> it's negative exponent expansions that use the full extimate. (This is
> because each halving adds an extra digit to the end, but it takes
> log2(10) doublings to add a digit to the front.)
> 
> I think the correct code would be something like:
> 
> +	else a=r=z=big+sizeof(big)/sizeof(*big) - (max_mant_dig+28)/29 - 1;

An additional -1 (+1 to the # of slots for mantissa expansion, as in
the commented expression) is needed here because the do/while loop
emits a final zero slot for the mantissa. I'm not sure this is
actually needed later, but as long as it's there it needs to be
accounted for.

And indeed, with some debug instrumentation, empirically
z==buf+bufsize after the initial mantissa expansion loop finishes.

> And maybe this math should be done once at the top and reused rather
> than repeated (i.e. use max_mant_slots instead of max_mant_dig). This
> would also eliminate the need for the comments by documenting the
> intermediate size calculations in variable names.

As a result of the above findings, I'm strongly in favor of doing it
this way. Repeating a confusing expression like this is a formula for
disaster.

I'd also like to think about ways we could avoid introducing bugs like
this in the future. In this case, just a testcase for printing DBL_MAX
in multiple formats would likely have caught it. Simple trapping-only
UBsan might have caught it, but I'm not sure if that would follow all
the positional arithmetic in fmt_fp to detect going outside the object
bounds. This may be worth investigating.

Big thanks to Michal Terepeta for catching this prior to it slipping
into a release!

Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.