Date: Thu, 22 Jun 2023 19:51:53 -0400
From: Rich Felker <dalias@...c.org>
To: "Alex Xu (Hello71)" <alex_y_xu@...oo.ca>
Cc: musl@...ts.openwall.com, Markus Wichmann <nullplan@....net>
Subject: Re: [PATCH] [RFC] trap on invalid printf formats

On Thu, Jun 22, 2023 at 07:37:22PM -0400, Alex Xu (Hello71) wrote:
> Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> > FWIW I don't think there are a lot of these cases left in the wild at
> > all, but I'm not sure. It might be nice to do some distro-wide testing
> > with this patch applied (which is what I had in mind posting it) and
> > see if any problems are caught before really considering whether to
> > pursue upstreaming it.
> 
> Unfortunately, it seems fairly widespread:
> https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0
> 
> The most painful example:
> 
> #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
>     sprintf (pbuf, "%llu", value);
> #else
>     sprintf (pbuf, "%Lu", value);
> #endif
> 
> (copied and pasted 17 times in the same file, of course)

Which file are you looking at? Some of the results I see are Linux
(the kernel) using its own very-nonstandard printf format specifier
system. Those aren't relevant at all.

All of the rest are bugs where the software is silently malfunctioning
now. Some of these are junk code like examples/* etc., but presumably a
lot are actual bugs that need to be fixed.
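
For illustration only, here is a minimal sketch of what the fix for code
like the quoted #if/#else block could look like, assuming value is an
unsigned long long. The helper name format_ull is hypothetical and not
taken from any of the packages in the search results. The point is that
"%llu" is the C99 form and works everywhere, so no per-OS branch is
needed:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from any package discussed in this thread):
 * formats an unsigned long long as decimal using the C99 "ll" length
 * modifier, with no per-platform #if.
 * Returns the number of characters written (excluding the terminating
 * null byte), or -1 on an output error or if the buffer is too small. */
static int format_ull(char *buf, size_t size, unsigned long long value)
{
    int n = snprintf(buf, size, "%llu", value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Using snprintf rather than sprintf is a separate choice made here so
that an undersized buffer is reported instead of overflowed.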

> I did some research and the most likely source of %Lu is the Linux 
> man-pages, which, before 1999 or thereabouts, said:
> 
> > • The optional character l (ell) specifying that a following d, i, o, 
> > u, x, or X conversion applies to a pointer to a long int or unsigned 
> > long int argument, or that a following n conversion corresponds to a 
> > pointer to a long int argument.  Linux provides a non ANSI compliant 
> > use of two l flags as a synonym to q or L.  Thus ll can be used in 
> > combination with float conversions.  *This usage is, however, strongly 
> > discouraged.*
> >
> > • The character L specifying that a following e, E, f, g, or G 
> > conversion corresponds to a long double argument, or a following d, i, 
> > o, u, x, or X conversion corresponds to a long long argument.  Note 
> > that long long is not specified in ANSI C and therefore not portable 
> > to all architectures.
> 
> Emphasis added. So, pre-C99, L was in fact the recommended modifier for 
> long long.

As I read it, the usage that was discouraged was using ll as an alias
for L with float conversions, not use of ll where it was the right
form.

In any case, I'm not sure what we can take away from the history. ll
was the form that ended up getting standardized (probably because it
didn't "steal" any additional letter that might be used for other
things on existing implementations or in the future), and both q and
the overloading of L got rejected.
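
To make the standardized split concrete, here is a small sketch of the
two modifiers in their C99 roles: ll with integer conversions for long
long, and L with floating conversions for long double. The function
name show_modifiers is made up for this example:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical demo function: writes a long long and a long double
 * into buf using the length modifiers C99 defines for each type.
 * Returns snprintf's result (the length the full output needs). */
static int show_modifiers(char *buf, size_t size, long long i, long double d)
{
    /* "ll" pairs with d, i, o, u, x, X: the argument is (unsigned) long long.
     * "L" pairs with a, A, e, E, f, F, g, G: the argument is long double.
     * Combining "L" with an integer conversion such as "%Lu" is undefined. */
    return snprintf(buf, size, "%lld %.1Lf", i, d);
}
```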

Rich
