Date: Tue, 20 Sep 2022 08:28:29 -0400
From: Rich Felker <dalias@...c.org>
To: Jₑₙₛ Gustedt <jens.gustedt@...ia.fr>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] vfprintf: support C2x %b and %B conversion specifiers

On Tue, Sep 20, 2022 at 11:19:34AM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Mon, 19 Sep 2022 14:10:39 -0400 you (Rich Felker <dalias@...c.org>)
> wrote:
> 
> > On Mon, Sep 19, 2022 at 07:59:52PM +0200, Szabolcs Nagy wrote:
> > > * Rich Felker <dalias@...c.org> [2022-09-19 11:09:17 -0400]:  
> > > > On Mon, Sep 12, 2022 at 04:42:51PM +0200, Jₑₙₛ Gustedt wrote:  
> >  [...]  
> > > > 
> > > > What do these entail? It looks like there's a requirement for
> > > > printf to support them, so I don't see how you'd do that as a
> > > > separate library. It looks like __STDC_IEC_60559_DFP__ is
> > > > optional though, so maybe we could just decline to define it and
> > > > leave the support sporadic at the level the compiler supports, as
> > > > an extension rather than part of the standard-specified
> > > > functionality?  
> > > 
> > > it seems there is
> > > https://github.com/libdfp/libdfp/tree/master/printf-hooks
> > > using glibc specific apis (register_printf_specifier)
> > > 
> > > i'm not sure how musl can handle this internally since
> > > we dont know in advance if the user links against libdfp.  
> > 
> > Yeah, I don't see that as being a usable approach. It's closely tied
> > to the glibc printf model that's not usable in bounded memory with
> > arbitrary width and precision, and not compatible with linking
> > semantics as you mention. The amount of code needed for decimal float
> > printing in decimal is minuscule anyway and something we can easily do
> > with no actual decimal floating point code. I thought the hard case
> > was hex, but looking at the spec again, %a doesn't actually do hex for
> > decimal floats, so it should be easy too.
> 
> Yes exactly. There is nothing conceptually difficult here and nothing
> that should not be in some form or another already in every C library.
> 
> So yes, sorry, for the separate library part I forgot formatted IO and
> string functions. But the huge amount of functions that are added for
> these types are math functions (I guess something like 600 or so)
> stepping on user's identifier space all over.

Yes, I think it's fine for now to have a separate math library for the
math functions. Otherwise the work of adding these interfaces becomes
rather prohibitive. I would assume they're all pure functions where
correct implementations are basically interchangeable, so I don't see a
lot of value in insisting these "go with" libc.

> Unfortunately, again as for complex types, the standard doesn't
> properly distinguish language support for the new optional types and
> library support. I really would have preferred to have the whole thing
> in a separate header, but my voice echoed in the void. There are the
> `__STDC_VERSION_…_H__` macros now, so this gives at least some sort of
> feature test.

I can see both viewpoints as having good motivation, but yes it's
frustrating.

> But for implementing the parts that are outside of math, things should
> indeed not be so difficult. gcc has supported these types for a long
> time, I think, and should also provide predefined macros that could be used
> to check for language support. Then, the types themselves have clear
> definition and prescribed representation, the ABI is de-facto sorted
> out, so there would be not much other implementation dependency to
> worry about.

The thing is we don't have the option to "check for language support".
Doing that would mean you get a deficient musl build if your compiler
doesn't have the language features, so essentially we'd be requiring
bleeding-edge gcc or clang (dropping all other-compiler support at the
same time) to get a properly featured libc.so that's capable of
supporting arbitrary musl-linked binaries.
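To make the problem concrete, here is a minimal sketch of what conditioning on compiler support inside libc would look like. The helper name `dfp_printf_available` is hypothetical, and the sketch assumes the compiler predefines a macro such as `__DEC64_MANT_DIG__` when it supports decimal floating point; it is not code from musl.

```c
/* Hypothetical libc-internal helper: reports whether this particular
 * build of the library would contain decimal float formatting if that
 * code were compiled only when the compiler supports the types. */
static int dfp_printf_available(void)
{
#ifdef __DEC64_MANT_DIG__   /* assumed feature-test macro for DFP support */
    return 1;               /* built with a DFP-capable compiler */
#else
    return 0;               /* built without one: feature silently absent */
#endif
}
```

The answer is fixed when libc.so is built, not when the application is built, so a program compiled with a newer compiler could end up linked against a library that lacks the feature.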

This is why we're going to need asm thunks for performing va_arg with
the new types and (programmatically generated, I assume) asm entry
thunks for accepting arguments to any non-variadic functions, which
can convert (ideally as a no-op) the decimal float type arguments to
integer-type or struct arguments the underlying implementation files
would then receive.
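As a rough sketch of what such an underlying implementation could look like once a thunk has handed it the raw bits, the following decodes a decimal64 value using only integer operations. It assumes the BID (binary integer decimal) encoding of IEEE 754-2008; targets whose ABI uses DPD would need a different decoder. The names `d64_parts`, `d64_decode_bid` and `d64_format` are hypothetical, and the output is a bare coefficient/exponent pair, not any printf conversion format.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded form of a decimal64 value. */
struct d64_parts {
    int sign;        /* 1 if the sign bit is set */
    int special;     /* 0 = finite, 1 = infinity, 2 = NaN */
    int exp;         /* unbiased decimal exponent */
    uint64_t coeff;  /* integer coefficient, at most 16 decimal digits */
};

/* Decode decimal64 bits, assuming BID encoding (exponent bias 398). */
static struct d64_parts d64_decode_bid(uint64_t bits)
{
    struct d64_parts p = {0};
    p.sign = (int)(bits >> 63);
    if (((bits >> 59) & 0xF) == 0xF) {        /* bits 62..59 all set */
        p.special = ((bits >> 58) & 1) ? 2 : 1;
        return p;
    }
    if (((bits >> 61) & 3) == 3) {            /* large-coefficient form */
        p.exp = (int)((bits >> 51) & 0x3FF) - 398;
        p.coeff = (UINT64_C(1) << 53) | (bits & ((UINT64_C(1) << 51) - 1));
    } else {                                  /* common form */
        p.exp = (int)((bits >> 53) & 0x3FF) - 398;
        p.coeff = bits & ((UINT64_C(1) << 53) - 1);
    }
    if (p.coeff > UINT64_C(9999999999999999))
        p.coeff = 0;                          /* non-canonical: treated as zero */
    return p;
}

/* Format as "<coeff>e<exp>", "inf" or "nan", with a leading '-' if signed. */
static int d64_format(uint64_t bits, char *buf, size_t n)
{
    struct d64_parts p = d64_decode_bid(bits);
    const char *s = p.sign ? "-" : "";
    if (p.special == 1) return snprintf(buf, n, "%sinf", s);
    if (p.special == 2) return snprintf(buf, n, "%snan", s);
    return snprintf(buf, n, "%s%" PRIu64 "e%d", s, p.coeff, p.exp);
}
```

No `_Decimal64` type appears anywhere in this code, so it builds with any C99 compiler; only the thunk that extracts the argument would need to know how the type is passed.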

> Other types that come with C23, and these are mandatory, are
> bit-precise integers. There the support by compilers is probably not
> yet completely established. I know of an integration into llvm, but I
> am not sure about the state of affairs for gcc, nor if there is a
> de-facto agreement on ABI issues. In any case, these types need
> support in formatted IO, too.

As far as I can tell, the draft standard makes printf support for all
but the ones defined as [u]intNN_t a choice for the implementation, so
the obvious choice is not to support any additional ones.

> Also, C23, provides the possibility for extended integer types that
> are wider than `[u]intmax_t` under some conditions. This is intended
> in particular to allow for implementations such as gcc on x86_64 to
> interface the existing 128 bit integer types properly as
> `[u]int128_t`. From a C library POV, these then also would need
> integration into formatted IO, but here again support in the compiler
> with usable feature test macros has been there for ages and the ABI should
> already be sorted out.

Yes. I haven't followed the latest on this but my leaning was to leave
them as "compiler extensions" that don't count as "extended integer
types". However presumably they could be handled the same way as
decimal floats if needed.
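For illustration, handling them "the same way" could mean the implementation receives a 128-bit integer as two 64-bit halves and never needs `__int128` itself. The helper below is a sketch under that assumption; the name `fmt_u128_halves` and the hi/lo split are hypothetical and do not describe any existing ABI or musl interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the decimal digits of the unsigned 128-bit
 * value (hi * 2^64 + lo) into buf, which must hold at least 40 bytes
 * (39 digits plus the terminating NUL). Returns the digit count. */
static size_t fmt_u128_halves(uint64_t hi, uint64_t lo, char *buf)
{
    /* Four 32-bit limbs, most significant first. */
    uint32_t limb[4] = {
        (uint32_t)(hi >> 32), (uint32_t)hi,
        (uint32_t)(lo >> 32), (uint32_t)lo
    };
    char tmp[40];
    size_t n = 0;

    do {
        /* Long division of the whole value by 10; the remainder is
         * the next least significant decimal digit. */
        uint64_t rem = 0;
        for (int i = 0; i < 4; i++) {
            uint64_t cur = (rem << 32) | limb[i];
            limb[i] = (uint32_t)(cur / 10);
            rem = cur % 10;
        }
        tmp[n++] = (char)('0' + rem);
    } while (limb[0] | limb[1] | limb[2] | limb[3]);

    /* Digits were produced in reverse order. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Dividing by 10 one digit at a time keeps the sketch short; a real implementation would more likely divide by a larger power of ten per pass.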

> So in summary that means that there is some work to do to make
> formatted IO of C libraries become compliant with C23. Let me know if
> and where I could help to make that happen for musl.

The big issue is probably collating the list of what's actually needed
to meet requirements, and what the ABIs for them are. If there's
cross-arch agreement on a general pattern ABIs follow for them, that
would be wonderful, and even if not entirely so, a general pattern
would advise how we structure the underlying functions (to make thunks
as minimal as possible on the largest number of archs).

Rich
