Date: Mon, 12 Jun 2023 17:28:58 -0400
From: Rich Felker <dalias@...c.org>
To: Bruno Haible <bruno@...sp.org>
Cc: musl@...ts.openwall.com
Subject: Re: swprintf %lc directive does not work for some wide characters

On Mon, Jun 12, 2023 at 10:53:24PM +0200, Bruno Haible wrote:
> Rich Felker wrote:
> > Per my reading of the specification, this is not a bug but is the
> > expected behavior.
> > 
> >     In addition, all forms of fwprintf() shall fail if:
> > 
> >     [EILSEQ]
> >             A wide-character code that does not correspond to a valid
> >             character has been detected.
> 
> From my reading of ISO C, it's a bug. Namely, in ISO C 23 § 7.31.2.3
> the error conditions are specified as
>   "The swprintf function returns the number of wide characters written
>    in the array, not counting the terminating null wide character,
>    or a negative value if
>      an encoding error occurred
>      or if n or more wide characters were requested to be written."
> 
> In swprintf, where "the wint_t argument converted to wchar_t" is written
> and the output is to a wchar_t[], no "encoding error" should be possible.
> That's obvious. The "encoding errors" occur in %c and %s directives,
> AFAIU, not in %lc and %ls directives.

You're reading this "obvious" thing that is not present in the
specification into it. I don't have the exact same text you're looking
at in front of me at the moment, but what I have from the current
standard (C11) is:

7.29.2.1 ¶14:

    "The fwprintf function returns the number of wide characters
    transmitted, or a negative value if an output or encoding error
    occurred."

7.29.2.3 ¶2:

    "The swprintf function is equivalent to fwprintf, except that the
    argument s specifies an array of wide characters into which the
    generated output is to be written, rather than written to a
    stream."

I read "equivalent to fwprintf..." as allowing swprintf to return an
error in any case where fwprintf would, unless the "except..." clause
explicitly forbids one or more of those cases (which it doesn't).
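
For callers, the practical consequence is to check the return value
of swprintf even for a bare %lc directive. The following is a minimal
sketch under that reading, not code from any libc; format_lc is a
hypothetical helper name, and 0xD800 (a surrogate code point) is only
an assumed example of a wide-character code that does not correspond
to a valid character.

```c
#include <errno.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: format a single wide character with %lc.
   Returns whatever swprintf returned; *err receives errno when the
   call failed and 0 otherwise. */
int format_lc(wint_t wc, wchar_t *buf, size_t n, int *err)
{
    errno = 0;
    int r = swprintf(buf, n, L"%lc", wc);
    *err = (r < 0) ? errno : 0;
    return r;
}
```

Calling format_lc((wint_t)0xD800, buf, 8, &err) may yield 1 on an
implementation that stores the value without conversion, or a
negative value on one that detects an encoding error. Under the
reading above, portable code has to handle both outcomes.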

Since POSIX aims not to conflict with ISO C, I would think the POSIX
position is also that this requirement does not conflict, but is
intended to allow for both implementations that don't detect the
encoding error (ones which use a wchar_t[] buffer) and ones that do
(ones which use a char[] buffer).
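
To illustrate why the two buffer strategies can differ, here is a
rough sketch. Both function names are hypothetical and neither is
taken from an actual libc. The first stores the wchar_t directly and
has no point at which an encoding error could be noticed; the second
converts through the multibyte encoding with wcrtomb, which is where
EILSEQ can arise.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wchar_t[]-buffer strategy: a plain store, so no
   validity check ever happens and the call cannot fail. */
int put_direct(wchar_t *dst, wchar_t wc)
{
    *dst = wc;
    return 0;
}

/* Hypothetical char[]-buffer strategy: the character is converted to
   its multibyte form first. dst must have room for MB_LEN_MAX bytes.
   Returns the number of bytes stored, or -1 if wcrtomb reports an
   encoding error (errno is then EILSEQ). */
int put_via_multibyte(char *dst, wchar_t wc)
{
    mbstate_t st;
    memset(&st, 0, sizeof st);   /* start in the initial conversion state */
    size_t len = wcrtomb(dst, wc, &st);
    return len == (size_t)-1 ? -1 : (int)len;
}
```

In the default "C" locale, put_via_multibyte should reject a value
such as 0xD800, while put_direct stores it without complaint.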

Rich
