Message-ID: <CADDzAfPvGXXx8vvsKAR0XenzpNQx4srF2hGoVxGuqOed-2ByqA@mail.gmail.com>
Date: Sat, 5 Apr 2025 07:32:37 +0800
From: Kang-Che Sung <explorer09@...il.com>
To: musl@...ts.openwall.com
Subject: Re: wcrtomb in UTF-8 locale should check the multibyte state

Hi.

On Sat, Apr 5, 2025 at 5:39 AM Thorsten Glaser <tg@...lvis.org> wrote:
>
> On Sat, 5 Apr 2025, Kang-Che Sung wrote:
>
> >Note: It is _allowed_ in the C standard to reuse an mbstate_t object
> >across different multibyte conversion functions. It is _not an
>
> 7.31.6 begs to differ:
>
> | If an mbstate_t object has been altered by any of the functions
> | described in this subclause, and is then used with a different
> | multibyte character sequence, or in the other conversion direction, or
> | with a different LC_CTYPE category setting than on earlier function
> | calls, the behavior is undefined.414)
>

I'm aware of that paragraph of the standard.
I may have misread what it means by "conversion direction", but I
still believe that having wcrtomb ignore the mbstate_t object is a
bad idea.

I need to correct one thing, though:
On macOS, the wcrtomb call in the example code in my last email
actually sets errno to EINVAL, not EILSEQ.
I guess some BSD implementations behave the same way (I have not
checked).
POSIX says "EINVAL: ps points to an object that contains an invalid
conversion state."
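
For illustration, here is a minimal sketch of the kind of probe being discussed. It is not the example from my earlier email; the helper names classify_wcrtomb and probe_wcrtomb_partial_state, and the locale names it tries, are assumptions made up for this sketch. It leaves an mbstate_t in the middle of a UTF-8 character via mbrtowc and then passes that state to wcrtomb. Per the paragraph quoted above, using the state in the other direction is undefined behavior, so the result only shows what one particular libc happens to do and may differ or misbehave elsewhere.

```c
#include <errno.h>
#include <langinfo.h>
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Map a wcrtomb() return value and the errno it left behind to a label. */
const char *classify_wcrtomb(size_t ret, int err)
{
    if (ret != (size_t)-1)
        return "state ignored";   /* call succeeded despite pending state */
    if (err == EINVAL)
        return "EINVAL";          /* POSIX: invalid conversion state */
    if (err == EILSEQ)
        return "EILSEQ";
    return "other error";
}

/* Returns a label, or NULL if the partial state could not be set up. */
const char *probe_wcrtomb_partial_state(void)
{
    /* Assumption: one of these locale names exists on the system. */
    static const char *const names[] = { "C.UTF-8", "en_US.UTF-8" };
    int have_utf8 = 0;

    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (setlocale(LC_CTYPE, names[i]) != NULL &&
            strcmp(nl_langinfo(CODESET), "UTF-8") == 0) {
            have_utf8 = 1;
            break;
        }
    }
    if (!have_utf8)
        return NULL;

    mbstate_t st;
    memset(&st, 0, sizeof st);

    /* Feed only the lead byte of U+00E9 (0xC3 0xA9); expect (size_t)-2. */
    wchar_t wc;
    if (mbrtowc(&wc, "\xC3", 1, &st) != (size_t)-2)
        return NULL;

    /* Undefined behavior per the C standard: same state, other direction. */
    char buf[MB_LEN_MAX];
    errno = 0;
    size_t ret = wcrtomb(buf, L'A', &st);
    return classify_wcrtomb(ret, errno);
}
```

Calling probe_wcrtomb_partial_state() from main and printing the returned label (after checking for NULL) is enough to compare implementations. An "EINVAL" label would match the POSIX wording quoted above.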
