Message-ID: <4aed8a2e-22e5-708f-1275-9814a8ffc339@redhat.com>
Date: Thu, 5 Feb 2026 21:22:54 +0000 (UTC)
From: Joseph Myers <josmyers@...hat.com>
To: libc-coord@...ts.openwall.com
cc: Keith Packard <keithp@...thp.com>
Subject: Re: c8rtowc and wcrtoc8

On Thu, 5 Feb 2026, Florian Weimer wrote:

> * Keith Packard:
>
> > Because the char8_t encoding is not stateless, something like mbstate_t
> > is required.
>
> Is this accurate?  I thought that conversion from and to UTF-8 would
> only operate on complete multibyte sequences.

Indeed, the version of "Restartable Functions for Efficient Character
Conversion" that was actually accepted into C2y (N3366 plus an editorial
correction) is explicit that "For the UTF-8, UTF-16, and UTF-32 encodings,
collectively referred to as the Unicode encodings, an indivisible unit of
work for a read operation shall be the sequence of code units that
corresponds to one Unicode code point.".

-- 
Joseph S. Myers
josmyers@...hat.com
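
Illustrative note (not part of the original message): the quoted wording
treats one whole code point as the unit of work for a read. A minimal
sketch of what that can look like for UTF-8 follows. The function name
decode_one_utf8 and its return convention are hypothetical and are not
taken from N3366 or from any libc; the sketch only shows that a decoder
which either consumes a complete sequence or consumes nothing has no
partial state to carry between calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a standard or proposed API.
 * Decodes exactly one code point from the n bytes at s.
 * Returns the number of code units consumed (1 to 4), or 0 if the
 * sequence is incomplete or ill-formed. Nothing is consumed on failure,
 * so the caller never has to remember a partially read sequence. */
static size_t decode_one_utf8(const unsigned char *s, size_t n, uint32_t *out)
{
    if (n == 0)
        return 0;

    unsigned char b0 = s[0];
    size_t len;
    uint32_t cp, min;

    if (b0 < 0x80) {                    /* single-byte (ASCII) case */
        *out = b0;
        return 1;
    } else if ((b0 & 0xE0) == 0xC0) {   /* two-byte lead */
        len = 2; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {   /* three-byte lead */
        len = 3; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {   /* four-byte lead */
        len = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
        return 0;                       /* stray continuation or invalid lead */
    }

    if (n < len)
        return 0;                       /* incomplete: do not consume anything */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    *out = cp;
    return len;
}
```

Given the three bytes E2 82 AC, this sketch consumes all three and yields
U+20AC; given only the first two, it consumes nothing and returns 0.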