Message-ID: <20250429193148.GX1827@brightrain.aerifal.cx>
Date: Tue, 29 Apr 2025 15:31:49 -0400
From: Rich Felker <dalias@...c.org>
To: Kang-Che Sung <explorer09@...il.com>
Cc: musl@...ts.openwall.com, Alejandro Colomar <alx@...nel.org>
Subject: Re: mbsnrtowcs(3) behavior not compatible with POSIX.1-2024
On Tue, Apr 29, 2025 at 05:14:54PM +0800, Kang-Che Sung wrote:
> Hi, musl libc developers,
>
> I just tested the mbsnrtowcs function in musl libc and discovered there is one
> behavior that is not compatible with the new POSIX.1-2024 standard.
>
> It's this thing: POSIX.1-2017 stated
> "If the input buffer ends with an incomplete character, it is unspecified
> whether conversion stops at the end of the previous character (if any), or at
> the end of the input buffer.
> [...] A future version may require that when the input buffer ends with an
> incomplete character, conversion stops at the end of the input buffer."
> (Reference: https://pubs.opengroup.org/onlinepubs/9699919799/functions/mbsrtowcs.html)
>
> POSIX.1-2024 now requires the conversion stop at the end of the input buffer in
> that case.
> (https://pubs.opengroup.org/onlinepubs/9799919799/functions/mbsrtowcs.html)
> (https://www.austingroupbugs.net/view.php?id=616)
>
> Test code
>
> ```c
> #include <locale.h>
> #include <stdio.h>
> #include <string.h>
> #include <wchar.h>
>
> wchar_t wcs[100];
> char mbs[100];
>
> int main()
> {
>     mbstate_t state; const char *s;
>     setlocale(LC_CTYPE, "en_US.UTF-8");
>
>     memset(&state, 0, sizeof(state));
>     // U+754C U+7DDA
>     memcpy(mbs, "\xe7\x95\x8c\xe7\xb7\x9a", 7);
>     s = mbs;
>     printf("%zu, ", mbsnrtowcs(wcs, &s, 5, 100, &state));
>     printf("%td\n", s - mbs);
>     // Expected output: "1, 5". Actual output in musl: "1, 3".
>
>     memset(&state, 0, sizeof(state));
>     memcpy(mbs, "\xe7\x95\x8c\xe7\xb7", 6);
>     s = mbs;
>     printf("%zu, ", mbsnrtowcs(wcs, &s, 6, 100, &state));
>     printf("%td\n", s - mbs);
>     // Expected output: "18446744073709551615, 3"
> }
> ```
>
> By the way, I Cc'd the Linux man pages' maintainer as I plan to suggest a patch
> to the mbsnrtowcs(3) man page. And it would be good to see the behaviors of
> mbsnrtowcs consistent between glibc and musl libc.

Does the attached patch (untested) fix it?

Rich
View attachment "mbsnrtowcs.diff" of type "text/plain" (810 bytes)
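
For readers following the thread, here is a minimal, libc-independent sketch of the two stopping points under discussion. It is not musl's or glibc's code and not the attached patch. The names `utf8_seq_len`, `scan_utf8`, and `stop_offset_2024` are hypothetical, introduced only for illustration. The sketch assumes UTF-8 input, does not treat a NUL byte as a terminator, and does not reject every ill-formed sequence (overlong forms and surrogates pass). It distinguishes a tail that is merely incomplete, where POSIX.1-2024 requires the source pointer to reach the end of the buffer, from a tail that is already invalid, where the pointer stays after the last complete character.

```c
#include <assert.h>
#include <stddef.h>

/* How the bytes after the last complete character look. */
enum tail_kind { TAIL_NONE, TAIL_INCOMPLETE, TAIL_INVALID };

struct scan_result {
    size_t complete_end;  /* offset just past the last complete character */
    enum tail_kind tail;
};

/* Sequence length implied by a UTF-8 lead byte; 0 if it cannot start one. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

/* Scan the first n bytes of s and classify whatever follows the
 * last complete character. Simplified: only lead bytes and the
 * 10xxxxxx pattern of continuation bytes are checked. */
static struct scan_result scan_utf8(const char *s, size_t n)
{
    struct scan_result r = { 0, TAIL_NONE };
    assert(s != NULL);
    while (r.complete_end < n) {
        const unsigned char *p = (const unsigned char *)s + r.complete_end;
        size_t avail = n - r.complete_end;
        size_t len = utf8_seq_len(p[0]);
        size_t have, k;

        if (len == 0) {
            r.tail = TAIL_INVALID;
            break;
        }
        have = len < avail ? len : avail;
        for (k = 1; k < have; k++) {
            if ((p[k] & 0xC0) != 0x80) {
                r.tail = TAIL_INVALID;
                return r;
            }
        }
        if (len > avail) {
            r.tail = TAIL_INCOMPLETE;  /* valid so far, but cut off */
            break;
        }
        r.complete_end += len;
    }
    return r;
}

/* Source offset after the call under the POSIX.1-2024 rule quoted
 * above: an incomplete tail is consumed, so the offset is n.
 * Otherwise it is the end of the last complete character. */
static size_t stop_offset_2024(const char *s, size_t n)
{
    struct scan_result r = scan_utf8(s, n);
    return r.tail == TAIL_INCOMPLETE ? n : r.complete_end;
}
```

With the first test input and a limit of 5 bytes, `scan_utf8` reports `complete_end` of 3 (the offset musl currently returns) and `stop_offset_2024` gives 5. With the second input and a limit of 6 bytes, the NUL lands where a continuation byte is expected, so the tail is invalid rather than incomplete and the offset stays at 3.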