Date: Wed, 29 Jul 2020 19:12:52 -0500
From: "A. Wilcox" <awilfox@...lielinux.org>
To: musl@...ts.openwall.com, bug-bison@....org
Subject: Re: Building Bison 3.7 with musl (was Re: portability issues
 with unicodeio)

On 29/07/2020 19:05, Rich Felker wrote:
> On Wed, Jul 29, 2020 at 06:23:19PM -0500, A. Wilcox wrote:
>> Seeing some weird behaviour here building Bison 3.7 on musl libc.
>>
>> Something seems to be "intelligent" enough to know that \u2022 is a
>> bullet character, and is replacing it with "*" instead of ".", causing
>> all the tests to fail:
>>
>> awilcox on gwyn [17] bison: LC_ALL=C /bin/printf '\u2022\n' | od -t x1
>> 0000000 2a 0a
>> 0000002
> 
> I don't think the '*' has anything to do with it being a bullet
> character. It's just the implementation-defined replacement character
> musl's iconv uses.
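
For reference, the two bytes in the od output above can be decoded by hand. The following is a minimal Python sketch (standard library only; the variable names are just for illustration) that compares them with the UTF-8 encoding of U+2022:

```python
# Bytes reported by `od -t x1` in the LC_ALL=C run quoted above.
observed = bytes.fromhex("2a 0a")

# In ASCII, 0x2a is '*' and 0x0a is a newline.
decoded = observed.decode("ascii")

# What U+2022 (BULLET) looks like when it is emitted as UTF-8 instead.
bullet_utf8 = "\u2022".encode("utf-8")

print(repr(decoded))         # '*\n'
print(bullet_utf8.hex(" "))  # e2 80 a2
```

So the bullet was converted to a single ASCII '*' rather than passed through or dropped.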


Ah, ok.


> I would guess the code in bison and coreutils printf is assuming the
> non-conforming glibc behavior for iconv of returning an error if a
> character from the input is not exactly representable in the output,
> rather than making replacements and returning the number of inexact
> conversions made.
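
The two behaviours described above can be modelled without touching any libc. The following Python sketch is only an illustration: `convert_strict` and `convert_replace` are hypothetical helpers, not iconv bindings, and the '*' replacement is hard-coded as an assumption to mirror the od output earlier in this thread.

```python
from typing import Tuple

# Assumption: mirrors the replacement byte seen in the od output.
REPLACEMENT = b"*"


def convert_strict(text: str) -> bytes:
    """Model of the 'return an error' behaviour: fail on the first
    character that has no exact ASCII representation."""
    return text.encode("ascii")  # raises UnicodeEncodeError on failure


def convert_replace(text: str) -> Tuple[bytes, int]:
    """Model of the 'replace and count' behaviour: substitute a fixed
    byte for each unrepresentable character and report how many
    inexact conversions were made."""
    out = bytearray()
    inexact = 0
    for ch in text:
        if ord(ch) < 128:
            out.append(ord(ch))
        else:
            out += REPLACEMENT
            inexact += 1
    return bytes(out), inexact


try:
    convert_strict("\u2022\n")
except UnicodeEncodeError:
    print("strict: conversion failed")

print("replace:", convert_replace("\u2022\n"))  # replace: (b'*\n', 1)
```

A caller written against the first model only checks for failure. Under the second model it never sees a failure, so it would have to inspect the count to notice that a substitution happened.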


Actually, the test suite is assuming iconv will replace \u2022 with '.',
and it fails because musl substitutes '*' instead:


@@ -1,9 +1,9 @@
 State 0

-    0 $accept: . S $end
-    1 S: . 'a' A 'a'
-    2  | . 'b' A 'b'
-    3  | . 'c' c
+    0 $accept: * S $end
+    1 S: * 'a' A 'a'
+    2  | * 'b' A 'b'
+    3  | * 'c' c

     'a'  shift, and go to state 1
     'b'  shift, and go to state 2



This test gets more and more "fun" the more platforms it's ported to.

--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org



