musl - Re: Re: a bug in bindtextdomain() and strip '.UTF-8'



Message-ID: <CAPG2z0-8NoD14Mm=ErQ2SO1A+HSV1McyFxSRUfB4X_nPiXB-eA@mail.gmail.com>
Date: Mon, 13 Feb 2017 22:06:49 +0800
From: He X <xw897002528@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Re: a bug in bindtextdomain() and strip '.UTF-8'

No, this is on musl. I just tested it with my patches: with vim, stripping
'.UTF-8' leads to unknown characters.
What I mean is that vim's .mo files under zh_CN/ are GBK-encoded, while the
zh_CN/ files of other apps are UTF-8. That means there may be other apps like
vim, so we should be more cautious: add a check before mapping the .mo files,
and fail on non-UTF-8 charsets in setlocale.
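
A rough sketch of the kind of check I mean, not a patch against musl. It
assumes the caller has already fetched the catalog's header entry (the
translation of the empty msgid), which normally carries a line such as
"Content-Type: text/plain; charset=UTF-8". The function name
mo_header_is_utf8 is made up for illustration, and it only matches the
lowercase key "charset=":

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not part of musl): returns 1 if the .mo header
 * entry declares a UTF-8 charset, 0 otherwise or if none is declared. */
int mo_header_is_utf8(const char *header)
{
    static const char key[] = "charset=";
    static const char want[] = "utf-8";
    const char *p;
    size_t i, n;

    if (!header) return 0;

    /* Locate the charset declaration inside the header text. */
    p = strstr(header, key);
    if (!p) return 0;
    p += sizeof key - 1;

    /* The charset name ends at whitespace, a newline or ';'. */
    n = strcspn(p, " \t\r\n;");
    if (n != sizeof want - 1) return 0;

    /* Compare case-insensitively, since "utf-8" and "UTF-8" both occur. */
    for (i = 0; i < n; i++)
        if (tolower((unsigned char)p[i]) != want[i]) return 0;
    return 1;
}
```

A catalog like vim's zh_CN one, declaring charset=GBK, would be rejected by
this check, and the caller could then refuse to map it.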

Btw, _nl_msg_cat_cntr & _nl_domain_bindings block apps from compiling against
musl's native intl. After I added dummy definitions for these two symbols,
GNU tar segfaulted, because it passes a NULL msgid1, which makes __mo_lookup
segfault. We should add a check in dcngettext to avoid that
(if (!msgid1) goto notrans;):

 #2  0x00007ffff7d82a6f in dcngettext (domainname=0x6737a0 "tar", msgid1=0x0,
     msgid2=0x0, n=1, category=5) at src/locale/dcngettext.c:211
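
To show the control flow of the guard, here is a standalone sketch. It is not
the real src/locale/dcngettext.c: lookup_stub and sketch_dcngettext are
made-up names, and the stub only stands in for a lookup that dereferences
msgid the way __mo_lookup does. The notrans fallback follows the usual
ngettext rule of returning msgid1 for n == 1 and msgid2 otherwise:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the catalog lookup. Like the real lookup it
 * reads through msgid, so it must never be handed a NULL pointer. */
static const char *lookup_stub(const char *msgid)
{
    return strcmp(msgid, "file") == 0 ? "translated-file" : NULL;
}

/* Sketch of the proposed guard: bail out before the lookup when msgid1
 * is NULL, and fall back to the untranslated strings. */
const char *sketch_dcngettext(const char *msgid1, const char *msgid2,
                              unsigned long n)
{
    const char *trans;

    if (!msgid1) goto notrans;   /* the proposed check */

    trans = lookup_stub(msgid1);
    if (!trans) goto notrans;    /* no translation found */
    return trans;

notrans:
    return n == 1 ? msgid1 : msgid2;
}
```

With the arguments from the backtrace (msgid1 and msgid2 both NULL, n=1) this
returns NULL instead of crashing inside the lookup.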


2017-02-13 21:28 GMT+08:00 Rich Felker <dalias@...c.org>:

> On Mon, Feb 13, 2017 at 04:01:31PM +0800, He X wrote:
> > New find, as you can see, zh_CN is different from zh_CN.UTF-8, it's GBK
> > codeset, we can't strip .UTF-8 easily, or we will get a lot of junk:
>
> That's on glibc; your "finding" is irrelevant to musl, where the
> encoding for all locales is UTF-8.
>
> Rich
>

