Message-ID: <20260403200648.GQ1827@brightrain.aerifal.cx>
Date: Fri, 3 Apr 2026 16:06:48 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: iconv GB18030 DoS issue

On Fri, Apr 03, 2026 at 12:01:25AM -0400, Rich Felker wrote:
> there are still a small number of mappings that are incorrect due to
> late changes made in the definition of gb18030, swapping PUA
> codepoints with proper Unicode characters. correcting these requires a
> postprocessing step that will be added later.

To be specific, what we currently implement is the original 2000
version of GB 18030. Subsequent revisions in 2005 and 2022 swapped
PUA codepoints, which had been used as stand-ins for characters
Unicode hadn't assigned yet, with the new assignments, so that the
correct characters end up in the 2-byte range. I've now validated
that what's in musl matches the normative GB 18030-2000 xml
definition at
https://raw.githubusercontent.com/unicode-org/icu-data/tags/tzu-1-4-0/charset/data/xml/gb-18030-2000.xml
so applying the swaps to get the 2022 version should be easy if we
want to.
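
As a rough sketch of what such a postprocessing step could look like
(this is not musl's code, which is C): since each change exchanges two
codepoints, the fixup can be an involution applied on the Unicode side
of the 2000 table, so the same function would serve both decoding and
encoding. The names SWAP_PAIRS and fixup are hypothetical, and the one
pair listed (U+E7C7 and U+1E3F, understood to be the 2005 change) is
illustrative only; a real table would have to be taken from the
standard.

```python
# Hypothetical swap table: each pair exchanges a PUA codepoint with the
# Unicode character that a later revision assigns in its place.
# Only one illustrative pair is listed; this is NOT the full set of changes.
SWAP_PAIRS = [
    (0xE7C7, 0x1E3F),
]

# Build a symmetric lookup so the mapping is its own inverse.
SWAP = {}
for pua, assigned in SWAP_PAIRS:
    SWAP[pua] = assigned
    SWAP[assigned] = pua


def fixup(codepoint):
    """Map a GB 18030-2000 codepoint to its later-revision counterpart.

    Codepoints not in the swap table pass through unchanged. Because
    the table is symmetric, applying fixup twice gives back the input.
    """
    return SWAP.get(codepoint, codepoint)
```

When decoding, fixup would run on the codepoint produced by the 2000
table; when encoding, it would run on the input codepoint before the
2000 table lookup.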

Rich
