Date: Wed, 26 Feb 2014 11:58:49 +0000
From: Alan Hourihane <alanh@...rlite.co.uk>
To: musl@...ts.openwall.com
Subject: Re: CP850 & IBM850 codepages

On 02/25/14 22:39, Rich Felker wrote:
> On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
>>> Adding cp850 and other DOS codepages should not be hard and should not
>>> take up much additional size in iconv, but it's also nontrivial to do
>>> without my tools to generate the tables, which are not published.
>>> Publishing them is something I should really get around to doing,
>>> since their absence affects the ability of others to modify the code
>>> in meaningful ways; I need to apologize for not doing so already.
>>>
>> O.k. that makes sense as I couldn't understand the format. :-)
> The format is basically this: legacy_chars is a table of all
> codepoints that ever appear in a supported legacy codepage, with a
> limit of 1024 total codepoints. The individual codepage tables are 10
> bits per entry and map into this table, and they omit the initial
> subrange that's identical to Latin-1 (and thus a one-to-one mapping to
> Unicode). I have tools that automatically generate these from the
> Unicode txt files containing the mappings.
>
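
The layout Rich describes can be sketched roughly as below. This is only an
illustration of the idea, not musl's actual source: the names shared_chars,
put10, get10 and decode_byte are hypothetical, the bit order inside the
packed table is an assumption, and the four sample code points are just the
first four high-half CP850 characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared table of code points.
 * The real table may hold up to 1024 entries (10-bit indices). */
const uint16_t shared_chars[] = { 0x00C7, 0x00FC, 0x00E9, 0x00E2 };
#define SHARED_COUNT (sizeof shared_chars / sizeof shared_chars[0])

/* Store a 10-bit value as entry i of a zero-initialised packed table.
 * Assumed layout: entries are laid end to end, least significant bit first. */
void put10(unsigned char *tab, size_t i, unsigned v)
{
    assert(v < 1024);
    size_t bit = i * 10, byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);   /* always 0, 2, 4 or 6 */
    unsigned w = v << shift;                /* fits in 16 bits */
    tab[byte]     |= (unsigned char)(w & 0xFF);
    tab[byte + 1] |= (unsigned char)(w >> 8);
}

/* Read entry i back out of the packed table. */
unsigned get10(const unsigned char *tab, size_t i)
{
    size_t bit = i * 10, byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    unsigned w = (unsigned)tab[byte] | ((unsigned)tab[byte + 1] << 8);
    return (w >> shift) & 0x3FF;
}

/* Map one legacy byte to a code point. Bytes below `first` belong to the
 * omitted initial subrange and map to themselves; the rest go through the
 * per-codepage table into the shared table. */
unsigned decode_byte(const unsigned char *tab, unsigned first, unsigned char c)
{
    if (c < first)
        return c;
    unsigned idx = get10(tab, c - first);
    assert(idx < SHARED_COUNT);
    return shared_chars[idx];
}
```

With a table built by calling put10(tab, n, n) for n = 0..3 and first set to
0x80, decode_byte passes ASCII bytes through unchanged and maps 0x80..0x83
through the shared table.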

Thanks, Rich. I'll keep an eye out for the cp850/ibm850 table to land
once you've had a chance to run your tools.

Alan.
