Date: Fri, 25 Oct 2019 10:15:14 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] Update ctype data to Unicode 12.1.0

On Wed, Oct 23, 2019 at 07:21:35PM +0300, Eleftherios Kritikos wrote:
> Hi all,
>
> I wanted to mention that I have used the code for `wcwidth` and for
> generating Unicode data tables from musl in the Haskell library
> vty (a ncurses style library).
>
> Relevant files in the MR:
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-ab3908e00d1c13397ed03e5c2213ad8bR5
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-a06fd5aeeca6d7dac0278c2537eb1950R1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-86acb7ffecd1a09c5f55892bd0ce13b1R1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-dc77683ad25ad6f509fb58a397c93f4aR1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-9879d6db96fd29134fc802214163b95aR32
>
> Thanks Rich Felker and everyone else for all the good work that has
> gone into musl!
>
> Please let me know if you think attribution was not properly given.
>
> 1. http://git.musl-libc.org/cgit/musl/tree/src/ctype/wcwidth.c?id=9b2921bea1d5017832e1b45d1fd64220047a9802
> 2. https://github.com/richfelker/musl-chartable-tools/tree/master/ctype
> 3. https://github.com/jtdaugherty/vty

Great! I love seeing code/concepts from musl getting adopted elsewhere
especially in places where the classic solutions were all much larger.

Just a quick update on why I haven't merged this yet: I went to do the
case mappings too, and found that at least one range, I believe the
one that would be CASEMAP(0x1c90,0x1cba,0x10d0), is not representable
in the current code that requires updating by hand (it could be done
on a char-by-char basis but continuing to expand that part makes the
file grow larger and slower very quickly). So, I'm pulling back up the
proposed replacement code from April 2018 that never got finished and
merged.
The old thread is here:

https://www.openwall.com/lists/musl/2018/04/05/1

It's moderately larger -- ~4.8k instead of ~1.5k for Unicode 10 -- but
O(1) rather than O(n) (n = # of case mappings), about 10x faster, and
programmatically generated from UnicodeData.txt. I'll add the (awful,
ugly, just like everything else in musl-chartable-tools) code for
generating the table to musl-chartable-tools when I merge it so it's
not a black box.

I have it working now, so as long as I don't hit any unexpected
problems testing I'll get this (and your patch, and updating case
mappings to Unicode 12) merged soon. Thanks again for sending the
patch and pinging this.

Rich
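
Editorial illustration (not part of the original message): the sketch below
shows one way a hand-maintained range table can fail to represent
CASEMAP(0x1c90,0x1cba,0x10d0), and why scanning such a table is O(n) in the
number of entries. The entry layout (16-bit start, signed 8-bit offset, 8-bit
length) is an assumption made for illustration, and the names `case_range`,
`range_fits`, and `lower_linear` are hypothetical; the message does not say
which field overflows in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact entry: a run of uppercase code points and the
 * offset to the matching lowercase run. The field widths are assumed. */
struct case_range {
	uint16_t upper;  /* first uppercase code point of the run */
	int8_t   delta;  /* lower - upper; must fit in [-128, 127] */
	uint8_t  len;    /* number of code points in the run */
};

/* Returns 1 if the mapping [u1,u2] -> l.. fits in struct case_range. */
static int range_fits(uint32_t u1, uint32_t u2, uint32_t l)
{
	assert(u1 <= u2);
	long delta = (long)l - (long)u1;
	unsigned long len = (unsigned long)u2 - u1 + 1;
	return u1 <= UINT16_MAX
	    && delta >= INT8_MIN && delta <= INT8_MAX
	    && len <= UINT8_MAX;
}

/* A few runs that do fit the assumed layout. */
static const struct case_range ranges[] = {
	{ 0x0041, 32, 26 },  /* Latin A..Z */
	{ 0x0391, 32, 17 },  /* Greek Alpha..Rho */
	{ 0x0410, 32, 32 },  /* Cyrillic A..Ya */
};

/* Linear scan: cost grows with the number of table entries, O(n). */
static uint32_t lower_linear(uint32_t c)
{
	for (size_t i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
		uint32_t off = c - ranges[i].upper;  /* wraps if c < upper */
		if (off < ranges[i].len)
			return c + ranges[i].delta;
	}
	return c;  /* no mapping found: return the input unchanged */
}
```

Under this assumed layout, range_fits(0x1c90, 0x1cba, 0x10d0) returns 0
because the offset 0x10d0 - 0x1c90 is -3008, far outside a signed 8-bit
field, while the run length (43) would fit. A directly indexed table avoids
both the offset limit and the scan, at the cost of the larger size mentioned
in the message.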