Message-ID: <20250529155930.GA1827@brightrain.aerifal.cx>
Date: Thu, 29 May 2025 11:59:30 -0400
From: Rich Felker <dalias@...c.org>
To: Thorsten Glaser <tg@...bsd.de>
Cc: musl@...ts.openwall.com
Subject: Re: Collation, IDN, and Unicode normalization

On Thu, May 29, 2025 at 02:15:59PM +0000, Thorsten Glaser wrote:
> Rich Felker dixit:
>
> >Extending that further, there are only 135 64-codepoint blocks. But 84
> >of those just have 1-3 codepoints in them, the rest still being dead
>
> Is it an option to do blocks on the higher level and once you reach
> the block of a certain size (64? 128? 256?) do a linear search, if
> most really have that few codepoints?
>
> Hm probably would suck for the 0300 block though.
>
> Just had that crazy idea, not sure if it’s viable.

Yeah that's the problem with linear (or rather binary, no reason to do
linear) search options: the places you take a performance hit are
exactly the ones where all the common characters lie.

Rich
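
For illustration only, here is a rough sketch of the hybrid lookup discussed in this message: find the 64-codepoint block first, then binary-search the populated codepoints inside it. Everything in it is an assumption made for the example, not musl code: the struct layout, the names (struct block, lookup), the binary search used at the top level, and the table contents, which are made-up values rather than real Unicode data. The probe counter is there only to make the cost visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one entry per 64-codepoint block that has data. */
struct block {
    uint32_t base;          /* first codepoint of the block (multiple of 64) */
    uint8_t count;          /* number of populated codepoints in the block */
    const uint8_t *offsets; /* sorted offsets (0..63) of populated codepoints */
    const uint8_t *values;  /* values[i] is the value for offsets[i] */
};

/* Made-up table contents for illustration; NOT real Unicode data. */
static const uint8_t dense_off[] = { 0, 1, 2, 3, 4, 5, 6, 7 };
static const uint8_t dense_val[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
static const uint8_t sparse_off[] = { 60 };
static const uint8_t sparse_val[] = { 9 };

/* Must stay sorted by base for the top-level search. */
static const struct block blocks[] = {
    { 0x0300, 8, dense_off, dense_val },   /* densely populated block */
    { 0x0900, 1, sparse_off, sparse_val }, /* block with one codepoint */
};

/*
 * Return the value stored for cp, or 0 if there is none.
 * *probes receives the number of comparisons made inside the block.
 */
static int lookup(uint32_t cp, unsigned *probes)
{
    assert(probes != NULL);
    *probes = 0;

    /* Level 1: locate the block containing cp. */
    uint32_t base = cp & ~(uint32_t)63;
    size_t lo = 0, hi = sizeof blocks / sizeof blocks[0];
    const struct block *b = NULL;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (blocks[mid].base == base) {
            b = &blocks[mid];
            break;
        }
        if (blocks[mid].base < base)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (!b)
        return 0;

    /* Level 2: binary search among the populated offsets of the block. */
    uint8_t off = (uint8_t)(cp & 63);
    lo = 0;
    hi = b->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        ++*probes;
        if (b->offsets[mid] == off)
            return b->values[mid];
        if (b->offsets[mid] < off)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

In this toy table, a lookup in the single-entry block finishes after one probe, while lookups in the dense block need up to four. That is the tradeoff described above: the extra work falls on the densely populated blocks.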