Date: Tue, 18 Feb 2020 22:36:04 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Locale support considered harmful noise

On Tue, Feb 18, 2020 at 07:38:29PM +0000, Jacob Welsh wrote:
> Hello,
>
> In TMSR we've made extensive use of musl, due to the very welcome
> dose of clear and concise code it provides as compared to the
> competition. For example we have a static Ada compiler, the
> Bitcoin reference implementation, a reproducible and
> self-contained Gentoo system, and not least of all my own
> distribution used in my consulting business.
>
> However, the apparent goal of aggressive expansion of Unicode and
> localization "features" in musl sets off alarms; for instance, on
> the roadmap I see:

I think you're rather under-informed on this topic. Basically none of
the following add any complexity:

> >Unicode 12.1 update and related character handling work

This was (1) an update of existing tables and (2) throwing out
hand-written case mapping code that made lots of fragile assumptions
and had to be updated by hand with every addition of new case
mappings, and that got slower with each addition, and replacing it
with a table-based approach I'd designed a year or so ago that's more
like the rest of the character tables and admits automatic generation.

> >Locale support overhaul.

This is not adding anything new but fixing bugs where the code that's
already there doesn't work as intended.

> >Hostname resolver support for non-ASCII domains (IDN)
> >
> >LC_COLLATE support for collation orders other than simple codepoint order

These have been serious missing functionality since the beginning.
There is no change here. If you missed them being on the roadmap for
the past 6+ years, you weren't looking very closely.

> >Support for LC_MONETARY and LC_NUMERIC properties.

This is the only item that's controversial, but you don't seem to be
coming from a good position to have input on it.

> >Message translation support for dynamic linker

This has also been on the agenda for a long time. It's the only place
in musl where format strings containing natural-language text are
used, and format strings are not candidates for translation because
it's unsafe (data can replace format specifiers with incompatible
ones), making it inconsistent with the rest of musl which does have
message translation support.

> >Locale data and libc message translations

This is purely a matter of creating data to be used with functionality
that already exists.

> We think this is such a bad idea that it threatens to undermine
> musl's otherwise substantial virtues. This kind of bloat imposes
> real costs on the users that matter - namely the literate ones, who
> value predictable, stable and bug-free code - in exchange for
> entirely unclear benefits.

If you think the above imply bloat, musl must already be bloated.

You should probably be aware that first-class support for all
characters in Unicode (vs glibc's bloated gconv-plugin layer for UTF-8
which originally made GNU grep over 100x slower than in 8-bit codepage
locales) was _THE_ original motivation for what became musl. None of
this is new. Not treating users like they're "illiterate" if they want
to be able to write their own name has always been the most important
core value of the project, and your attitude towards the matter here
does not make me interested in going out of my way to cater to you. I
suspect others in this community feel similarly.

> Especially considering the rate at which bugs are still turning up,
> there is no justification for this added complexity. In any event we
> will not be using "upgrades" that import additional nonsense into
> this critical system component.

If you want to stick with old versions and maintain them yourself or
pay someone else to do so, that's your choice.

Rich