Date: Mon, 5 Mar 2018 21:42:49 +0300
From: "Konstantin P." <ria.freelander@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Draft proposed locale changes

Can you publish an official po file for musl after the proposed changes?

On Mon, Mar 5, 2018 at 9:39 PM, Rich Felker <dalias@...c.org> wrote:
>
> localeconv/LC_NUMERIC/LC_MONETARY
>
> Each loaded locale needs an immutable lconv structure to represent
> this data. It needs to be allocated with the locale (at locale loading
> time) since localeconv() has no provision for failure, but we can wait
> to populate it lazily, and we can put the code to populate it in
> localeconv.c so that static-linked programs that don't use this
> rarely-used interface don't have to pay for it. We could also omit
> even allocating it (56/96 bytes) if localeconv.o is not linked, but
> it's probably not worth the special-casing code to do that.
>
> The localeconv structure should be part of struct __locale_map, not
> struct __locale_struct, since it's a pure function of the data in the
> memory-mapped locale file and not a function of how that data is
> linked to a specific locale category. Putting it in __locale_struct
> would just complicate setlocale and newlocale.
>
> The obvious (but not terribly efficient) form for the data in the
> locale file is to have each lconv field as a mo-level key, as in:
>
> msgid "int_frac_digits"
> msgstr "2"
>
> A more compact form could pack them all into one, but then the order
> becomes a hidden locale-file interface boundary/ABI.
>
> For the string fields it's necessary that they each be in-place
> strings in the mo file. grouping and mon_grouping also have the
> special constraint that they need to vary by whether the arch uses
> signed or unsigned plain-char (since CHAR_MAX has special meaning) so
> the mo file needs to store both versions. That's ugly but I don't see
> any good way around it. We can probably punt on this for now just by
> not supporting grouping (i.e.
> only supporting locale definitions that
> don't do grouping), since it's not implemented anyway.
>
> If we support decimal_point, it should not go through the localeconv
> mechanism since it would always be needed by printf and strtod.
> Instead __get_locale should probe it right away and set a 1-bit flag
> in the __locale_map structure for these functions to consume (1-bit
> based on previous research that [.,] are the only values).
>
>
>
> nl_langinfo/LC_TIME/etc.
>
> Eliminate the currently-present wrong values for ERA* and related
> LC_TIME stuff; that gets rid of all ambiguous translation keys except
> "May". Bikeshed up some alternate key for May.
>
>
>
> strerror/LC_MESSAGES
>
> Not sure yet. One radical idea I kinda like is removing all the
> English-phrase messages from libc core and just having strerror
> produce strings like "ENOENT", "EPERM", etc. in the C locale. This
> seems to be the only option that wouldn't either moderately increase
> libc size or require translation files to match the exact current text
> in the builtin English libc messages. Users who want the current
> messages would then need an "en" locale with contents like:
>
> msgid "ENOENT"
> msgstr "No such file or directory"
>
> If we don't want this, the possible solutions look like one of:
>
> 1. Prepending the error code and a null byte (e.g. "ENOENT\0") to all
> the existing error strings, then skipping past it if the translation
> was not found.
>
> 2. Putting a second version of strerror in locale_map.c with the E*
> names in it, so it's only linked if you use locale. I strongly dislike
> this approach because it greatly increases the marginal size cost of
> doing the right thing (calling setlocale) and imposes the cost even if
> you don't use strerror at all (only setlocale).
>
> 3. Accepting that translations need to match (and perpetually be
> updated to match) error strings in musl __strerror.h. I don't like
> this much either.
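
To make option 1 concrete, here is a rough sketch of how a packed
message might be consumed. It is only an illustration, not code from
musl: struct entry, lookup_translation() and message_for() are
hypothetical names, and the linear scan merely stands in for whatever
lookup the loaded locale really provides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a loaded locale: key/translation pairs. */
struct entry { const char *key, *text; };

/* Return the translation for key, or NULL if the locale has none. */
static const char *lookup_translation(const struct entry *tab, size_t n,
                                      const char *key)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(tab[i].key, key)) return tab[i].text;
	return NULL;
}

/* Option 1 layout: error name, a null byte, then the English text. */
static const char msg_enoent[] = "ENOENT\0" "No such file or directory";

/* Use the leading name as the translation key. If no translation is
 * found, skip past the name and its null byte to the builtin text. */
static const char *message_for(const char *packed,
                               const struct entry *tab, size_t n)
{
	const char *t = lookup_translation(tab, n, packed);
	if (t) return t;
	return packed + strlen(packed) + 1;
}
```

With no locale loaded, message_for(msg_enoent, NULL, 0) falls through
to the English text; with a table containing an "ENOENT" entry it
returns that entry's text instead.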
>
> So I think it should be between options 1 and "zero" above. Option
> zero decreases the size of libc by nearly 1k (removing messages) but
> changes the behavior. Option 1 increases the size of libc by about 1k.
>
>
>
> LC_COLLATE
>
> No specific proposal yet. We need a data structure to map characters
> and sequences of characters to collating elements. Obviously the mo
> file's lookups could be used directly (O(log n), improved avg case if
> we ever add hash table support) but they might be heavier than we
> want. The alternative would be having a gigantic string in the mo file
> that's just "compiled" collation table data, but unless it's
> well-designed that seems like an undesirable permanent interface
> boundary.
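
For reference, the O(log n) lookup mentioned under LC_COLLATE amounts
to a binary search over keys kept in sorted order. The sketch below
shows only the idea, using keys from the examples above. It assumes an
in-memory array of pointers, which is not how a real mo file is laid
out, and the names pair, table, cmp_key and lookup are made up for the
illustration.

```c
#include <stdlib.h>
#include <string.h>

struct pair { const char *key, *val; };

/* Must stay sorted in strcmp() order of key, or bsearch() will miss
 * entries. Uppercase letters sort before lowercase in ASCII. */
static const struct pair table[] = {
	{ "ENOENT", "No such file or directory" },
	{ "EPERM", "Operation not permitted" },
	{ "int_frac_digits", "2" },
};

/* bsearch() passes the search key first and a table element second. */
static int cmp_key(const void *k, const void *elem)
{
	return strcmp(k, ((const struct pair *)elem)->key);
}

/* Return the value stored for key, or NULL if the key is absent. */
static const char *lookup(const char *key)
{
	const struct pair *p = bsearch(key, table,
		sizeof table / sizeof *table, sizeof *table, cmp_key);
	return p ? p->val : NULL;
}
```

Every collation query would pay one such search per character or
character sequence, which is the cost being weighed against a single
precompiled table.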