Date: Wed, 23 Jul 2014 11:50:31 +0200
From: u-igbb@...ey.se
To: musl@...ts.openwall.com
Subject: Re: Locale bikeshed time

On Tue, Jul 22, 2014 at 04:35:40PM -0400, Rich Felker wrote:
> Having one variable configure multiple things is usually error-prone
> and inflexible.

+1

> > By the way, please do not follow the way of a single big file.
> > For systems which rely on file boundaries to reflect data clustering

> I hadn't even considered this aspect, but I think the whole concept of
> a single big file is undesirable with data that's naturally subject to
> change over time, and where the data comes from multiple sources. So I
> wasn't really considering that option anyway.

Nice.

> > I actually do mix categories from different locales.
> > No problem as long as the files are small.
> 
> Note that if you're just mixing "ll_TT" and "C", there wouldn't be any
> cost anyway since the C locale (and its aliases) are builtin and never
> loaded from a file. Where I was thinking you might see duplication is

Sure. This certainly covers most of my preferences, but I was thinking of
LANG=l1_T1 and LC_SOMETHING=l2_T2 [and LC_SOMETHINGELSE=l3_T3].
This would result in pulling in two or three locale data files but the
overhead is presumably negligible.
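
To make the "two or three files" count concrete, here is a minimal sketch
of the usual POSIX precedence (LC_ALL, then the category's own variable,
then LANG). It is not musl's code; the function names and the BUILTIN set
(which names count as aliases of "C") are assumptions for illustration.

```python
# Sketch only: POSIX-style precedence, not musl's actual implementation.
CATEGORIES = ("LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME")

# Assumption: these names denote the builtin locale and need no file.
BUILTIN = {"C", "POSIX"}


def effective_locales(env):
    """Map each category to the locale name that applies to it."""
    result = {}
    for cat in CATEGORIES:
        # LC_ALL wins, then the category's own variable, then LANG.
        # Empty values are treated the same as unset ones.
        result[cat] = (env.get("LC_ALL") or env.get(cat)
                       or env.get("LANG") or "C")
    return result


def files_needed(env):
    """Distinct locale names that would have to be loaded from a file."""
    return {name for name in effective_locales(env).values()
            if name not in BUILTIN}


if __name__ == "__main__":
    env = {"LANG": "l1_T1", "LC_TIME": "l2_T2", "LC_COLLATE": "l3_T3"}
    print(sorted(files_needed(env)))
```

With one file per locale, the size of that set is the number of files
opened; mixing only "ll_TT" and "C" keeps it at one.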

> for things like: LC_ALL=ll_TT@...ifier where modifier is really just
> an alternate for one category (e.g. ISO date format for time, alt
> collation order, etc.), but the file ends up storing duplicates of all
> the data from other categories. However, I think the alternate
> preferred usage here would be to provide a file for just the category
> being overridden that does not contain the base data and require users
> to set the individual categories, like what you're doing, e.g.

> LANG=ll_TT LC_TIME=ll_TT@...date

This means that most of the time a single locale file will be opened,
sometimes more, and in extreme cases as many as there are categories,
with the files also differing in "completeness". This would certainly
confuse both administrators and users.

For the sake of uniformity I would possibly prefer to see only the
"thinner" files defining exactly one category, instead of different
files having different numbers of included categories.

But most of all I'd support your approach of including all information in
each file. This is "least confusing" and quite efficient. The overhead
is mostly static storage (not noticeable in our setup and probably not
much anyway :) and the run time overhead affects just the minority of
users who mix locales/categories. (Oh btw as a nice bonus this makes
the file boundaries correspond to the data usage patterns).

To summarize my view,

- a file per locale, with all categories included  best
- a file per category                              acceptable
- files with differing data subsets                please don't
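
A rough sketch of how the first two layouts in the list above compare in
the number of files opened. It assumes that every category in use needs
one open under the per-category layout and that "C"/"POSIX" are builtin;
both are assumptions for illustration, and the names are hypothetical.

```python
# Sketch only: counts file opens under two hypothetical layouts.
BUILTIN = {"C", "POSIX"}  # assumption: builtin, never read from disk


def files_opened(assignment):
    """assignment maps a category name to its effective locale name."""
    loaded = {cat: name for cat, name in assignment.items()
              if name not in BUILTIN}
    return {
        # One file per locale: each distinct locale is opened once.
        "per_locale": len(set(loaded.values())),
        # One file per category: each non-builtin category is opened.
        "per_category": len(loaded),
    }


if __name__ == "__main__":
    cats = ("LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
            "LC_MONETARY", "LC_NUMERIC", "LC_TIME")
    uniform = {cat: "ll_TT" for cat in cats}
    print(files_opened(uniform))
```

Under these assumptions the common case (one locale for everything) costs
one open with a file per locale, and one per category otherwise.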

> rather than:
> 
> LC_ALL=ll_TT@...date

In a real scenario it would probably be
 LANG=ll_TT@...date
and this feels OK.

> Rich

Regards,
Rune
