Date: Wed, 11 Sep 2019 11:38:53 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: printf doesn't respect locale

On Wed, Sep 11, 2019 at 05:15:45PM +0200, Jens Gustedt wrote:
> Hello Rich,
> 
> On Wed, 11 Sep 2019 09:47:27 -0400 Rich Felker <dalias@...c.org> wrote:
> 
> > > > An alternative/additional solution, which I actually might like
> > > > better, is having a function which sets a thread-local flag to
> > > > treat certain locale properties (at least the problematic
> > > > LC_NUMERIC ones) as if the current locale were "C". This is
> > > > weaker than the uselocale API from POSIX, but doesn't have the
> > > > problems with the possibility of failure (likely with no way to
> > > > make forward progress) like it does, and more importantly, would
> > > > avoid *breaking* m17n/i18n functionality by turning off other
> > > > unrelated, non-problematic locale features. Application or
> > > > library code could then just set/restore this flag around
> > > > *printf/*scanf/strto*/etc calls, or could set it and leave it if
> > > > they never want to see ',' again.  
> > > 
> > > Interesting.
> > > 
> > > Would this be difficult to implement in musl? (I guess not)  
> > 
> > I would think not, but I'd have to look at the details a little more.
> > 
> > One other advantage of this approach is that it has a more graceful
> > fallback. If an application needs portable LC_NUMERIC behavior, it can
> > check at build time for the presence of the new interface. If present,
> > LC_NUMERIC can be set to "" (user's preference) and the new interface
> > can be used to get the needed behavior. If absent, the application can
> > refrain from setting LC_NUMERIC, only setting the other categories and
> > leaving it as "C" (default).
> > 
> > Note that having it be thread-locally stateful is, in my opinion, much
> > better than having new variants of the affected functions or new
> > formats, since a caller using LC_NUMERIC can set/restore the state to
> > safely call library code that's completely unaware of the new
> > interfaces.
> > 
> > Of course there may be complications I haven't thought of. One that
> > comes to mind right away is what localeconv() should return under such
> > conditions.
> 
> Ok, yes so this path sounds much more promising than to concur with
> all the different parties to find a free modification character, and
> agree on the semantics.
> 
> > > Would you be willing to write this up?  
> > 
> > What form would it need to be in?
> 
> At the end this should be an N-document to submit to WG14, but that is
> really at the end. Just one or two pages would be good to get perhaps
> some discussion going, first, and also make it clear what it would
> imply for and need from musl.
> 
> Do you think that a high-level implementation using _Thread_local (or
> tss calls) and setlocale would be doable, such that we could even
> provide a reference implementation for all POSIX systems that also
> implement some form of thread local variables?

It can't be done in terms of setlocale because setlocale is not
thread-safe or thread-local. It could be done in terms of POSIX
uselocale, but such an implementation would not be fail-safe -- it
needs to be able to allocate a locale_t object via duplocale, since
the uselocale API works with locale_t objects that describe the
value of *all* locale categories, rather than the categories being
individually settable on a per-thread basis (this is a design flaw in
the POSIX interfaces, and the historic xlocale ones they were based
on, IMO).

So such an implementation could be a pseudo-code/demo of the
functionality, but I think I'd want the proposed functionality to be
always-succeeds to discourage erroneous code that ignores the result
(resulting in wrong formatting/parsing, which is unsafe) or aborts the
program (eew).
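
As a rough illustration of the uselocale-based demo described above,
here is a minimal sketch in C. The names numeric_c_enter,
numeric_c_leave and format_double_c are hypothetical, made up for this
example; they are not an existing or proposed interface. It assumes a
POSIX.1-2008 system providing newlocale/duplocale/uselocale. Note the
two points where it can fail, which is exactly the property the real
interface should not have:

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: switch the calling thread to a copy of its
 * current locale with only LC_NUMERIC replaced by "C".
 * Returns 0 and stores the previous per-thread locale in *saved on
 * success; returns -1 and leaves the thread's locale unchanged on
 * failure. */
static int numeric_c_enter(locale_t *saved)
{
    /* Query the current per-thread locale without changing it.
     * This may be LC_GLOBAL_LOCALE, which duplocale accepts. */
    locale_t cur = uselocale((locale_t)0);

    /* Failure point 1: duplocale has to allocate a new object. */
    locale_t copy = duplocale(cur);
    if (copy == (locale_t)0)
        return -1;

    /* Failure point 2: newlocale may also fail. On failure the base
     * object is left intact, so it must be freed here. On success
     * the base is consumed and must not be used again. */
    locale_t mod = newlocale(LC_NUMERIC_MASK, "C", copy);
    if (mod == (locale_t)0) {
        freelocale(copy);
        return -1;
    }

    *saved = uselocale(mod);
    return 0;
}

/* Hypothetical helper: restore the locale saved by numeric_c_enter
 * and free the temporary object, which is no longer in use. */
static void numeric_c_leave(locale_t saved)
{
    locale_t mod = uselocale(saved);
    freelocale(mod);
}

/* Usage: format x with '.' as the radix character regardless of the
 * thread's LC_NUMERIC setting. Returns 0 on success, -1 on failure
 * (locale allocation failed, or the buffer was too small). */
static int format_double_c(char *buf, size_t n, double x)
{
    locale_t saved;
    if (numeric_c_enter(&saved) != 0)
        return -1;
    int r = snprintf(buf, n, "%f", x);
    numeric_c_leave(saved);
    return (r < 0 || (size_t)r >= n) ? -1 : 0;
}
```

Every caller of format_double_c has to deal with the -1 case, and
there is no good way to do so: it can't fall back to plain snprintf
without risking ',' in the output. A thread-local flag that always
succeeds would not force that choice on the caller.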

Rich
