Message-ID: <20251002023414.GG1827@brightrain.aerifal.cx>
Date: Wed, 1 Oct 2025 22:34:14 -0400
From: Rich Felker <dalias@...c.org>
To: Pablo Correa Gomez <pabloyoyoista@...tmarketos.org>
Cc: musl@...ts.openwall.com
Subject: Re: Selecting locale source format

On Wed, Oct 01, 2025 at 03:55:59PM +0200, Pablo Correa Gomez wrote:
> We have now received a few replies from translators, and the most
> notable point raised is how to deal with natural text whose
> translations might change depending on context. Both plural forms and
> declensions were brought up.
> 
> After discussing this a bit with Rich, it seems that such things will
> not be an issue for strings related to the libc API, which are the
> main concern of the work we are doing now. However, there are
> implementation-dependent strings in libc, like dynamic linker messages,
> which could potentially be added in the future. Still, since we are
> setting the file format, it would be important to make sure that
> whatever we come up with now is flexible enough to not block future
> development. Any thoughts?

To summarize that discussion, most of the translatable strings in
musl/libc are fixed-form messages returned to the caller to use as it
sees fit. These inherently don't have any plural or other contextual
forms because no context is available to us.

What's left are strings that are not themselves part of any standard
interface surface but where we're reporting things directly to the
user (something we mostly choose very intentionally not to do, with
the main exception being dynamic linking failure conditions at
startup) or where the interface allows us to construct a more detailed
and contextualized message (presently this is just dlerror).

The first attempt at supporting localized text left the dynamic linker
(startup and dlopen) strings untranslatable via gettext for essentially
this reason: I did not want safety/correctness to depend on having
type-matched format strings in the locale definition file. My intent
then was that, if/when we make
them translatable, we adjust the messages to be less
"natural-language" in form and instead consist of
separately-translatable fields, something like:

	Relocation error: foo.so: symbol: [cause of failure]

I'm not entirely committed to this if other folks disagree, but I
think it both makes translation cleaner and makes it easier to
understand bug reports with messages in a language you might not read.
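
A rough sketch of what the field-based layout could look like in C.
Everything here is hypothetical and for illustration only: the
msg_id enum, the tr() lookup, and format_reloc_error() are not musl
internals, and the cause text "symbol not found" is an invented
example. The point is that the only format string is a fixed one
compiled into the code, while translated text is only ever passed as
%s data.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical message keys; not part of musl. */
enum msg_id { MSG_RELOC_ERROR, MSG_SYM_NOT_FOUND, MSG_COUNT };

/* Built-in (untranslated) text for each key. */
static const char *const msg_c[MSG_COUNT] = {
	"Relocation error",
	"symbol not found",
};

/* Stand-in for a locale lookup: use the catalog entry if present,
 * otherwise fall back to the built-in text. */
static const char *tr(const char *const *catalog, enum msg_id id)
{
	if (catalog && catalog[id]) return catalog[id];
	return msg_c[id];
}

/* The format string is a constant in the code. Translated fields are
 * inserted as plain strings, so a bad translation can produce odd
 * text but cannot cause a type mismatch in the printf call. */
static int format_reloc_error(char *buf, size_t n,
	const char *const *catalog,
	const char *lib, const char *sym, enum msg_id cause)
{
	return snprintf(buf, n, "%s: %s: %s: %s",
		tr(catalog, MSG_RELOC_ERROR), lib, sym, tr(catalog, cause));
}
```

Since the library and symbol names are never translated and the field
order is fixed, a report in an unfamiliar language can still be matched
field by field against the untranslated message.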

If we don't do it this way, I'd want to have an internal interface for
validating that format strings are type-matched before using them. In
that case, if variants are needed, we'd have to enumerate them and
assign each one an integer key.
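
A minimal sketch of what such a validation step might look like,
under loudly stated assumptions: fmt_signature(), fmt_compatible() and
pick_format() are hypothetical names, not an existing or proposed musl
interface. The check is deliberately conservative. It accepts only a
small set of conversions, rejects positional arguments ("%1$s") and
"*" widths, and requires the translated string to have exactly the
same conversions in the same order as the built-in one, so it would
not let translators reorder arguments.

```c
#include <stddef.h>
#include <string.h>

/* Write the ordered list of conversions in fmt (length modifiers plus
 * conversion character, each followed by ',') into sig.
 * Returns 0 on success, -1 if fmt uses anything unsupported. */
static int fmt_signature(const char *fmt, char *sig, size_t n)
{
	size_t k = 0;
	if (!n) return -1;
	for (; *fmt; fmt++) {
		if (*fmt != '%') continue;
		fmt++;
		if (*fmt == '%') continue;      /* literal percent sign */
		/* Skip flags, width and precision. '*' and '$' are not in
		 * this set, so they fall through and get rejected below. */
		while (*fmt && strchr("-+ #0123456789.", *fmt)) fmt++;
		/* Length modifiers are part of the argument type. */
		while (*fmt && strchr("hljztL", *fmt)) {
			if (k + 1 >= n) return -1;
			sig[k++] = *fmt++;
		}
		/* Only a small whitelist of conversions is accepted. */
		if (!*fmt || !strchr("diouxXcsp", *fmt)) return -1;
		if (k + 2 >= n) return -1;
		sig[k++] = *fmt;
		sig[k++] = ',';
	}
	sig[k] = 0;
	return 0;
}

/* Nonzero only if both strings parse and their signatures are equal. */
static int fmt_compatible(const char *ref, const char *cand)
{
	char a[64], b[64];
	if (fmt_signature(ref, a, sizeof a) || fmt_signature(cand, b, sizeof b))
		return 0;
	return !strcmp(a, b);
}

/* Select a translated variant by integer key, falling back to the
 * built-in format if the key is out of range, the entry is missing,
 * or the entry fails validation. */
static const char *pick_format(const char *builtin,
	const char *const *variants, size_t nvariants, size_t key)
{
	if (variants && key < nvariants && variants[key]
	    && fmt_compatible(builtin, variants[key]))
		return variants[key];
	return builtin;
}
```

Falling back to the built-in string on any mismatch means a malformed
locale file degrades to an untranslated message instead of reaching
printf with the wrong argument types.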

Rich
