Message-ID: <af-GWTs9GRMlOlt1@mail.gmail.com>
Date: Sat, 9 May 2026 21:09:13 +0200
From: Luca Kellermann <mailto.luca.kellermann@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: musl multi-level table format for binary locale images

On Fri, May 08, 2026 at 11:22:28PM -0400, Rich Felker wrote:
> [...]
>
> Table structure;
>
> be32 start;
> u8 shift;
> u8 scale;
> be16 size;
> union {
> u8 offsets8[size];
> be16 offsets16[size/2];
> be32 offsets32[size/4];
> }
> u8 data[];
>
> This represents a table of offsets for a range of integer key values
> beginning at start. Keys are processed as unsigned 32-bit values, but
> can represent a signed range crossing 0 as needed. Offsets may be
> encoded as unsigned 8-, 16-, or 32-bit values.
Does scale tell us the size of the offsets? Which values mean what?
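
For illustration, a minimal sketch of decoding the fixed 8-byte header quoted above (be32 start, u8 shift, u8 scale, be16 size). The names struct tbl_hdr, rd_be32, rd_be16 and tbl_read_hdr are hypothetical and not taken from musl; scale is read but left uninterpreted, since its meaning is exactly the open question here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the table header. */
struct tbl_hdr {
	uint32_t start;  /* first key covered by the table */
	uint8_t shift;   /* nonzero: entries lead to subtables */
	uint8_t scale;   /* meaning not clear from the description */
	uint16_t size;   /* size of the offsets area */
};

/* Read a big-endian 32-bit value from p. */
static uint32_t rd_be32(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
	     | (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/* Read a big-endian 16-bit value from p. */
static uint16_t rd_be16(const unsigned char *p)
{
	return (uint16_t)((unsigned)p[0] << 8 | p[1]);
}

/* Decode the header at p into *h; returns the number of bytes consumed. */
static size_t tbl_read_hdr(const unsigned char *p, struct tbl_hdr *h)
{
	h->start = rd_be32(p);
	h->shift = p[4];
	h->scale = p[5];
	h->size = rd_be16(p + 6);
	return 8;
}
```

The offsets array would begin at the returned byte offset; how wide its entries are depends on the answer to the question above.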
> [...]
>
> If shift is nonzero, the offset obtained at index (key-start)>>shift
> in the offsets array leads to a subtable of the same form that will
> take the remainder (key-state)&((1<<shift)-1) as its input; [...]
Is this a typo and you wanted to say (key-start)&((1<<shift)-1)? Or is
state something else?
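
To make the reading with (key-start) concrete, here is a small sketch of how a key would be split into an index for the offsets array and a remainder for the subtable. It assumes that "state" was meant to be "start" and that shift is less than 32; struct tbl_split and tbl_split_key are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical result of splitting a key at one table level. */
struct tbl_split {
	uint32_t index;  /* selects the entry in the offsets array */
	uint32_t rem;    /* passed to the subtable as its input */
};

/* Split key relative to start. ASSUMPTION: shift < 32. */
static struct tbl_split tbl_split_key(uint32_t key, uint32_t start,
                                      unsigned shift)
{
	/* Unsigned subtraction wraps modulo 2^32, so a start of
	 * (uint32_t)-1 covers a signed range crossing 0. */
	uint32_t rel = key - start;
	struct tbl_split s;
	s.index = rel >> shift;
	s.rem = rel & (((uint32_t)1 << shift) - 1);
	return s;
}
```

With shift = 0 the remainder is always 0 and the index is simply key-start, which matches the single-level examples further down.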
> [...]
>
> Path layout:
>
> localeconv/-1: binary data for the char fields of struct lconv, in the
> order they appear in the ISO C specification and in musl locale.h.
> [...]
>
> localeconv/0..9: string data for the first 10 fields of struct lconv,
> likewise in the order they appear in the specification and in musl.
> Items 2 and 7 consist of a pair of strings separated by a null
> terminator byte, [...]
The order of the fields of struct lconv in musl's locale.h doesn't
match any of the C standard drafts I checked (C99, C11, C17, C23).
It does however match the order in POSIX.1-2024 XBD 7.3.4 LC_NUMERIC
[1] and XBD 7.3.3.1 LC_MONETARY Category in the POSIX Locale [2].
It doesn't match the order in XBD 7.3.3 LC_MONETARY [3] / XSH 3
localeconv() [4] (int_n_cs_precedes is in a different position) or XBD
14 <locale.h> [5] (alphabetical order).
> [...]
>
> Examples of data encoding:
>
> langinfo/LC_TIME:
>
> start = 131072
> shift = 0
> scale = 0
> size = .....
> offsets8[] = {
> 1, 5, 9, 13, 17, 21, 25, 29, 36, ...
> }
> data[] = "Sun\0Mon\0Tue\0Wed\0Thu\0Fri\0Sat\0Sunday\0..."
>
> errors/1:
These look like strerror strings, so this should be errors/0, right?
> start = -1
> shift = 0
> scale = 0
> size = .....
> offsets16[] = {
> 1, 15, 36, 60, ...
> }
> data[] = "Unknown error\0"
> "No error information\0"
> "Operation not permitted\0"
> "No such file or directory\0"
> ...
Would both of the examples use scale = 0? Assuming scale says
something about the size of the offsets, they should differ.
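
As a cross-check of the errors example, the following sketch looks up strings using the four offsets and strings quoted above. It rests on an assumption inferred only from the example numbers: an offset o seems to refer to byte o-1 of data[] (1, 15, 36, 60 match the string lengths 14, 21, 24 that way). Whether offsets are really 1-based, or data[] has a leading byte that isn't shown, is not stated in the description. The names err_data, err_offsets, err_start and err_lookup are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Strings from the quoted example, separated by null bytes. */
static const char err_data[] =
	"Unknown error\0"
	"No error information\0"
	"Operation not permitted\0"
	"No such file or directory";

/* The first four offsets from the quoted example. */
static const uint16_t err_offsets[] = { 1, 15, 36, 60 };

/* start = -1, processed as an unsigned 32-bit value. */
static const uint32_t err_start = (uint32_t)-1;

/* Return the string for key, or NULL if key is out of range. */
static const char *err_lookup(uint32_t key)
{
	uint32_t idx = key - err_start;  /* key 0 maps to index 1 */
	if (idx >= sizeof err_offsets / sizeof *err_offsets)
		return NULL;
	/* ASSUMPTION: offsets are 1-based relative to data[]. */
	return err_data + err_offsets[idx] - 1;
}
```

Under that assumption, key -1 yields "Unknown error", key 0 yields "No error information" and key 1 yields "Operation not permitted".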

[1] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_04
[2] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_03_01
[3] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_03
[4] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/localeconv.html
[5] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/locale.h.html