Message-ID: <af-GWTs9GRMlOlt1@mail.gmail.com>
Date: Sat, 9 May 2026 21:09:13 +0200
From: Luca Kellermann <mailto.luca.kellermann@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: musl multi-level table format for binary locale images

On Fri, May 08, 2026 at 11:22:28PM -0400, Rich Felker wrote:
> [...]
> 
> Table structure;
> 
> 	be32 start;
> 	u8 shift;
> 	u8 scale;
> 	be16 size;
> 	union {
> 		u8 offsets8[size];
> 		be16 offsets16[size/2];
> 		be32 offsets32[size/4];
> 	}
> 	u8 data[];
> 
> This represents a table of offsets for a range of integer key values
> beginning at start. Keys are processed as unsigned 32-bit values, but
> can represent a signed range crossing 0 as needed. Offsets may be
> encoded as unsigned 8-, 16-, or 32-bit values.

Does scale tell us the size of the offsets? Which values mean what?
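
For illustration, a minimal C sketch of reading the fixed 8-byte header
from the quoted structure. It assumes the fields are packed in the
listed order with no padding. The names table_header, read_be32,
read_be16 and parse_table_header are hypothetical, and scale is stored
uninterpreted because its meaning is exactly what is being asked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed header fields, in the order given in the quoted structure. */
struct table_header {
	uint32_t start; /* first key covered by the table */
	uint8_t shift;  /* nonzero: offsets lead to subtables */
	uint8_t scale;  /* meaning not stated in the quoted text; kept as-is */
	uint16_t size;  /* size field (be16 in the image) */
};

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
	     | (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t read_be16(const unsigned char *p)
{
	return (uint16_t)(p[0] << 8 | p[1]);
}

/* ASSUMPTION: packed layout, 8 bytes total, no padding.
 * Returns 0 on success, -1 if fewer than 8 bytes are available. */
static int parse_table_header(const unsigned char *buf, size_t len,
                              struct table_header *h)
{
	assert(buf && h);
	if (len < 8) return -1;
	h->start = read_be32(buf);
	h->shift = buf[4];
	h->scale = buf[5];
	h->size = read_be16(buf + 6);
	return 0;
}
```

The offsets array would start right after these 8 bytes; how wide each
entry is remains open until the meaning of scale is settled.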

> [...]
> 
> If shift is nonzero, the offset obtained at index (key-start)>>shift
> in the offsets array leads to a subtable of the same form that will
> take the remainder (key-state)&((1<<shift)-1) as its input; [...]

Is (key-state) a typo for (key-start), i.e. did you mean
(key-start)&((1<<shift)-1)? Or is state something else?
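
Under the reading that state is a typo for start, the split of a key
into an offsets index and a subtable key could be sketched as below.
The names key_split and split_key are hypothetical; only the two
expressions come from the quoted text.

```c
#include <assert.h>
#include <stdint.h>

struct key_split {
	uint32_t index;     /* position in the offsets array */
	uint32_t remainder; /* key handed on to the subtable */
};

/* ASSUMPTION: "key-state" in the quoted text means "key-start". */
static struct key_split split_key(uint32_t key, uint32_t start, unsigned shift)
{
	struct key_split s;
	uint32_t rel;

	assert(shift < 32); /* shifting a 32-bit value by >= 32 is undefined */
	rel = key - start;  /* unsigned subtraction wraps modulo 2^32 */
	s.index = rel >> shift;
	s.remainder = rel & ((UINT32_C(1) << shift) - 1);
	return s;
}
```

For example, with start = 0x1000 and shift = 4, key 0x1234 gives index
0x23 and remainder 0x4. With shift = 0 the remainder is always 0.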

> [...]
> 
> Path layout:
> 
> localeconv/-1: binary data for the char fields of struct lconv, in the
> order they appear in the ISO C specification and in musl locale.h.
> [...]
> 
> localeconv/0..9: string data for the first 10 fields of struct lconv,
> likewise in the order they appear in the specification and in musl.
> Items 2 and 7 consist of a pair of strings separated by a null
> terminator byte, [...]

The order of the fields of struct lconv in musl's locale.h doesn't
match any of the C standard drafts I checked (C99, C11, C17, C23).

It does however match the order in POSIX.1-2024 XBD 7.3.4 LC_NUMERIC
[1] and XBD 7.3.3.1 LC_MONETARY Category in the POSIX Locale [2].

It doesn't match the order in XBD 7.3.3 LC_MONETARY [3] / XSH 3
localeconv() [4] (int_n_cs_precedes is in a different position) or XBD
14 <locale.h> [5] (alphabetical order).
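
To check the member order on a given libc, one option is to sort the
member names by offsetof() and compare the result with the
specification text. This is only a sketch: the helper names are
hypothetical, the member list is given alphabetically so the code
itself asserts no particular order, and it assumes the C99 int_*
members are visible under the default compiler settings.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct field { const char *name; size_t off; };
#define F(m) { #m, offsetof(struct lconv, m) }

/* All 24 members of struct lconv, listed alphabetically. */
static struct field lconv_fields[] = {
	F(currency_symbol), F(decimal_point), F(frac_digits), F(grouping),
	F(int_curr_symbol), F(int_frac_digits), F(int_n_cs_precedes),
	F(int_n_sep_by_space), F(int_n_sign_posn), F(int_p_cs_precedes),
	F(int_p_sep_by_space), F(int_p_sign_posn), F(mon_decimal_point),
	F(mon_grouping), F(mon_thousands_sep), F(n_cs_precedes),
	F(n_sep_by_space), F(n_sign_posn), F(negative_sign),
	F(p_cs_precedes), F(p_sep_by_space), F(p_sign_posn),
	F(positive_sign), F(thousands_sep),
};

static int by_offset(const void *a, const void *b)
{
	const struct field *x = a, *y = b;
	return (x->off > y->off) - (x->off < y->off);
}

/* Sort the list into in-memory order; returns the number of members. */
static size_t sort_lconv_fields(void)
{
	size_t n = sizeof lconv_fields / sizeof lconv_fields[0];
	assert(n == 24);
	qsort(lconv_fields, n, sizeof lconv_fields[0], by_offset);
	return n;
}

/* Print the members in the order the libc declares them. */
static void print_lconv_order(void)
{
	size_t i, n = sort_lconv_fields();
	for (i = 0; i < n; i++)
		printf("%2zu %s\n", i, lconv_fields[i].name);
}
```

Calling print_lconv_order() from main() prints one member per line in
declaration order for whatever libc the program was built against.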

> [...]
> 
> Examples of data encoding:
> 
> langinfo/LC_TIME:
> 
> 	start = 131072
> 	shift = 0
> 	scale = 0
> 	size = .....
> 	offsets8[] = {
> 		1, 5, 9, 13, 17, 21, 25, 29, 36, ...
> 	}
> 	data[] = "Sun\0Mon\0Tue\0Wed\0Thu\0Fri\0Sat\0Sunday\0..."
> 
> errors/1:

These look like strerror strings, so this should be errors/0, right?

> 	start = -1
> 	shift = 0
> 	scale = 0
> 	size = .....
> 	offsets16[] = {
> 		1, 15, 36, 60, ...
> 	}
> 	data[] = "Unknown error\0"
> 	         "No error information\0"
> 	         "Operation not permitted\0"
> 	         "No such file or directory\0"
> 	         ...

Would both of the examples really use scale = 0? The first uses
offsets8 and the second offsets16, so assuming scale says something
about the size of the offsets, the two values should differ.
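
For reference, here is how the errors example would decode under two
assumptions that are inferred from the example values and not stated
in the quoted text: an offset n refers to data[n-1], and an offset of
0 means "no entry". The function name leaf_lookup16 is hypothetical,
and the offsets are taken as already converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leaf-table lookup (shift == 0) with 16-bit offsets.
 * ASSUMPTION: offsets are 1-based into data, and 0 marks a missing entry. */
static const char *leaf_lookup16(uint32_t start, const uint16_t *offsets,
                                 size_t count, const char *data, uint32_t key)
{
	uint32_t idx;

	assert(offsets && data);
	idx = key - start; /* wraps modulo 2^32: start = -1 maps key 0 to index 1 */
	if (idx >= count || offsets[idx] == 0) return NULL;
	return data + offsets[idx] - 1;
}

/* The values shown in the quoted errors example, truncated where it is. */
static const uint16_t example_offsets[] = { 1, 15, 36, 60 };
static const char example_data[] =
	"Unknown error\0"
	"No error information\0"
	"Operation not permitted\0"
	"No such file or directory";
```

With start = (uint32_t)-1, key -1 yields "Unknown error", key 0 yields
"No error information", and key 1 yields "Operation not permitted",
which fits the description of a signed range crossing 0.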

[1] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_04
[2] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_03_01
[3] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/V1_chap07.html#tag_07_03_03
[4] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/localeconv.html
[5] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/locale.h.html
