Message-Id: <B7E7DF17-45AB-4858-BEE2-BE0B0A1BF82B@aevum.de>
Date: Sun, 29 Jun 2025 18:25:15 +0200
From: Nick Wellnhofer <wellnhofer@...um.de>
To: musl@...ts.openwall.com
Subject: Re: Paintcans for reverse iterating strings



> On Jun 29, 2025, at 18:03, Rich Felker <dalias@...c.org> wrote:
> 
> On Sun, Jun 29, 2025 at 05:39:14PM +0200, Nick Wellnhofer wrote:
>> On Jun 29, 2025, at 02:08, Rich Felker <dalias@...c.org> wrote:
>>> One thing we're going to need for LC_COLLATE in locales where
>>> second-level weights are applied in reverse order (diacritic marks
>>> later in the string weigh more than earlier ones) is the ability to
>>> traverse (& live transform to NFD) the input string in reverse.
>> 
>> Assuming the context is strcoll and we're comparing two strings,
>> wouldn't it be possible to compare the strings in normal, forward
>> direction but instead of stopping at the first difference, comparing
>> all collation elements and returning the last difference (if any)?
> 
> I believe you can do something like this for strcoll. Note that,
> normally, you don't even get to second level weights when using
> strcoll.
> 
> Where you can't do it is strxfrm (transforming into a byte sequence
> that can be byte-by-byte compared).

For strxfrm, I assume you have to perform Step 3 of the UCA main algorithm or something equivalent: https://www.unicode.org/reports/tr10/#Step_3

So you just have to reverse the order of the second-level weights afterwards, which seems trivial. Am I missing something?
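
As a rough sketch of that idea (not musl code): assume the input has already been turned into an array of collation elements, and only two levels matter. The names `struct ce` and `make_key_backward_l2` are made up for illustration, and the 16-bit weights and the 0 level separator are assumptions as well. Zero weights are skipped, as in Step 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-level collation element; real tables carry more levels. */
struct ce {
    uint16_t primary;
    uint16_t secondary;
};

/*
 * Form a two-level sort key for ces[0..n-1]:
 *   nonzero primary weights in forward order,
 *   a 0 level separator,
 *   nonzero secondary weights in reverse order.
 * At most cap weights are stored in out. The return value is the length
 * of the full key, so a result larger than cap means it was truncated.
 */
size_t make_key_backward_l2(const struct ce *ces, size_t n,
                            uint16_t *out, size_t cap)
{
    size_t len = 0;

    assert(out || cap == 0);

    /* Level 1: forward. */
    for (size_t i = 0; i < n; i++) {
        if (ces[i].primary) {
            if (len < cap) out[len] = ces[i].primary;
            len++;
        }
    }

    /* Level separator. */
    if (len < cap) out[len] = 0;
    len++;

    /* Level 2: walk the same array backwards instead of forwards. */
    for (size_t i = n; i-- > 0; ) {
        if (ces[i].secondary) {
            if (len < cap) out[len] = ces[i].secondary;
            len++;
        }
    }

    return len;
}
```

This only covers the case where the whole collation element array is available in memory; it does not address producing the elements from the string itself in reverse.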

Nick
