musl - Re: [PATCH] Byte-based C locale, draft 1

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150607011738.GB17573@brightrain.aerifal.cx>
Date: Sat, 6 Jun 2015 21:17:38 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] Byte-based C locale, draft 1

On Sat, Jun 06, 2015 at 05:40:07PM -0400, Rich Felker wrote:
> Before applying this I should probably overhaul fnmatch.c again. I
> believe it has some hard-coded UTF-8 processing code in it for the
> useless "check the tail before middle" step that I've been wanting to
> eliminate. Alternatively I could just apply a quick fix to make it
> work right without any invasive changes.
> 
> Other than possible weird cases with fnmatch (which are largely
> harmless but might inhibit matching high bytes in non-UTF-8 mode),
> this code should be ready for testing. I'd appreciate some feedback
> from anyone interested in the feature.

On further review, the special last-component handling fnmatch does is
not wrong, just wrongly ordered. It should take place after the "sea
of stars" component is processsed, rather than before, to avoid O(n)
operation (essentially strlen) when an early failure could be
detected. But since only the ordering is wrong, I think fixing it is
orthogonal to the bytelocale work, and a single-line patch to add a
case for MB_CUR_MAX==1 should just be added to this proposed patch
(see attached).

Rich

View attachment "bytelocale_v1_fnmatch.diff" of type "text/plain" (718 bytes)

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.