musl - Re: [PATCH] handle ^ and $ in BRE subexpression start and end as anchors


Message-ID: <20161124144635.GU1555@brightrain.aerifal.cx>
Date: Thu, 24 Nov 2016 09:46:35 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] handle ^ and $ in BRE subexpression start and end
 as anchors

On Thu, Nov 24, 2016 at 01:44:49AM +0100, Szabolcs Nagy wrote:
> In BRE, ^ is an anchor at the beginning of an expression, optionally
> it may be an anchor at the beginning of a subexpression and must be
> treated as a literal otherwise.
> 
> Previously musl treated ^ in subexpressions as a literal, but at least
> glibc and GNU sed treat it as an anchor, and that's the more useful
> behaviour: it can always be escaped to get back the literal meaning.
> 
> Same for $ at the end of a subexpression.
> 
> Portable BRE should not rely on this, but there are sed commands in
> build scripts which do.
> 
> This changes the meaning of the BREs:
> 
> 	\(^a\)
> 	\(a\|^b\)
> 	\(a$\)
> 	\(a$\|b\)
> ---
> A slightly hackish solution, but it turns out ctx->re was not used for
> anything other than detecting whether ^ was at the start of a full BRE;
> it now marks the start of a subexpression instead.

Is the renaming of the member from re to start meant to prove that no
other users get broken by this? If so, I like that.

Rich
