Date: Wed, 16 Jan 2013 11:57:01 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: REG_STARTEND (regex)

On Wed, Jan 16, 2013 at 09:42:01AM -0600, Rob Landley wrote:
> On 01/15/2013 12:45:13 PM, Rich Felker wrote:
> >> Does anyone have suggestions on how this can be modified to be able to
> >> use it with musl.
> >
> >If the start position is 0, which it seems to be here, there's nothing
> >to be done but removing REG_STARTEND. All it's doing is allowing you
> >to process data with embedded nul bytes, which is not required by the
> >standard or useful for any meaningful use of sed.
> 
> Actually people use sed to modify embedded strings in binaries.
> (Strange but true.)
> 
> >Nobody will notice
> >the difference with it missing unless they're trying to perform
> >hideous hacks like patching binary files with sed...
> 
> Which people do.
> 
> However, mostly this involves embedded nuls in the data being
> processed, not embedded nuls in the pattern space. So it's merely
> creepy rather than outright pathological. And the caller can wrap
> the regex library to do its own strlen stuff and restart right after
> the embedded NUL if there's data left. (Which was on the todo list
> for busybox sed back when Bruce happened, possibly Denys has
> implemented it since.)

If sed wants to support this without providing its own
embedded-NUL-capable regex library, it should just treat NUL as a kind
of boundary/line-break so that the pattern space never ends up
containing NUL bytes. However, there are still a good many other
portability issues with passing binary files to sed, even if you
ignore the fact that POSIX sed specifically requires a text file as
input, so I think it's rather misguided to cater to these uses anyway.
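
To make the boundary idea concrete, here is a rough sketch (not code
from any actual sed) of scanning a buffer one NUL-delimited segment at
a time using only the standard regexec(), with no REG_STARTEND. The
helper name count_matching_segments is made up for illustration, and
it assumes the caller keeps a terminating NUL at buf[len]:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treat each NUL byte in buf[0..len) as a boundary
 * and count the segments that match re. Assumes buf[len] == '\0', so
 * every segment handed to regexec() is an ordinary C string. */
static size_t count_matching_segments(const regex_t *re,
                                      const char *buf, size_t len)
{
    size_t count = 0;
    size_t pos = 0;

    assert(buf[len] == '\0');

    while (pos < len) {
        const char *seg = buf + pos;

        /* regexec() stops at the first NUL, i.e. at the segment end. */
        if (regexec(re, seg, 0, NULL, 0) == 0)
            count++;

        /* Resume right after the NUL that ended this segment. */
        pos += strlen(seg) + 1;
    }
    return count;
}
```

A real sed would also have to decide what to write back out at each
boundary; the sketch only shows that the regex code never needs to see
a NUL byte.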

Rich
