Date: Sat, 6 Dec 2014 14:22:26 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Cc: alpine-devel@...ts.alpinelinux.org
Subject: Re: getopt_long permutation algorithm questions

On Sat, Dec 06, 2014 at 12:47:10PM -0500, Rich Felker wrote:
> On Thu, Dec 04, 2014 at 08:25:06AM +0200, Timo Teras wrote:
> > Not sure if there's some nasty corner case surprises, but I'd start
> > with that as it's the simple approach.
> 
> There is one nasty corner case I don't know how to deal with. Consider
> an option string of "ab:" with the command line:
> 
> ../a.out foo -ab
> 
> The "desired" result is that -a is accepted successfully and -b yields
> an error due to missing argument. But permuting "-ab" before "foo"
> results in "foo" being treated as the argument to -b.
> 
> Fortunately this issue only matters for erroneous (syntactically
> invalid) command lines, so we can probably ignore it by just refusing
> to permute them (thus simply treating "foo" and "-ab" as non-option
> arguments).
> 
> Empirically (I haven't RTFS'd and don't intend to) glibc has an
> alternate approach where permutation happens after the argv[]
> component has been consumed rather than before. When it sees a
> non-option argument, it saves the location, jumps forward and
> processes the next option-containing argument, and only moves it back
> to the saved location when advancing optind. This exposes nonsensical
> values of optind to the application; after processing the above and
> getting the error for -b, optind points to the null pointer slot at
> the end of argv[], and only jumps back to point to the permuted foo
> when getopt[_long] is called _again_ after the error.
> 
> I haven't yet tested what the BSD version does.

The BSD code Alpine is using right now seems to do the same thing as
glibc does empirically, and the implementation matches what I expected
-- saving state and applying the permutation later. My leaning is
still to go with the method I proposed that avoids internal state and
inconsistent values of optind during parsing, at the expense of not
permuting syntactically invalid command lines. Any opinions?
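To make the proposal concrete, here is a rough sketch of permuting before consumption while refusing to permute when the option would be missing its argument. It is not musl's code; the names option_span and permute_next_option are hypothetical, only short options are handled, and the option string is assumed to be a plain one like "ab:" with no leading '+', '-' or ':'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of argv[] elements the short-option cluster
 * at argv[i] would consume (1 or 2), or 0 if an option in the cluster
 * requires an argument and none is available. */
static int option_span(const char *optstring, char *const argv[], int i)
{
    const char *p;

    for (p = argv[i] + 1; *p; p++) {
        const char *spec = (*p == ':') ? NULL : strchr(optstring, *p);
        if (!spec || spec[1] != ':')
            continue;                /* flag or unknown option: keep going */
        if (p[1] || spec[2] == ':')
            return 1;                /* attached (-bVALUE) or optional arg */
        return argv[i + 1] ? 2 : 0;  /* separate argument, if there is one */
    }
    return 1;
}

/* Hypothetical helper: if argv[ind] is a non-option, move the next option
 * cluster (plus its separate argument, if any) in front of it.
 * Returns 1 if argv[] was permuted, 0 if it was left untouched. */
static int permute_next_option(char **argv, int ind, const char *optstring)
{
    int j, k, span;

    assert(argv && optstring);

    /* Find the next element that looks like an option ("-" alone does not). */
    for (j = ind; argv[j]; j++)
        if (argv[j][0] == '-' && argv[j][1])
            break;
    if (!argv[j] || j == ind)
        return 0;                    /* nothing to move */

    span = option_span(optstring, argv, j);
    if (!span)
        return 0;                    /* syntactically invalid: do not permute */

    /* Rotate argv[ind .. j+span-1] right by one element, span times. */
    for (k = 0; k < span; k++) {
        char *last = argv[j + span - 1];
        memmove(argv + ind + 1, argv + ind,
                (size_t)(j + span - 1 - ind) * sizeof *argv);
        argv[ind] = last;
    }
    return 1;
}
```

With "ab:" and ind = 1, the vector { "a.out", "foo", "-ab", NULL } is left alone (return value 0), so "foo" and "-ab" both stay non-option arguments. The vector { "a.out", "foo", "-ab", "bar", NULL } is reordered to a.out -ab bar foo (return value 1). No state has to survive between calls, and optind never needs to point anywhere other than the next unprocessed element.
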

BTW just a heads-up for Alpine: the BSD code that's being patched in
right now contains namespace violations and is causing code linked to
their modified libc to contain copy-reloc references to optreset
rather than __optreset, if it uses optreset. I think this will be safe
to fix (old binaries will continue to work once it's fixed), but it's
not forwards compatible. Binaries built with the correct symbol name
(again, only programs which use optreset) will not work on Alpine's
old libc.so, so upgrading libc will be mandatory to use new binaries.
This also applies to binaries built against musl on non-Alpine systems
and carried over to Alpine.

Rich
