Date: Wed, 25 Nov 2020 08:40:02 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: realpath without procfs -- should be ready for inclusion

On 2020-11-24 23:31, Rich Felker wrote:
> On Mon, Nov 23, 2020 at 11:26:46PM -0500, Rich Felker wrote:
>> On Tue, Nov 24, 2020 at 06:39:59AM +0300, Alexey Izbyshev wrote:
>> > * ENOTDIR should be returned if the last component is not a
>> > directory and the path has one or more trailing slashes
>> 
>> Yes, that's precisely what I've been working on the past couple hours.
>> I think you missed it, but .. will also erase a path component that's not
>> a dir (e.g. /dev/null/.. -> /dev) and these are both instances of a
>> common problem. I thought use of readlink covered all the ENOTDIR
>> cases but it doesn't when the next component isn't covered by readlink
>> or isn't present at all.
>> 
>> It's trivial to fix with a check after each component but that doubles
>> the number of syscalls and mostly isn't necessary. I have a reworked
>> draft to fix the problem by advancing over /(/|./|.$)* rather than just
>> /+ after each component, so that we can lookahead and do an extra
>> readlink in the cases that need it.
> 
> While this worked, it ended up being the wrong thing to do, making two
> places where readlink is called, one of them with a dummy buffer. The
> right way to do it is rework the flow so that the existing readlink is
> "naturally" hit where needed. This amounts to:
> 
> - Letting .. processing that cancels path components go through the
>   same code path as new path components, rather than handling it
>   early, and just skipping the actual readlink if we already know we
>   have a dir.
> 
> - Also treating a zero-length final component as something that goes
>   through the readlink code path.
> 
> There was a fair amount of reorganizing needed to make this work out,
> but the end result is clean and non-redundant and code size is almost
> the same as before with the missing-ENOTDIR bugs.
> 
> Speaking of code size, on 32-bit archs the proposed explicit realpath
> is roughly the same size as stat+fstat+fstatat (a little over 1k on
> i386), which were needed to implement the old lazy realpath in terms
> of procfs. So for minimal static linking, resulting code size may be
> the same or smaller. (Of course it's larger if stat is already linked for
> other reasons.)
> 
> New draft attached. It's possible that there are regressions since I
> haven't put together an automated testset. I'm not sure if I'll try to
> merge it in this release cycle still or not; that probably depends on
> how easy or difficult automating these tests ends up being.
> 
The new draft looks good to me. I've also done some basic manual testing 
(not covering all proposed cases) and haven't found any issues.
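
For illustration, here is a minimal sketch of the kind of check meant by
the ENOTDIR cases quoted above (a trailing slash, or a .. after a
component that is not a directory). It is not part of the draft or of any
test suite; the helper names realpath_errno and check_enotdir_cases and
the /tmp template are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the draft): resolve base+suffix with
 * realpath() and return 0 on success or the resulting errno on failure. */
static int realpath_errno(const char *base, const char *suffix)
{
    char path[PATH_MAX], resolved[PATH_MAX];

    assert(base && suffix);
    int n = snprintf(path, sizeof path, "%s%s", base, suffix);
    if (n < 0 || (size_t)n >= sizeof path)
        return ENAMETOOLONG;
    errno = 0;
    return realpath(path, resolved) ? 0 : errno;
}

/* Creates a temporary regular file and counts how many of the cases
 * discussed in the thread behave unexpectedly. Returns -1 if the
 * temporary file could not be created, otherwise the mismatch count. */
static int check_enotdir_cases(void)
{
    char file[] = "/tmp/realpath_test_XXXXXX";
    int fd = mkstemp(file);
    if (fd < 0)
        return -1;
    close(fd);

    int mismatches = 0;
    mismatches += realpath_errno(file, "") != 0;          /* plain file resolves */
    mismatches += realpath_errno(file, "/") != ENOTDIR;   /* trailing slash */
    mismatches += realpath_errno(file, "/..") != ENOTDIR; /* .. after a non-dir */

    unlink(file);
    return mismatches;
}
```

A return value of 0 from check_enotdir_cases() means all three cases
behaved as described in the thread on the libc under test.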

I don't see why the size of stack has to be PATH_MAX+1 though. To 
address the issue with symlink targets of PATH_MAX-1 length, it seems 
sufficient to just do the following:

-               ssize_t k = readlink(output, stack, p);
-               if (k==p) goto toolong;
+               ssize_t k = readlink(output, stack, p+1);
+               if (k==p+1) goto toolong;

Since p is never past the end of the stack, there is no harm in allowing 
k == p.
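
The general readlink behavior behind this suggestion can be sketched as
follows. readlink does not report truncation; it returns the number of
bytes it stored, so a return value equal to the buffer size is ambiguous,
and asking for one extra byte removes the ambiguity for a target that
exactly fits. The helper name readlink_with_bufsize and the /tmp template
are hypothetical and only serve the demonstration; this is not code from
the draft.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a symlink pointing at `target` inside a
 * fresh temporary directory, call readlink() on it with a buffer of
 * `bufsize` bytes, clean up, and return readlink's result
 * (-1 on any setup failure). */
static ssize_t readlink_with_bufsize(const char *target, size_t bufsize)
{
    char dir[] = "/tmp/rl_XXXXXX";
    char link[sizeof dir + 8];
    char buf[64];
    ssize_t k = -1;

    assert(target);
    if (bufsize > sizeof buf || !mkdtemp(dir))
        return -1;
    snprintf(link, sizeof link, "%s/link", dir);
    if (symlink(target, link) == 0) {
        /* readlink silently truncates to bufsize bytes and does not
         * append a terminating NUL. */
        k = readlink(link, buf, bufsize);
        unlink(link);
    }
    rmdir(dir);
    return k;
}
```

With the 4-byte target "abcd", a bufsize of 3 yields 3 (truncated) and a
bufsize of 4 yields 4; both equal the buffer size, so the caller cannot
tell them apart. A bufsize of 5 yields 4, which is unambiguous. This
corresponds to passing p+1 and treating only k == p+1 as too long.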

Alexey
