Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 18 Jan 2017 19:51:44 +0000
From: Bernardo Pascoal Figueiredo
 <bernardopascoalfigueiredo@...nico.ulisboa.pt>
To: musl@...ts.openwall.com
Subject: Re: Re: Implementation of GLOB_TILDE

On 2017-01-17 19:39, Rich Felker wrote:
> On Tue, Jan 17, 2017 at 03:49:03PM +0000, Bernardo Pascoal Figueiredo 
> wrote:
>> >I have a few questions though:
>> > * How do I contribute to musl? Should I just send patches to this
>> >mailing list
> 
> This is the preferred way, yes.
> 
>> > * I defined GLOB_TILDE as 0x100, but I think this won't work on
>> >architectures
>> >   that have sizeof(int) == 2, as the flags argument in glob is an int.
> 
> That's not an issue. (1) POSIX requires >=32bit int; musl is even more
> restrictive. (2) 0xff is 8 bits, not 16. Even ISO C requires 0x100 to
> fit in int. You do need to match whatever value glibc uses, though; I
> haven't checked that.

They use 1 << 12 which is 0x1000. I fixed this in the patch below.
Does this really matter? Is musl supposed to be binary compatible with 
glibc?
OpenBSD's libc uses 0x800 for example.

> 
>> > * I think it's best to define GLOB_TILDE in glob.h inside a '#if
>> >   defined(_GNU_SOURCE) || defined(_BSD_SOURCE)' what do you think?
>> >
>> > * I had to copy strlcat and strlcpy to glob.c so I could use
>> >them. I had to do
>> >   this because musl isn't compile as _GNU_SOURCE or _BSD_SOURCE
>> >so string.h
>> >   doesn't expose these functions. How should I fix this?
> 
> Just don't use them. The same thing can be achieved much better with
> portable functions strnlen and memcpy.
> 
> Note that even if you added #define for _GNU_SOURCE to this file, you
> couldn't reference these functions because then a function in the
> standard namespace would depend on symbols in nonstandard namespace.
> 

What I was thinking was to have private musl implementations of strlcat 
and
strlcpy and have the public strlcat and strlcpy call these private ones.
That way it'd be possible to use strlcat and strlcpy within musl.
I'd prefer that to strnlen and memcpy because that's what strlcat and 
strlcpy
already do, so I'd be duplicating it's functionality. This is way more 
error
prone.

>> diff --git a/src/regex/glob.c b/src/regex/glob.c
>> index 5b6ff124..f40da380 100644
>> --- a/src/regex/glob.c
>> +++ b/src/regex/glob.c
>> @@ -8,6 +8,9 @@
>>  #include <errno.h>
>>  #include <stddef.h>
>>  #include "libc.h"
>> +#include <stdbool.h>
> 
> Just use int for boolean values; bool is not idiomatic in musl.

Fixed

> 
>> +/*"~" or "~/(...)" case*/
>> +static bool expand_tilde_cur_user(const char *pat_after_tilde, char 
>> *new_pat, size_t new_pat_size)
>> +{
>> +    char *home;
>> +    struct passwd pw_store, *pw_result;
>> +    char pw_buf[1024];
>> +
>> +    /*FIXME: add check for issetugid as in libc of openbsd?*/
>> +    home = getenv("HOME");
>> +    if(home == NULL) {
>> +        getpwuid_r(getuid(), &pw_store, pw_buf, sizeof(pw_buf), 
>> &pw_result);
>> +        if(pw_result == NULL) {
>> +            return false;
>> +        }
>> +        home = pw_store.pw_dir;
>> +    }
>> +
>> +    return glob_strlcpy(new_pat, home, new_pat_size) < new_pat_size
>> +        && glob_strlcat(new_pat, pat_after_tilde, new_pat_size) < 
>> new_pat_size;
>> +}
>> +
>> +/* "~user/(...) case*/
>> +static bool expand_tilde_named_user(const char *pat_after_tilde, char 
>> *new_pat, size_t new_pat_size)
>> +{
>> +    struct passwd pw_store, *pw_result;
>> +    char pw_buf[1024], username[1024];
>> +    const char *slash_pos = strchr(pat_after_tilde, '/');
>> +    if(slash_pos == NULL) {
>> +        return false;
>> +    }
>> +
>> +    ptrdiff_t pat_username_size = slash_pos - pat_after_tilde;
>> +    if(pat_username_size <= 0 || pat_username_size >= 
>> sizeof(username)) {
>> +        return false;
>> +    }
>> +    strncpy(username, pat_after_tilde, pat_username_size);
>> +    username[pat_username_size] = '\0';
>> +
>> +    getpwnam_r(username, &pw_store, pw_buf, sizeof(pw_buf), 
>> &pw_result);
>> +    if (pw_result == NULL)
>> +        return false;
>> +
>> +    return glob_strlcpy(new_pat, pw_store.pw_dir, new_pat_size) < 
>> new_pat_size
>> +        && glob_strlcat(new_pat, slash_pos, new_pat_size) < 
>> new_pat_size;
>> +}
> 
> It should be possible to reduce this code considerably, and to avoid
> some of the large stack buffers. Certainly username[] does not need to
> be 1k; I believe there's a macro for the max supported in limits.h or
> such and it's something like 16 or 32 bytes.

I searched and found there is a define for LOGIN_NAME_MAX which is 256.
According to the useradd man page the maximum number of characters of an
username is 32.
There is also in limits.h _POSIX_NAME_MAX which is 14 and 
_POSIX_LOGIN_NAME_MAX
which is 9.
The openbsd pwd.h has a define for _PW_NAME_LEN which is 31 (without 
'\0').

Which alternative is the best?

(I updated the username arrays to be LOGIN_NAME_MAX)

Relative to the pw_buf I don't think musl's code has any variable that 
defines
the maximum supported passwd line.
I used 1024 because that's what I saw in the openbsd code. They have a 
global
char [] of size _PW_BUF_LEN which is 1024.

Reduce the code in which way? Try to refactor some code of
expand_tilde_cur_user and expand_tilde_named_user, of make the code 
"less
verbose"?

> 
>>  int glob(const char *restrict pat, int flags, int (*errfunc)(const 
>> char *path, int err), glob_t *restrict g)
>>  {
>> -    const char *p=pat, *d;
>> +    const char *p, *d;
>> +    char new_pat[PATH_MAX + 1];
>>      struct match head = { .next = NULL }, *tail = &head;
>>      size_t cnt, i;
>>      size_t offs = (flags & GLOB_DOOFFS) ? g->gl_offs : 0;
>>      int error = 0;
>> +
>> +    /*even if expanding fails(e.g. expansion make pat too big)
>> +     * we should try to match the ~ or ~user literally*/
>> +    bool should_expand_tilde = (flags & GLOB_TILDE) && (pat[0] == 
>> '~');
>> +    if(should_expand_tilde && expand_tilde(pat, new_pat, 
>> sizeof(new_pat))) {
>> +        p = new_pat;
>> +    } else {
>> +        p = pat;
>> +    }
> 
> Don't introduce gratuitous variables for expressions used just once.
> The conditions going into that bool can just be part of the if.
If I put it inside the if the line would be too big and those two checks 
go
together.
I think it's alot clearer.

> 
> I haven't checked what happens when the input pat is too long to fit
> in new_pat. This needs to be an error condition, not silent
> truncation, but error conditions can't be handled until further down;
> see commit 769f53598e781ffc89191520f3f8a93cb58db91f for why. Also,
> new_pat should probably be a VLA whose length is 1 unless tilde
> expansion is needed, and PATH_MAX otherwise, so that glob doesn't blow
> an extra page on the stack when tilde expansion is not in use.

I changed it to char new_pat[tilde_flag ? PATH_MAX : 1] where tilde_flag 
=
flags & GLOB_TILDE.
(new_pat no longer uses an extra page.)

Alternatively I can put most of the code of glob in another function and 
only
create a new pat buffer when we should try to expand the tilde.
This would be good to reduce the scope of new_pat.

When the expanded pattern is too big to fit new_pat, the expansion fails 
and
the unexpanded pattern is used.
According to glibc GLOB_TILDE should never make glob return an error. If
anything related to the expansion of tilde fails it's supposed to 
continue glob
with the pattern unexpanded.
At least that how I interpret the manual:
"
If the username is invalid, or the home directory cannot be determined, 
then no
substitution is performed.
"
What I'm trying to say is that GLOB_TILDE never produces an error, so 
the code
that zeros the glob_t structure always runs.

If the code that checks for leading '/' is put just before the call to
match_dir, it'd be ok to put the code that calls expand_tilde after the 
glob_t
zeroing and strlen checking and before the leading '/' code.
The "downside" of this is that the caller of glob has a ton of leading 
'/' it
could now fail the strlen check, but I think that's the fault of who is 
calling
glob with "/////////////////////(...)"
A patch for that is in attachment "move_leading_slash_check_down.patch".


> 
> Rich

I also send the patch with the things I fixed as attachment. It's still 
not okay.
View attachment "glob_tilde.patch" of type "text/x-diff" (4543 bytes)

View attachment "move_leading_slash_check_down.patch" of type "text/x-diff" (829 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.