musl - Re: [PATCH 1/3] overhaul environment functions

Message-ID: <alpine.LNX.2.20.1603192015120.10468@monopod.intra.ispras.ru>
Date: Sat, 19 Mar 2016 21:11:16 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 1/3] overhaul environment functions

On Sat, 19 Mar 2016, Rich Felker wrote:
> > putenv:
> > * handle "=value" input via unsetenv too (will return -1/EINVAL);
> 
> What happens now? It adds an env var with zero-length name?

Returns -1 without modifying errno.

> > Not changed:
> > Failure to extend allocation tracking array (previously __env_map, now
> > env_alloced) is ignored rather than causing to report -1/ENOMEM to the
> > caller; the worst-case consequence is leaking this allocation when it
> > is removed or replaced in a subsequent environment access.
> 
> I'd like to fix this too but it can wait. I think it should be trivial
> to do as a small patch on top.

I thought about that, but decided against doing it in the first cut: the
__env_change return type would change to 'int', the dummy definitions would
need to be adjusted accordingly, and OOM handling in __env_change is not
pretty either.

> > --- a/src/env/putenv.c
> > +++ b/src/env/putenv.c
> > @@ -1,58 +1,47 @@
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int __putenv(char *s, size_t l, char *r)
> > +{
> > +	size_t i=0;
> > +	if (__environ)
> > +		for (; __environ[i]; i++)
> > +			if (!strncmp(__environ[i], s, l+1)) {
> > +				char *tmp = __environ[i];
> > +				__environ[i] = s;
> > +				__env_change(tmp, r);
> > +				return 0;
> > +			}
> 
> As far as I can tell, this leaves multiple definitions in place. Am I
> missing something? Maybe it's only unset and not replacement that's
> safe against multiple-definition madness?

Removing multiple definitions is what patch 2/3 does. This patch just
keeps the status quo (only unsetenv looks at duplicate definitions).
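
For illustration, a minimal sketch of that status quo: replacement stops at the
first entry whose "NAME=" prefix matches, so a later duplicate stays in place.
The helper name replace_first and the explicit env parameter are hypothetical;
the quoted patch works on __environ directly and also calls __env_change.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replace the first entry of the NULL-terminated vector env that starts
   with the same "NAME=" prefix as s, where l is the length of NAME.
   Returns 1 if an entry was replaced, 0 if no entry matched.
   Hypothetical helper; later duplicates are deliberately left alone. */
static int replace_first(char **env, char *s, size_t l)
{
	assert(env && s);
	for (size_t i = 0; env[i]; i++)
		if (!strncmp(env[i], s, l+1)) {
			env[i] = s;
			return 1;
		}
	return 0;
}
```

Comparing l+1 bytes includes the '=' in the match, so "AB=..." is not taken
for a definition of "A".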

> 
> > +	static char **oldenv;
> > +	char **newenv;
> > +	if (__environ == oldenv) {
> > +		newenv = realloc(oldenv, sizeof *newenv * (i+2));
> > +		if (!newenv) goto oom;
> > +	} else {
> > +		newenv = malloc(sizeof *newenv * (i+2));
> > +		if (!newenv) goto oom;
> > +		if (i) memcpy(newenv, __environ, sizeof *newenv * i);
> > +		free(oldenv);
> > +	}
> 
> Rather than using malloc when __environ != oldenv, I think we should
> use realloc on oldenv, so that we don't leak internally-allocated
> environ arrays if the program repeatedly does environ=0 or calls your
> new clearenv.

How can we leak internally allocated environ here? If there's one, oldenv
points to it, and we free it right after memcpy.

I think realloc can be used if the program does not modify environ, but if
it does something funky like 'environ++[2] = 0;' then memcpy'ing after realloc
is not safe (unlike doing malloc-memcpy-free as above).
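
A minimal sketch of the malloc-memcpy-free ordering, assuming hypothetical
stand-ins env_vec (for __environ) and owned_vec (for the static oldenv); it is
not the patch itself, only the growth step in isolation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char **env_vec;   /* stand-in for __environ (assumption) */
static char **owned_vec; /* stand-in for oldenv: last array we allocated */

/* Append s to env_vec. Returns 0 on success, -1 on allocation failure. */
static int append_entry(char *s)
{
	assert(s);
	size_t n = 0;
	if (env_vec) while (env_vec[n]) n++;

	char **newv;
	if (env_vec == owned_vec) {
		/* Program still uses our array as-is: realloc keeps the contents. */
		newv = realloc(owned_vec, sizeof *newv * (n+2));
		if (!newv) return -1;
	} else {
		/* env_vec may point anywhere, even into owned_vec, so copy
		   out of it first and free our old array only afterwards. */
		newv = malloc(sizeof *newv * (n+2));
		if (!newv) return -1;
		if (n) memcpy(newv, env_vec, sizeof *newv * n);
		free(owned_vec);
	}
	newv[n] = s;
	newv[n+1] = 0;
	env_vec = owned_vec = newv;
	return 0;
}
```

If the program has advanced env_vec into the middle of owned_vec, the source of
the memcpy is still valid memory here, because the free happens last.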

> Perhaps we should also store the allocated size and grow
> it exponentially rather than assuming it's only the filled size and
> calling realloc every time, but maybe it's actually better to resize
> up/down. Do you have an opinion?

Some fraction of the time realloc will keep it in place due to binning, right?
I think choosing simplicity in the efficiency-simplicity tradeoff is justified
here, on the basis that the majority of software does not repeatedly change the
environment, and for shells this libc facility is of little help anyway.

> >  int setenv(const char *var, const char *value, int overwrite)
> >  {
> >  	char *s;
> > -	int l1, l2;
> > -
> > -	if (!var || !*var || strchr(var, '=')) {
> > +	size_t l1 = __strchrnul(var, '=') - var, l2;
> > +	if (!l1 || var[l1]) {
> >  		errno = EINVAL;
> >  		return -1;
> >  	}
> 
> As mentioned above we should probably keep the POSIX-old behavior of
> accepting a null pointer and treating it as a diagnosed error.

OK; in an old revision I had 'if (!var || !(l1 = __strchrnul(var, '=')) ...'
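
A sketch of a name check that also diagnoses a null pointer, assuming a portable
stand-in find_or_end for __strchrnul and a hypothetical helper checked_name_len;
neither name comes from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for __strchrnul: pointer to the first c in s, or to the
   terminating NUL if c does not occur. */
static const char *find_or_end(const char *s, int c)
{
	assert(s);
	const char *p = strchr(s, c);
	return p ? p : s + strlen(s);
}

/* Hypothetical helper: returns the length of var if it is a valid
   variable name (non-null, non-empty, no '='); otherwise sets
   errno to EINVAL and returns 0. */
static size_t checked_name_len(const char *var)
{
	size_t l = var ? (size_t)(find_or_end(var, '=') - var) : 0;
	if (!l || var[l]) {
		errno = EINVAL;
		return 0;
	}
	return l;
}
```

For a null pointer l is 0, so the '!l' test short-circuits and var[l] is never
evaluated.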

> >  	if (!overwrite && getenv(var)) return 0;
> >  
> > -	l1 = strlen(var);
> >  	l2 = strlen(value);
> >  	s = malloc(l1+l2+2);
> > -	if (s) {
> > -		memcpy(s, var, l1);
> > -		s[l1] = '=';
> > -		memcpy(s+l1+1, value, l2);
> > -		s[l1+l2+1] = 0;
> > -		if (!__putenv(s, 1)) return 0;
> > -	}
> > -	free(s);
> > -	return -1;
> > +	if (!s) return -1;
> > +	memcpy(s, var, l1);
> > +	s[l1] = '=';
> > +	memcpy(s+l1+1, value, l2+1);
> > +	return __putenv(s, l1, s);
> >  }
> 
> This leaks when __putenv fails. It's no worse than before but should
> probably be fixed.

Only when __env_change, specifically, "fails"; not when __putenv OOMs.

> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <errno.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int unsetenv(const char *name)
> > +{
> > +	size_t l = __strchrnul(name, '=') - name;
> > +	if (!l || name[l]) {
> > +		errno = EINVAL;
> > +		return -1;
> > +	}
> > +	if (!__environ) return 0;
> > +	for (char **e = __environ; *e; e++)
> > +		while (*e && !strncmp(name, *e, l) && l[*e] == '=') {
> > +			char **ee = e, *tmp = *e;
> > +			do *ee = *(ee+1);
> > +			while (*++ee);
> > +			__env_change(tmp, 0);
> > +		}
> > +	return 0;
> > +}
> 
> I think this looks ok.

I'd like to add a minor tweak here: instead of retesting '*e' in the 'while' loop
header, do 'if (!*e) return 0;' after __env_change; this helps the compiler
generate slightly cleaner code.
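
A sketch of the loop with that tweak applied, written against an explicit
vector so it can be tried in isolation. The function name remove_all and the
on_remove callback (standing in for __env_change(tmp, 0)) are hypothetical, and
name validation is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove every "name=..." entry from the NULL-terminated vector env,
   in place. on_remove, if non-null, is called with each removed entry. */
static void remove_all(char **env, const char *name, void (*on_remove)(char *))
{
	assert(env && name);
	size_t l = strlen(name);
	for (char **e = env; *e; e++)
		while (!strncmp(name, *e, l) && l[*e] == '=') {
			char **ee = e, *tmp = *e;
			/* Shift the tail, terminator included, down by one slot. */
			do *ee = *(ee+1);
			while (*++ee);
			if (on_remove) on_remove(tmp);
			/* The removed entry was the last one: nothing left to scan. */
			if (!*e) return;
		}
}
```

The early return takes over the null check that the 'while' header would
otherwise repeat on every iteration.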

Thanks for the review!
Alexander