Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 19 Mar 2016 21:11:16 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 1/3] overhaul environment functions

On Sat, 19 Mar 2016, Rich Felker wrote:
> > putenv:
> > * handle "=value" input via unsetenv too (will return -1/EINVAL);
> 
> What happens now? It adds an env var with zero-length name?

Returns -1 without modifying errno.

> > Not changed:
> > Failure to extend allocation tracking array (previously __env_map, now
> > env_alloced) is ignored rather than causing to report -1/ENOMEM to the
> > caller; the worst-case consequence is leaking this allocation when it
> > is removed or replaced in a subsequent environment access.
> 
> I'd like to fix this too but it can wait. I think it should be trivial
> to do as a small patch on top.

I thought about that, but decided to avoid doing that in the first cut
(__env_change return type changes to 'int', dummy definitions need to be
adjusted accordingly, and oom handling in __env_change is not beautiful
either)

> > --- a/src/env/putenv.c
> > +++ b/src/env/putenv.c
> > @@ -1,58 +1,47 @@
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int __putenv(char *s, size_t l, char *r)
> > +{
> > +	size_t i=0;
> > +	if (__environ)
> > +		for (; __environ[i]; i++)
> > +			if (!strncmp(__environ[i], s, l+1)) {
> > +				char *tmp = __environ[i];
> > +				__environ[i] = s;
> > +				__env_change(tmp, r);
> > +				return 0;
> > +			}
> 
> As far as I can tell, this leaves multiple definitions in place. Am I
> missing something? Maybe it's only unset and not replacement that's
> safe against multiple-definition madness?

Removing multiple definitions is what patch 2/3 does. This patch just
keeps the status quo (only unsetenv looks at duplicate definitions).

> 
> > +	static char **oldenv;
> > +	char **newenv;
> > +	if (__environ == oldenv) {
> > +		newenv = realloc(oldenv, sizeof *newenv * (i+2));
> > +		if (!newenv) goto oom;
> > +	} else {
> > +		newenv = malloc(sizeof *newenv * (i+2));
> > +		if (!newenv) goto oom;
> > +		if (i) memcpy(newenv, __environ, sizeof *newenv * i);
> > +		free(oldenv);
> > +	}
> 
> Rather than using malloc when __environ != oldenv, I think we should
> use realloc on oldenv, so that we don't leak internally-allocated
> environ arrays if the program repeatedly does environ=0 or calls your
> new clearenv.

How can we leak internally allocated environ here? If there's one, oldenv
points to it, and we free it right after memcpy.

I think realloc can be used if the program does not modify environ, but if
it does something funky like 'environ++[2] = 0;' then memcpy'ing after realloc
is not safe (unlike doing malloc-memcpy-free as above).

> Perhaps we should also store the allocated size and grow
> it exponentially rather than assuming it's only the filled size and
> calling realloc every time, but maybe it's actually better to resize
> up/down. Do you have an opinion?

Some fraction of times realloc will keep it in place due to binning, right?
I think taking 'simplicity' in efficiency-simplicity tradeoff here is
justified, on the basis that majority of software does not repeatedly change
environment, and for shells this libc facility is of little help anyway.

> >  int setenv(const char *var, const char *value, int overwrite)
> >  {
> >  	char *s;
> > -	int l1, l2;
> > -
> > -	if (!var || !*var || strchr(var, '=')) {
> > +	size_t l1 = __strchrnul(var, '=') - var, l2;
> > +	if (!l1 || var[l1]) {
> >  		errno = EINVAL;
> >  		return -1;
> >  	}
> 
> As mentioned above we should probably keep the POSIX-old behavior of
> accepting a null pointer and treating it as a diagnosed error.

OK; in an old revision I had 'if (!var || !(l1 = __strchrnul(var, '=')) ...'

> >  	if (!overwrite && getenv(var)) return 0;
> >  
> > -	l1 = strlen(var);
> >  	l2 = strlen(value);
> >  	s = malloc(l1+l2+2);
> > -	if (s) {
> > -		memcpy(s, var, l1);
> > -		s[l1] = '=';
> > -		memcpy(s+l1+1, value, l2);
> > -		s[l1+l2+1] = 0;
> > -		if (!__putenv(s, 1)) return 0;
> > -	}
> > -	free(s);
> > -	return -1;
> > +	if (!s) return -1;
> > +	memcpy(s, var, l1);
> > +	s[l1] = '=';
> > +	memcpy(s+l1+1, value, l2+1);
> > +	return __putenv(s, l1, s);
> >  }
> 
> This leaks when __putenv fails. It's no worse than before but should
> probably be fixed.

Only when __env_change, specifically, "fails"; not when __putenv OOMs.

> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <errno.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int unsetenv(const char *name)
> > +{
> > +	size_t l = __strchrnul(name, '=') - name;
> > +	if (!l || name[l]) {
> > +		errno = EINVAL;
> > +		return -1;
> > +	}
> > +	if (!__environ) return 0;
> > +	for (char **e = __environ; *e; e++)
> > +		while (*e && !strncmp(name, *e, l) && l[*e] == '=') {
> > +			char **ee = e, *tmp = *e;
> > +			do *ee = *(ee+1);
> > +			while (*++ee);
> > +			__env_change(tmp, 0);
> > +		}
> > +	return 0;
> > +}
> 
> I think this looks ok.

I'd like to add a minor tweak here: instead of retesting '*e' in 'while' loop
header, do 'if (!*e) return 0;' after __env_change; this helps the compiler to
generate slightly cleaner code.

Thanks for the review!
Alexander

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.