Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 23 Feb 2015 17:01:36 +0300
From: Alexander Cherepanov <>
Subject: Re: Re: CVE Request: cabextract -- directory traversal

On 2015-02-23 10:38, wrote:
>> it removes leading slashes from filenames but does it before possibly
>> decoding UTF-8 and doesn't check for invalid UTF-8
>> The issue was reported to Stuart Caie today and fixed in less than 4h:
> Your report seems to be about the need for the "/* remove leading
> slashes */" code to occur after (not before) the "/* get next UTF-8
> character */" code.

That's right (more or less, see below).

> Is this the only vulnerability being reported, or


> is the stated behavior of "This doesn't reject bad UTF-8 with overlong
> encodings, but does re-encode it as valid UTF-8" an independent
> vulnerability?

Not that I know of.

More precisely, I see two ways to fix the problem with overlong encoding 
of leading slashes:
- by checking for slashes after decoding;
- by checking for ordinary slashes before decoding and prohibiting 
overlong encodings.

My report didn't imply any particular fix. The first way seems to be 
easier and it was chosen by Stuart. I don't readily see any other 
problem with overlong encodings.

>> /* special case if there's only one file - just take the first slash */
>> if (c == '\\') return 0; /* backslash = MS-DOS */

The code quoted above is from the unix_path_seperators function which 
determines the type of slashes used in the archive.

>> isunix = unix_path_seperators(cab->files);
>> sep   = (isunix) ? '/'  : '\\'; /* the path-seperator */

This line is from the create_output_name function. The next line:

   slash = (isunix) ? '\\' : '/';  /* the other slash */

>>   while (*fname == sep) fname++;

This is from non-utf8 code path. It skips leading "sep"s, then all 
"sep"s are converted to '/' and all "slash"es are converted to '\\':

     do {
       c = *fname++;
       if      (c == sep)   c = '/';
       else if (c == slash) c = '\\';
       else if (lower)      c = (unsigned char) tolower((int) c);
     } while ((*p++ = c));

The utf8 code path is similar.

> What happens if the .cab archive contains only one file, and \/tmp/abs
> is the filename?

In this case, the unix_path_seperators function will determine that 
MS-DOS path separators are used. Hence sep == '\\' && slash == '/' in 
the create_output_name function. Then the leading '\' is skipped and all 
'/' are converted to '\\'.


$ touch xxxxxxxxx
$ lcab xxxxxxxxx > /dev/null
$ sed -i 's|xxxxxxxxx|\\/tmp/abs|g'
$ rm xxxxxxxxx

$ ls *abs
ls: cannot access *abs: No such file or directory

$ ./cabextract
Extracting cabinet:
   extracting \tmp\abs

All done, no errors.

$ ls *abs

It's important that all checks of input filename are against "sep" and 
"slash" and all checks of output filename (e.g. for "../") are against 
explicit '/' and '\\' but the code seems to be accurate in this regard.

Another thing which gives me uneasy feeling is the use of tolower:
- if there is a locale where you can get '/' with tolower from another 
character this will lead to a dir traversal;
- passing an arbitrary unicode value to tolower (in utf8 code path) is 
at least an undefined behavior.
The former doesn't seem probable. The latter, I'll send it now to 
Stuart. If someone can see security implications of this it would be 
nice to hear it.

Alexander Cherepanov

Powered by blists - more mailing lists

Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.