Date: Thu, 27 Jun 2013 21:48:15 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Use of size_t and ssize_t in mseek

On Fri, Jun 28, 2013 at 11:34:23AM +1000, Matthew Fernandez wrote:
> On 28/06/13 11:22, Rich Felker wrote:
> >On Fri, Jun 28, 2013 at 10:49:41AM +1000, Matthew Fernandez wrote:
> >>>As a user of musl, what's your take on this?
> >>
> >>A check in fmemopen (and other affected functions) would be my preferred
> >>solution, as an unwitting user like myself who doesn't check all the
> >>assumptions would still be caught out by just documenting it as
> >>undefined. I would be happy with just an assert-fail here if that's easiest.
> >
> >The easiest might just be making fmemopen so it doesn't care if the
> >size is insanely large. As far as I can tell, the only place it's an
> >issue is in mseek, and we could use off_t instead of ssize_t. On
> >32-bit systems, off_t is 64-bit, so all sizes fit. On 64-bit systems,
> >there's no way (physically!) to have an object as large as 1UL<<63.
> 
> I suppose this is an option, but this just isolates the problem to
> 64-bit systems.

Well, on a 32-bit system (though not the kind that stock musl, which
is intended to run on Linux, implements) it's possible to have an
object larger than 2^31 bytes. It's never possible to have an object
larger than 2^63 bytes on any system.

> On x86_64 I would still be able to naïvely call fmemopen
> with SIZE_MAX and end up being unable to fseek. Not being able to

Ah, I missed the fact that you were just passing SIZE_MAX because you
don't know the size; I thought you were passing some large value equal
to the actual size. In general, passing SIZE_MAX for unknown sizes is
dangerous, even if objects larger than SSIZE_MAX are supported,
because the implementation of the function might work like:

    char *end = base + size;
    while (pos < end) ...

In this case, merely computing base+size invokes UB if size is larger
than the size of the object; in practice, what happens is that the
pointer wraps, so end is less than base, and then the loop never runs.
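
To make the wrap-around concrete, here is a minimal sketch that models
the pointer arithmetic with uintptr_t, where wrapping is well-defined
(doing the same with real pointers would be UB). The helper names
model_end and loop_iterations are hypothetical, and the sketch assumes
a flat address space where uintptr_t and size_t have the same width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption made explicit: addresses and sizes have the same width. */
static_assert(sizeof(uintptr_t) == sizeof(size_t),
              "sketch assumes uintptr_t and size_t have the same width");

/* Models `end = base + size` with unsigned integers, which wrap in a
 * well-defined way; the same computation on real pointers is UB. */
static uintptr_t model_end(uintptr_t base, size_t size)
{
    return base + (uintptr_t)size;
}

/* Counts how many times `while (pos < end)` would run, capped at limit
 * so the count stays cheap even for very large sizes. */
static size_t loop_iterations(uintptr_t base, size_t size, size_t limit)
{
    uintptr_t end = model_end(base, size);
    size_t n = 0;
    for (uintptr_t pos = base; pos < end && n < limit; pos++)
        n++;
    return n;
}
```

With a correct size the loop runs once per byte; with SIZE_MAX and any
nonzero base, the modeled end lands below base and the loop body is
skipped entirely.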

While I'm not opposed to making changes to reject or to support
objects larger than SSIZE_MAX, I am pretty much opposed to making any
attempt to accept a length of SIZE_MAX as "unknown/unlimited size".
This kind of usage, as described above, is flawed and error-prone. If
you want to replace it with something at least halfway portable,
instead of passing SIZE_MAX, pass SIZE_MAX-(size_t)base or some
variant of this concept. This will at least avoid the overflow; it's
the trick musl uses internally for implementing sprintf in terms of
snprintf. However, I would really recommend trying to find some better
approach to obtain and pass the correct size to fmemopen. There is no
contract that fmemopen is not allowed to read arbitrary parts of the
buffer passed to it, so passing an incorrect length and expecting it
to work is depending on the current implementation.
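
As a rough illustration of the SIZE_MAX-(size_t)base idea, the sketch
below computes the largest length that can be added to a base address
without wrapping. The helper names are hypothetical, this is not
musl's actual code, and it assumes uintptr_t and size_t have the same
width. It only avoids the overflow; the length is still not the real
size of the object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uintptr_t) == sizeof(size_t),
              "sketch assumes uintptr_t and size_t have the same width");

/* Largest size such that addr + size does not wrap past the top of
 * the address space. */
static size_t max_len_from_addr(uintptr_t addr)
{
    return SIZE_MAX - (size_t)addr;
}

/* Convenience wrapper taking a pointer instead of an integer address. */
static size_t max_len_from(const void *base)
{
    return max_len_from_addr((uintptr_t)base);
}
```

A caller that truly cannot know the size could pass max_len_from(buf)
instead of SIZE_MAX, with the caveats above still applying.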

By the way, note that the issue here is not that the size argument is
greater than SSIZE_MAX. Even if you passed a shorter "huge object"
length, like SSIZE_MAX or SSIZE_MAX/2, it could still cause overflows
if the base of the object happened to begin at a high address. The
issue is merely that the size is not the correct size of the object.
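
The point that even a "shorter" bogus length can overflow can be
checked with a small predicate. This is a sketch under the same
assumptions as before (flat address space, uintptr_t as wide as
size_t); end_wraps is a hypothetical name, and SIZE_MAX/2 stands in
for SSIZE_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uintptr_t) == sizeof(size_t),
              "sketch assumes uintptr_t and size_t have the same width");

/* Returns nonzero if base + size would wrap past the top of the
 * address space. Tested by subtraction so the check cannot itself
 * overflow. */
static int end_wraps(uintptr_t base, size_t size)
{
    return size > UINTPTR_MAX - base;
}
```

For example, end_wraps(0x1000, SIZE_MAX/2) is false, but the same
length at a base in the upper half of the address space wraps.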

> >Alternatively, I could adjust the arithmetic to just avoid working
> >with signed values, and perhaps make it more obvious what it's doing
> >in the process.
> 
> I would also be happy with this solution. The code in mseek could
> definitely be clearer. Not that I don't enjoy switch statements written
> as offsets into stack structs and reverse jumps ;)

Yes, I think this is probably the best solution, even if it makes the
function a few bytes larger. The code should be more clear.
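
For illustration only, one way such unsigned arithmetic could look is
sketched below. This is not musl's mseek: the struct, its field names,
and the function signature are all hypothetical, long long stands in
for off_t, and the cookie is assumed to maintain pos <= size and
len <= size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical memory-stream state; not musl's actual layout. */
struct mem_cookie { size_t pos, len, size; };

/* Stores the new position in *out and returns 0, or returns -1 if the
 * target would fall outside [0, size]. All range checks are done on
 * unsigned values, so a size above SSIZE_MAX needs no special case. */
static int mem_seek(const struct mem_cookie *c, long long off,
                    int whence, size_t *out)
{
    size_t base;
    switch (whence) {
    case SEEK_SET: base = 0;      break;
    case SEEK_CUR: base = c->pos; break;
    case SEEK_END: base = c->len; break;
    default: return -1;
    }
    assert(base <= c->size);  /* invariant assumed by this sketch */
    if (off < 0) {
        /* Magnitude of the negative offset, computed in unsigned
         * arithmetic so that even LLONG_MIN is handled. */
        unsigned long long back = 0ULL - (unsigned long long)off;
        if (back > base) return -1;
        *out = base - (size_t)back;
    } else {
        if ((unsigned long long)off > c->size - base) return -1;
        *out = base + (size_t)off;
    }
    return 0;
}
```

Because the forward check compares against size - base rather than
forming base + off first, it cannot wrap, whatever size was passed in.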

Rich
