Date: Wed, 12 Sep 2018 10:02:39 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: string-backed FILEs mess

While working on the headers/declarations/hidden stuff, I've run
across some issues that should be documented as needing a fix. One big
one is this:

strto* is still setting up a fake FILE with buffer end pointer as
(void*)-1 as a hack to read directly from a string of unknown length.
This is of course nonsensical UB and might lead to actual problems
(signed overflow) in places where rend-rpos is computed. strtol.c
handles that aspect already by "if ((size_t)s > (size_t)-1/2)" but
it's still horribly wrong. There's a __string_read function sscanf
uses that avoids the whole problem by incrementally measuring string
length and copying it to a real stdio buffer, which also makes ungetc
safe, and this is the obvious *right* thing to do, but it might make
strto* unacceptably slower. I haven't done any measurement.
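
To make the incremental approach concrete, here is a minimal sketch of
a reader that measures and copies the string one bounded chunk at a
time, so no end pointer ever has to be invented. It is not musl's
__string_read; struct strsrc, the strsrc_* functions and the 32-byte
buffer size are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader state; not musl's actual FILE layout. */
struct strsrc {
	const char *src;        /* next unread byte of the caller's string */
	unsigned char buf[32];  /* real, bounded buffer (size is arbitrary) */
	size_t pos, len;        /* read index and fill level within buf */
};

static void strsrc_init(struct strsrc *s, const char *str)
{
	s->src = str;
	s->pos = s->len = 0;
}

/* Copy the next chunk, stopping at the terminator so that no byte
 * past the end of the string is ever read. Returns bytes copied. */
static size_t strsrc_fill(struct strsrc *s)
{
	size_t n = 0;
	while (n < sizeof s->buf && s->src[n]) {
		s->buf[n] = (unsigned char)s->src[n];
		n++;
	}
	s->src += n;
	s->pos = 0;
	s->len = n;
	return n;
}

/* Return the next byte, or -1 at end of string. */
static int strsrc_getc(struct strsrc *s)
{
	if (s->pos == s->len && !strsrc_fill(s)) return -1;
	assert(s->pos < s->len);
	return s->buf[s->pos++];
}

/* Step back over the byte just returned by strsrc_getc.
 * Returns 0 on success, -1 if there is nothing to push back. */
static int strsrc_ungetc(struct strsrc *s)
{
	if (!s->pos) return -1;
	s->pos--;
	return 0;
}
```

Because the pushed-back byte always lives in a real buffer, one level
of pushback after a successful read is safe. The per-chunk copy is the
extra cost that would need measuring.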

The other "mostly right" (modulo ungetc not being available then)
approach would be getting rid of the whole current buffer design with
start/end pointers and using indices instead. This would solve a lot
of stupid gratuitous UB in stdio, like (T*)0-(T*)0 being undefined.
It's not clear to me whether it would be more or less efficient. It
would "break" glibc ABI-compat for stdio -- the original reason I used
the pointer-based design -- but that could be fixed by putting
"must-be-null" fields in place of the buffer pointers so that any
glibc code using getc_unlocked/putc_unlocked macros would hit the
"no buffer space" code path and call an actual function. In many ways
that's desirable anyway.
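
A rough sketch of what an index-based buffer could look like, assuming
nothing about how it would actually be laid out in musl; struct ibuf
and the ibuf_* functions are hypothetical names. The point is only
that the "bytes available" computation becomes unsigned arithmetic on
indices, which is well defined even when there is no buffer at all.

```c
#include <stddef.h>

/* Hypothetical index-based read buffer; not musl's FILE layout. */
struct ibuf {
	unsigned char *base;  /* may be null when there is no buffer */
	size_t rpos, rend;    /* indices into base, with rpos <= rend */
};

/* Bytes available to read. No pointer subtraction is involved, so
 * the unbuffered case (base null, both indices 0) is well defined. */
static size_t ibuf_avail(const struct ibuf *b)
{
	return b->rend - b->rpos;
}

/* Fast path. Returns -1 when the buffer is empty or absent, meaning
 * the caller has to fall back to a real (slow path) function. */
static int ibuf_getc_fast(struct ibuf *b)
{
	if (b->rpos == b->rend) return -1;
	return b->base[b->rpos++];
}
```

An all-zero struct ibuf always takes the fallback, which mirrors the
"must-be-null" idea: a fast-path caller that sees no buffer space ends
up calling an actual function.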

Probably the right next step here is measuring whether just using
__string_read would make anything measurably slower.

Rich
