Date: Thu, 21 Feb 2019 12:02:59 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Stdio resource usage

On Thu, Feb 21, 2019 at 05:09:37PM +0100, Markus Wichmann wrote:
> On Wed, Feb 20, 2019 at 02:24:23PM -0500, Rich Felker wrote:
> > For what it's worth, gcc has a -fconserve-stack that in principle
> > should avoid this problem, but I could never get it to do anything. If
> > it works now we should probably detect and add it to default CFLAGS.
> > 
> > Rich
> 
> Well, that also doesn't help since gcc is the compiler that *doesn't*
> exhibit the problem. clang does. And clang doesn't have an option to
> conserve stack (that I've seen).
> 
> I am wondering what other possibilities exist to prevent the issue. If
> we won't change the algorithm, that only leaves exploring other
> possibilities for the memory allocation.

There is no algorithm that takes less space, at least not without some
kind of cubic-in-exponent-value or worse time. The amount of space we
use is optimal up to some small factor. It might be possible to shrink
this factor with a sharper bound on the number of digits needed, with
no change in the algorithm, but I think the reduction would be at most
something like 20%.

> So, what are our choices?
> 
> - Heap allocation: But that can fail. Now, printf() is actually allowed
>   to fail, but no-one expects it to. I would expect such behavior to be
>   problematic at best.

printf can fail for valid reasons, but snprintf cannot. Technically
POSIX allows any interface that can fail to be able to fail for
additional implementation-defined reasons, but this is unacceptably
bad QoI and completely contrary to the principles of musl, that
nothing fails unless there's an underlying reason it has to be able to
fail.

> - Static allocation: Without synchronization this won't be thread-safe,
>   with synchronization it won't be re-entrant. Now, as far as I could
>   see, the printf() family is actually not required to be re-entrant
>   (e.g. signal-safety(7) fails to list any of them), but I have seen
>   sprintf() in signal handlers in the wild (well, exception handlers,
>   really).

If you can afford to increase .data size by ~8k, why can't you just
increase stack size by ~8k instead? Of course the latter would scale
in number of threads, but presumably if you're this
resource-constrained you're not using threads, or can avoid using
printf from most of them.

> - Thread-local static allocation: Which is always a hassle in libc, and
>   does not take care of re-entrancy. It would only solve the
>   thread-safety issue.

This is strictly worse than just using the stack. Implementation-wise,
the TLS is equivalent to a stack object on the top-level call frame of
the thread. There's no reason to put it there rather than in the
bottom-level call frame.

> - As-needed stack allocation (e.g. alloca()): This fails to prevent the
>   worst case allocation, though it would make the average allocation
>   more bearable. But I don't know if especially clever compilers like
>   clang wouldn't optimize this stuff away, and we'd be back to square
>   one.

This is what I already suggested (via VLA, not alloca, as the latter
is not C and worse in most ways) as a workaround for the clang
hoisting of allocations. But in principle the compiler could still see
that if the declaration is reachable the size is constant (or even
close enough to constant that it could just optimize to a fixed-size
array of the upper bound), and optimize out its being variable, then
hoist it. So this really is a hack that's "tricking the optimizer",
not any fundamental fix.
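
To illustrate the shape of the VLA workaround, here is a minimal sketch.
It is not musl's printf code; pow2_decimal is a hypothetical helper made
up for this example. It sizes a base-10^9 scratch array at run time from
the exponent, so small inputs use little stack. As noted above, nothing
stops an optimizer from replacing the VLA with a fixed-size array of the
upper bound, so this only shows the idea.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not musl code): writes the decimal digits of 2^e
 * into out and returns the digit count, or -1 if out is too small. */
static int pow2_decimal(unsigned e, char *out, size_t outlen)
{
    /* 2^29 < 10^9, so e/29 + 1 base-10^9 limbs always suffice. */
    size_t n = e / 29 + 1;
    uint32_t limb[n];            /* VLA: stack use scales with e */
    size_t used = 1, i;
    int len;

    limb[0] = 1;
    while (e--) {
        uint32_t carry = 0;
        for (i = 0; i < used; i++) {
            /* limb[i] < 10^9, so this cannot overflow 32 bits */
            uint32_t v = limb[i] * 2 + carry;
            carry = v >= 1000000000;
            limb[i] = carry ? v - 1000000000 : v;
        }
        if (carry) limb[used++] = 1;
    }

    /* Most significant limb unpadded, the rest zero-padded to 9 digits. */
    len = snprintf(out, outlen, "%" PRIu32, limb[used - 1]);
    for (i = used - 1; i-- > 0; ) {
        if (len < 0 || (size_t)len >= outlen) return -1;
        len += snprintf(out + len, outlen - len, "%09" PRIu32, limb[i]);
    }
    if (len < 0 || (size_t)len >= outlen) return -1;
    return len;
}
```

With a 64-byte buffer, pow2_decimal(10, buf, sizeof buf) stores "1024"
and needs a single limb, while larger exponents grow the array only as
far as their own bound requires.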

> Any ideas left?

Getting clang to fix their hoisting of (large) stack objects beyond
their scope/lifetime?

Rich
