musl - Re: Stdio resource usage

Message-ID: <20190220104901.GU21289@port70.net>
Date: Wed, 20 Feb 2019 11:49:01 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Subject: Re: Stdio resource usage

* Rich Felker <dalias@...c.org> [2019-02-19 21:43:13 -0500]:
> On Tue, Feb 19, 2019 at 03:34:52PM -0800, Nick Bray wrote:
> > I don't expect any of these modifications to make it
> > upstream.  Talking out loud as an FYI / user feedback.  Also curious to see
> > if there's any wisdom out there.
> > 
> > Stack usage of stdio was an issue.  On arm64, printf takes 8k of stack,
> > which is rough when you only have 4-12k of stack.  This is because fmt_fp
> > allocates stack space that is O(log(MAX_LONG_DOUBLE)).  It also gets
> > inlined into printf so you always take the hit.  (noinline fmt_fp is a
> 
> This is a known compiler flaw, hoisting large stack allocations, and
> one I've complained a lot about but with little luck. It might be
> possible to work around it by making the array a VLA, whose size is 1
> or the proper size depending on some condition the compiler can't
> easily see, but that's rather awful. It might be worth doing though,
> given the lack of progress fixing the bug.

i think it's just an llvm issue, or does this happen with gcc too now?
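
A minimal sketch of the VLA workaround described in the quoted text
above. The names need_big_buffer, BIG and format_sketch are
hypothetical, and the buffer size is illustrative rather than musl's
actual figure. The intent is only to show an array whose size is 1 or
the full size depending on a condition the compiler cannot easily
see, so that the large reservation is sized at run time instead of
being a fixed part of the frame:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a condition the compiler cannot fold at
 * compile time; volatile forces a real load on every call. */
static volatile int need_big_buffer = 1;

enum { BIG = 7000 };   /* illustrative size, not taken from musl */

/* Returns the number of bytes of scratch space actually reserved. */
size_t format_sketch(int is_float)
{
    /* 1 element on the cheap path, BIG on the floating-point path. */
    size_t n = (is_float && need_big_buffer) ? BIG : 1;
    assert(n >= 1);                /* a zero-length VLA is undefined */

    unsigned char scratch[n];      /* VLA: sized when this line runs */
    memset(scratch, 0, n);
    return sizeof scratch;         /* evaluated at run time for a VLA */
}
```

With this shape, format_sketch(0) reserves one byte and
format_sketch(1) reserves BIG bytes. Whether a given compiler then
keeps the allocation out of the caller's frame after inlining is not
something this sketch can guarantee.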

> > Faustian bargain that makes stack usage worse in the worst case... hmmm.)
> > On arm64, long double is defined as 128 bits, which not only increases
> > stack size because of the larger mantissa, but also pulls in software
note: the mantissa is not the real issue, the exponent range is.
(e.g. to printf 0x1p-16494L you need to compute 5^16494/10^16494,
and the numerator 5^16494 has floor(log10(5)*16494) + 1 = 11529 digits)
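
The digit count above can be checked with a short sketch. It rests on
the identity 2^-e = 5^e / 10^e, so printing 0x1p-16494L exactly needs
every digit of 5^16494. The function names and the base-1e9 limb
layout are choices made for this illustration and are not how fmt_fp
is written:

```c
#include <assert.h>
#include <stdint.h>

/* log10(5), hard-coded so the sketch does not need libm. */
static const double LOG10_5 = 0.6989700043360189;

/* Closed form: digits in 5^e = floor(e * log10(5)) + 1, for e >= 0.
 * The cast truncates, which equals floor for non-negative values. */
long digits_formula(long e)
{
    assert(e >= 0);
    return (long)((double)e * LOG10_5) + 1;
}

/* Brute-force check: build 5^e in base-1e9 limbs and count digits. */
long digits_exact(long e)
{
    static uint32_t limb[2048];    /* 2048 * 9 = 18432 decimal digits */
    long used = 1;

    assert(e >= 0 && e <= 26000);  /* keeps 5^e inside limb[] */
    for (long i = 0; i < 2048; i++) limb[i] = 0;
    limb[0] = 1;

    for (long k = 0; k < e; k++) { /* multiply by 5, e times */
        uint64_t carry = 0;
        for (long i = 0; i < used; i++) {
            uint64_t v = (uint64_t)limb[i] * 5 + carry;
            limb[i] = (uint32_t)(v % 1000000000u);
            carry = v / 1000000000u;
        }
        if (carry) limb[used++] = (uint32_t)carry;
    }

    /* Every limb below the top one contributes exactly 9 digits. */
    long digits = (used - 1) * 9;
    for (uint32_t top = limb[used - 1]; top; top /= 10) digits++;
    return digits;
}
```

Both digits_formula(16494) and digits_exact(16494) give 11529, which
matches the figure above.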

> > emulation for fp128.  In terms of spec compliance, Musl is doing the right
> > thing.  But as a practical matter, none of the programs I care about will
> > ever use long double.  So my rough first pass was to reduce the max float
> > size from long double to double.  In a later pass, I'll also add a knob to
> > remove floating point formatting entirely.
> 
> It's kinda unfortunate that aarch64 defined long double as IEEE quad
> without hardware implementation of it, but it's probably the right
> future-facing choice. I was under the impression that aarch64 was
> intended mostly for "large" systems, and that you'd use 32-bit arm
> (with much smaller code due to thumb) for tiny space-constrained
> systems, though.

aarch64 has 128-bit fp regs, so in principle a future arch extension
may add 128-bit instructions without breaking abi. (which may happen
if aarch64 gets adoption in supercomputers, e.g. powerpc64 did that)

> > %m calls strerror which pulls in a string table, so removing support for %m
> > lets static linking and DCE work its magic.
> 
> Yes. Note that %m is needed for a conforming syslog(), which was the
> motivation for supporting it in printf.
> 
> > I also eliminated %n for
> > security hardening reasons.
> 
> This actually introduces security bugs by breaking the contract. At
> some point I believe there may even have been some parts of musl you
> would have broken in dangerous ways, though I'm not sure if that's the
> case now. If you have a situation where the format string is
> non-constant, that, not %n, is the problem.

i think %n is not a huge loss, but it does sound like
repeating the bionic mistakes.  (providing posix symbols
with semantics that are not quite posix-conforming, for
speculative reasons, which turned out to be a lot more
expensive to fix up than just following the standard)
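
To make the "contract" point from the quoted text concrete, here is a
small sketch of a caller that relies on %n with a constant format
string. The function name offset_after_key and its parameters are
hypothetical, made up for this illustration. If a libc silently
dropped %n, a caller like this would go on to use a value that was
never written:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "key=value" into buf and returns the offset at which the
 * value starts, as recorded by %n. The format string is a constant;
 * only the arguments vary. */
int offset_after_key(char *buf, size_t len, const char *key, int value)
{
    int pos = -1;   /* stays -1 if the implementation ignores %n */

    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s=%n%d", key, &pos, value);
    return pos;
}
```

On a conforming printf, calling this with key "stack" and value 8192
returns 6, the length of "stack=". A caller that then indexes buf
with the result depends on %n having stored that value.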