Date: Wed, 20 Feb 2019 16:47:40 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: Stdio resource usage

On Wed, Feb 20, 2019 at 11:49:01AM +0100, Szabolcs Nagy wrote:
> * Rich Felker <dalias@...c.org> [2019-02-19 21:43:13 -0500]:
> > This is a known compiler flaw, hoisting large stack allocations, and
> > one I've complained a lot about but with little luck. It might be
> > possible to work around it by making the array a VLA, whose size is 1
> > or the proper size depending on some condition the compiler can't
> > easily see, but that's rather awful. It might be worth doing though,
> > given the lack of progress fixing the bug.
> 
> i think it's just an llvm issue, or does this happen with gcc too now?
> 

Take this as a data point: on x86_64, with gcc 8.2.0 and -Os, fmt_fp() is
not inlined into printf_core(), and it alone takes a whopping 7496 bytes
of stack (printf_core() only takes 168).

Compiling with -O2 also does not inline fmt_fp(), and it takes 7480
bytes. So somehow -O2 manages to save sixteen bytes of stack.

If I play for all the marbles and use -O3, it still doesn't inline
fmt_fp(), and now it needs 7752 bytes of stack. That is 256 bytes more
than with -Os.

It appears as though at least gcc 8 is no longer as inline happy as it
once was.
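
For reference, the VLA workaround Rich describes above might look
roughly like the sketch below. This is not musl code: format_sketch(),
BIG_SIZE and size_scale are hypothetical names, and the 7000 is only a
stand-in for the real buffer size. The idea is that the array size
depends on a value the compiler cannot fold, so it cannot turn the
array into a fixed allocation and hoist it into the caller's frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size, standing in for fmt_fp()'s large work array. */
#define BIG_SIZE 7000

/* The volatile read keeps the compiler from folding the size to a
 * constant and turning the VLA back into a fixed-size array. */
static volatile size_t size_scale = 1;

/* Hypothetical formatter: only the floating-point path needs the big
 * buffer; every other path gets a 1-byte array. Returns the number of
 * bytes actually reserved for the work buffer. */
size_t format_sketch(int is_float, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    size_t n = is_float ? BIG_SIZE * size_scale : 1;
    char buf[n];                    /* VLA: sized at run time */

    memset(buf, is_float ? 'f' : 'i', n);

    /* Copy as much as fits, leaving room for the terminator. */
    size_t len = n < outlen - 1 ? n : outlen - 1;
    memcpy(out, buf, len);
    out[len] = '\0';

    return sizeof buf;              /* evaluated at run time for a VLA */
}
```

Calling format_sketch(0, out, sizeof out) reserves one byte for buf,
while format_sketch(1, out, sizeof out) reserves the full BIG_SIZE.
Whether a given compiler version actually keeps the small frame on the
non-float path would still have to be checked in the generated code.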

> 
> aarch64 has 128 bit fp regs, so in principle future arch extension
> may add 128bit instructions without breaking abi. (which may happen
> if aarch64 gets adoption in supercomputers, e.g. powerpc64 did that)
> 

Say, if IEEE quad is causing problems, wouldn't it be possible to
compile a tool chain with long double == double for the time being?
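
To see what a given tool chain currently does, a small check along
these lines could be compiled with it. This is only a sketch using the
standard <float.h> macros; the two function names are made up for
illustration.

```c
#include <assert.h>
#include <float.h>

/* long double can never be narrower than double in conforming C. */
static_assert(sizeof(long double) >= sizeof(double),
              "long double narrower than double");

/* Returns 1 if long double has the same precision and exponent range
 * as double on this target, i.e. long double == double. */
int long_double_is_double(void)
{
    return LDBL_MANT_DIG == DBL_MANT_DIG && LDBL_MAX_EXP == DBL_MAX_EXP;
}

/* Returns 1 if long double looks like IEEE quad (binary128):
 * 113-bit significand and a maximum exponent of 16384. */
int long_double_is_quad(void)
{
    return LDBL_MANT_DIG == 113 && LDBL_MAX_EXP == 16384;
}
```

On a tool chain built the way I am asking about, the first function
would return 1 and the second 0.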

Ciao,
Markus
