Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 19 Jun 2017 15:15:23 -0600
From: Jeff Law <law@...hat.com>
To: oss-security@...ts.openwall.com, Daniel Micay <danielmicay@...il.com>
Subject: Re: Re: Qualys Security Advisor -- The Stack Clash

On 06/19/2017 12:52 PM, Daniel Micay wrote:
> On Mon, 2017-06-19 at 11:26 -0600, Jeff Law wrote:
>> I would consider those two GCC BZs (68065, 66479) a separate an
>> distinct
>> issue.
>>
>> It is far more important to address design issues around the existing
>> -fstack-check first.  I think we've got a pretty good handle on how to
>> address those problems and discussions with the upstream GCC community
>> have already started.
>>
>> In an ideal world we'll get to a place where the new -fstack-check
>> does
>> not change program semantics, never misses probes and is efficient
>> enough to just turn on and forget everywhere.  The existing
>> -fstack-check fails all three of those criteria.
>>
>> Jeff
> 
> AFAIK, the main efficiency issue (reserving a register) was fixed for
> GCC 6. I might be missing something but it seems very cheap now, at
> least for x86_64. It definitely doesn't really work though.
> 
> Is there an example of it changing program semantics? I haven't seen
> anything since the generic arch stuff was fixed.
Absolutely -fstack-check, as currently implemented, can change program
semantics.  It's related to -fstack-check moving objects from statically
allocated space into dynamically allocated space (because the generic
code can't handle large static frames).  It creates the alloca'd objects
at the wrong scope.  This doesn't happen on all architectures, but it
does happen on architectures I have to care about.

WRT efficiency, -fstack-check is marginal -- even with its clever code
of assuming that it can elide probes into the first two pages of a
static frame (because the caller must have probed those two frames).
Consistently when we looked at code it's over-probing.  Sadly, it's
over-probing in all the wrong places (because it's trying so damn hard
to ensure there's always 2 free pages the signal handler can use).

WRT probing correctness -- -fstack-check skips probes on the assumption
that an earlier caller in the call chain should have probed those pages.
 But that's a fundamentally flawed assumption unless the entire
application is compiled with -fstack-check.  In fact, by eliding those
probes, it actually misses the most important cases in mixed environment!

-fstack-check also has the nasty habit of probing into unallocated
areas.  This tends to cause valgrind problems.  Both in the sense of
getting far too many false positives, but on two platforms the code
generated by -fstack-check actually crashes valgrind.  These issues are
directly related to -fstack-check wanting to probe all the pages before
doing any allocations.

FWIW, we initially thought we were going to be able to use -fstack-check
with some slight tweaks.  But the deeper we got into -fstack-check the
more we ended up rewriting the probe generation from scratch.  In fact,
we were unable to use the existing -fstack-check probing code from *any*
target.

jeff

Powered by blists - more mailing lists

Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.