Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 01 Mar 2018 10:33:05 +0000
From: Richard Sandiford <richard.sandiford@....com>
To: Alexander Popov <alex.popov@...ux.com>
Cc: Will Deacon <will.deacon@....com>,  Laura Abbott <labbott@...hat.com>,  Kees Cook <keescook@...omium.org>,  Mark Rutland <mark.rutland@....com>,  Ard Biesheuvel <ard.biesheuvel@...aro.org>,  <linux-kernel@...r.kernel.org>,  <linux-arm-kernel@...ts.infradead.org>,  <kernel-hardening@...ts.openwall.com>
Subject: Re: [PATCH 1/2] stackleak: Update for arm64

Alexander Popov <alex.popov@...ux.com> writes:
> On 27.02.2018 13:21, Richard Sandiford wrote:
>> Hi Alexander,
>> 
>> Sorry for the slow reply, been caught up in an office move.
>
> Thank you very much for the review, Richard!
>
>> Alexander Popov <alex.popov@...ux.com> writes:
>>> Would you be so kind to take a look at the whole STACKLEAK plugin?
>>> http://www.openwall.com/lists/kernel-hardening/2018/02/16/4
>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9
>>>
>>> It's not very big. I documented it in detail. I would be really
>>> grateful for the
>>> review!
>> 
>> Looks good to me FWIW.  Just a couple of minor things:
>> 
>>> +	/*
>>> +	 * 1. Loop through the GIMPLE statements in each of cfun basic blocks.
>>> +	 * cfun is a global variable which represents the function that is
>>> +	 * currently processed.
>>> +	 */
>>> +	FOR_EACH_BB_FN(bb, cfun) {
>>> +		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
>>> +			gimple stmt;
>>> +
>>> +			stmt = gsi_stmt(gsi);
>>> +
>>> +			/* Leaf function is a function which makes no calls */
>>> +			if (is_gimple_call(stmt))
>>> +				is_leaf = false;
>> 
>> It's probably not going to matter in practice, but it might be worth
>> emphasising in the comments that this leafness is only approximate.
>
> That's important, thank you! May I ask why you think it's not going to matter in
> practice?

I just thought the kind of calls it misses are going to have very
shallow frames, but from what you said later I guess that isn't the
point.  It also might be a bit too hand-wavy for something like this :-)

>> It will sometimes be a false positive, because we could still
>> end up creating calls to libgcc functions from non-call statements
>> (or for target-specific reasons).  It can also be a false negative,
>> since call statements can be to built-in or internal functions that
>> end up being open-coded.
>
> Oh, that raises the question: how does this leafness inaccuracy affect in my
> particular case?
>
> is_leaf is currently used for finding the special cases to skip the
> track_stack() call insertion:
>
> /*
>  * Special cases to skip the instrumentation.
>  *
>  * Taking the address of static inline functions materializes them,
>  * but we mustn't instrument some of them as the resulting stack
>  * alignment required by the function call ABI will break other
>  * assumptions regarding the expected (but not otherwise enforced)
>  * register clobbering ABI.
>  *
>  * Case in point: native_save_fl on amd64 when optimized for size
>  * clobbers rdx if it were instrumented here.
>  *
>  * TODO: any more special cases?
>  */
> if (is_leaf &&
>     !TREE_PUBLIC(current_function_decl) &&
>     DECL_DECLARED_INLINE_P(current_function_decl)) {
> 	return 0;
> }
>
>
> And now it seems to me that the stackleak plugin should not instrument all
> static inline functions, regardless of is_leaf. Do you agree?

OK.  I'd missed that this was just a heuristic to detect certain kinds
of linux function, so it's probably fine as it is.

Not sure whether it's safe to punt for general static inline functions.
E.g. couldn't you have a static inline function that just provides a
more convenient interface to another function?  But I guess it's a
linux-specific heuristic, so I can't really say.

TBH the paravirt save_fl stuff seems like dancing on the edge,
but that's another story. :-)

>>> +	/*
>>> +	 * The stackleak_final pass should be executed before the "final" pass,
>>> +	 * which turns the RTL (Register Transfer Language) into assembly.
>>> +	 */
>>> +	PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE);
>> 
>> This might be too late, since it happens e.g. after addresses have
>> been calculated for branch ranges, and after machine-specific passes
>> (e.g. bundling on ia64).
>> 
>> The stack size is final after reload, so inserting the pass after that
>> might be better.
>
> Thanks for that notice. May I ask for the additional clarification?
>
> I specified -fdump-passes and see a lot of passes between reload and final:
>   ...
>   rtl-sched1                                       :  OFF
>   rtl-ira                                          :  ON
>   rtl-reload                                       :  ON
>   rtl-vzeroupper                                   :  OFF
>   *all-postreload                                  :  OFF
>      rtl-postreload                                :  OFF
>      rtl-gcse2                                     :  OFF
>      rtl-split2                                    :  ON
>      rtl-ree                                       :  ON
>      rtl-cmpelim                                   :  OFF
>      rtl-btl1                                      :  OFF
>      rtl-pro_and_epilogue                          :  ON
>      rtl-dse2                                      :  ON
>      rtl-csa                                       :  ON
>      rtl-jump2                                     :  ON
>      rtl-compgotos                                 :  ON
>      rtl-sched_fusion                              :  OFF
>      rtl-peephole2                                 :  ON
>      rtl-ce3                                       :  ON
>      rtl-rnreg                                     :  OFF
>      rtl-cprop_hardreg                             :  ON
>      rtl-rtl_dce                                   :  ON
>      rtl-bbro                                      :  ON
>      rtl-btl2                                      :  OFF
>      *leaf_regs                                    :  ON
>      rtl-split4                                    :  ON
>      rtl-sched2                                    :  ON
>      *stack_regs                                   :  ON
>         rtl-split3                                 :  OFF
>         rtl-stack                                  :  ON
>   *all-late_compilation                            :  OFF
>      rtl-alignments                                :  ON
>      rtl-vartrack                                  :  ON
>      *free_cfg                                     :  ON
>      rtl-mach                                      :  ON
>      rtl-barriers                                  :  ON
>      rtl-dbr                                       :  OFF
>      rtl-split5                                    :  OFF
>      rtl-eh_ranges                                 :  OFF
>      rtl-shorten                                   :  ON
>      rtl-nothrow                                   :  ON
>      rtl-dwarf2                                    :  ON
>      rtl-stackleak_final                           :  ON
>      rtl-final                                     :  ON
>   rtl-dfinish                                      :  ON
> clean_state                                        :  ON
>
> Where exactly would you recommend me to insert the stackleak_final pass, which
> removes the unneeded track_stack() calls?

Directly after rtl-reload seems best.  That's the first point at which
the frame size is final, and reload is one of the few rtl passes that
always runs.  Doing it there could also help with things like shrink
wrapping (part of rtl-pro_and_epilogue).

Thanks,
Richard

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.