kernel-hardening - Re: [PATCH 00/22] add support for Clang LTO

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200625085745.GD117543@hirez.programming.kicks-ass.net>
Date: Thu, 25 Jun 2020 10:57:45 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Nick Desaulniers <ndesaulniers@...gle.com>
Cc: Sami Tolvanen <samitolvanen@...gle.com>,
	Masahiro Yamada <masahiroy@...nel.org>,
	Will Deacon <will@...nel.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Kees Cook <keescook@...omium.org>,
	clang-built-linux <clang-built-linux@...glegroups.com>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	linux-arch <linux-arch@...r.kernel.org>,
	Linux ARM <linux-arm-kernel@...ts.infradead.org>,
	Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, linux-pci@...r.kernel.org,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>
Subject: Re: [PATCH 00/22] add support for Clang LTO

On Thu, Jun 25, 2020 at 10:24:33AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote:

> > I'm sure Will will respond, but the basic issue is the trainwreck C11
> > made of dependent loads.
> > 
> > Anyway, here's a link to the last time this came up:
> > 
> >   https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/
> 
> Another good read:
> 
>   https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/
> 
> and having (partially) re-read that, I now worry intensily about things
> like latch_tree_find(), cyc2ns_read_begin, __ktime_get_fast_ns().
> 
> It looks like kernel/time/sched_clock.c uses raw_read_seqcount() which
> deviates from the above patterns by, for some reason, using a primitive
> that includes an extra smp_rmb().
> 
> And this is just the few things I could remember off the top of my head,
> who knows what else is out there.

As an example, let us consider __ktime_get_fast_ns(), the critical bit
is:

		seq = raw_read_seqcount_latch(&tkf->seq);
		tkr = tkf->base + (seq & 0x01);
		now = tkr->base;

And we hard rely on that being a dependent load, so:

  LOAD	seq, (tkf->seq)
  LOAD  tkr, tkf->base
  AND   seq, 1
  MUL   seq, sizeof(tk_read_base)
  ADD	tkr, seq
  LOAD  now, (tkr->base)

Such that we obtain 'now' as a direct dependency on 'seq'. This ensures
the loads are ordered.

A compiler can wreck this by translating it into something like:

  LOAD	seq, (tkf->seq)
  LOAD  tkr, tkf->base
  AND   seq, 1
  CMP	seq, 0
  JE	1f
  ADD	tkr, sizeof(tk_read_base)
1:
  LOAD  now, (tkr->base)

Because now the machine can speculate and load now before seq, breaking
the ordering.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.