kernel-hardening - Re: [PATCH] x86: entry: flush the cache if syscall error

Message-ID: <1539293003.3566.15.camel@linux.intel.com>
Date: Thu, 11 Oct 2018 14:23:23 -0700
From: Kristen C Accardi <kristen@...ux.intel.com>
To: Kees Cook <keescook@...omium.org>, Andy Lutomirski <luto@...nel.org>
Cc: Kernel Hardening <kernel-hardening@...ts.openwall.com>, Thomas Gleixner
 <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov
 <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
 LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86: entry: flush the cache if syscall error

On Thu, 2018-10-11 at 13:55 -0700, Kees Cook wrote:
> On Thu, Oct 11, 2018 at 1:48 PM, Andy Lutomirski <luto@...nel.org>
> wrote:
> > On Thu, Oct 11, 2018 at 11:55 AM Kristen Carlson Accardi
> > <kristen@...ux.intel.com> wrote:
> > > 
> > > This patch aims to make it harder to perform cache timing attacks
> > > on data
> > > left behind by system calls. If we have an error returned from a
> > > syscall,
> > > flush the L1 cache.
> > > 
> > > It's important to note that this patch is not addressing any
> > > specific
> > > exploit, nor is it intended to be a complete defense against
> > > anything.
> > > > It is intended to be a low cost way of eliminating some of the
> > > > side effects of a failed system call.
> > > 
> > > A performance test using sysbench on one hyperthread and a script
> > > which
> > > attempts to repeatedly access files it does not have permission
> > > to access
> > > on the other hyperthread found no significant performance impact.
> > > 
> > > Suggested-by: Alan Cox <alan@...ux.intel.com>
> > > Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
> > > ---
> > >  arch/x86/Kconfig        |  9 +++++++++
> > >  arch/x86/entry/common.c | 18 ++++++++++++++++++
> > >  2 files changed, 27 insertions(+)
> > > 
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index 1a0be022f91d..bde978eb3b4e 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -445,6 +445,15 @@ config RETPOLINE
> > >           code are eliminated. Since this includes the syscall
> > > entry path,
> > >           it is not entirely pointless.
> > > 
> > > +config SYSCALL_FLUSH
> > > +       bool "Clear L1 Cache on syscall errors"
> > > +       default n
> > > +       help
> > > +         Selecting 'y' allows the L1 cache to be cleared upon
> > > return of
> > > +         an error code from a syscall if the CPU supports
> > > "flush_l1d".
> > > > +         This may reduce the likelihood of speculative execution
> > > style
> > > +         attacks on syscalls.
> > > +
> > >  config INTEL_RDT
> > >         bool "Intel Resource Director Technology support"
> > >         default n
> > > diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> > > index 3b2490b81918..26de8ea71293 100644
> > > --- a/arch/x86/entry/common.c
> > > +++ b/arch/x86/entry/common.c
> > > @@ -268,6 +268,20 @@ __visible inline void
> > > syscall_return_slowpath(struct pt_regs *regs)
> > >         prepare_exit_to_usermode(regs);
> > >  }
> > > 
> > > +__visible inline void l1_cache_flush(struct pt_regs *regs)
> > > +{
> > > +       if (IS_ENABLED(CONFIG_SYSCALL_FLUSH) &&
> > > +           static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
> > > > +               if (regs->ax == 0 || regs->ax == -EAGAIN ||
> > > > +                   regs->ax == -EEXIST || regs->ax == -ENOENT ||
> > > > +                   regs->ax == -EXDEV || regs->ax == -ETIMEDOUT ||
> > > > +                   regs->ax == -ENOTCONN || regs->ax == -EINPROGRESS)
> > 
> > What about ax > 0?  (Or more generally, any ax outside the range of
> > -1
> > .. -4095 or whatever the error range is.)  As it stands, it looks
> > like
> > you'll flush on successful read(), write(), recv(), etc, and that
> > could seriously hurt performance on real workloads.
> 
> Seems like just changing "ax == 0" to "ax >= 0" would
> solve that?

Thanks, will do.
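
For illustration only, here is a minimal userspace sketch of what the
adjusted check might look like. It is not the patch itself:
skip_l1d_flush() is a hypothetical helper name, and the sketch assumes
the raw return value arrives as an unsigned long, as it does in struct
pt_regs. Under that assumption a literal "ax >= 0" comparison would
always be true, so the value is viewed as signed first.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the posted patch): returns true when
 * the L1D flush should be skipped for a raw syscall return value.
 */
static bool skip_l1d_flush(unsigned long ax)
{
	/* Signed view: comparing an unsigned value with ">= 0" is always true. */
	long ret = (long)ax;

	/* Success, including positive byte counts from read()/write(). */
	if (ret >= 0)
		return true;

	/* Error codes the patch treats as benign and does not flush on. */
	switch (ret) {
	case -EAGAIN:
	case -EEXIST:
	case -ENOENT:
	case -EXDEV:
	case -ETIMEDOUT:
	case -ENOTCONN:
	case -EINPROGRESS:
		return true;
	default:
		return false;	/* any other error: flush */
	}
}
```

With this shape, a successful read() returning a byte count skips the
flush, while something like -EACCES or -EPERM still triggers it.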

> 
> I think this looks like a good idea. It might be worth adding a
> comment about the checks to explain why those errors are whitelisted.
> It's a cheap and effective mitigation for "unknown future problems"
> that doesn't degrade normal workloads.
> 
> > > +                       return;
> > > +
> > > +               wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
> 
> What about CPUs without FLUSH_L1D? Could it be done manually with a
> memcpy or something?

It could. My original implementation (before the L1D flush MSR existed)
did, but at some additional cost: I allocated per-cpu memory to keep a
32K buffer around that I could memcpy. It also sacrificed completeness
for simplicity by not handling cases where the L1 was not 32K. As far
as I know, this MSR is pretty widely deployed, even on older hardware.
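
As a rough sketch of the buffer-based idea (not my original code, and
not a guaranteed flush), the following userspace C touches one byte per
cache line across a buffer sized to the L1. The 32K size and 64-byte
line are assumptions, the static array stands in for the per-cpu
allocation, and l1d_displace() is a hypothetical name. Whether a single
pass actually evicts everything depends on the cache's replacement
policy, which is part of why the MSR is preferable.

```c
#include <stddef.h>
#include <stdint.h>

#define L1D_SIZE	(32 * 1024)	/* assumption: 32K L1 data cache */
#define CACHE_LINE	64		/* assumption: 64-byte cache lines */

/* Stand-in for the per-cpu buffer; a real kernel version would differ. */
static uint8_t flush_buf[L1D_SIZE];

/*
 * Hypothetical software fallback: read one byte from every cache line
 * of the buffer so that its contents displace older L1D data.
 * Returns the number of lines touched.
 */
static size_t l1d_displace(void)
{
	volatile uint8_t *p = flush_buf;	/* volatile keeps the loads */
	size_t lines = 0;

	for (size_t off = 0; off < L1D_SIZE; off += CACHE_LINE) {
		(void)p[off];
		lines++;
	}
	return lines;
}
```

The hard-coded size is exactly the simplification mentioned above: on a
part whose L1 is not 32K, this buffer would be too small or needlessly
large.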