Date: Fri, 23 Oct 2020 01:24:14 +0300
From: Topi Miettinen <toiwoton@...il.com>
To: Kees Cook <keescook@...omium.org>
Cc: Szabolcs Nagy <szabolcs.nagy@....com>,
 Jeremy Linton <jeremy.linton@....com>,
 "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
 libc-alpha@...rceware.org, systemd-devel@...ts.freedesktop.org,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 Mark Rutland <mark.rutland@....com>, Mark Brown <broonie@...nel.org>,
 Dave Martin <dave.martin@....com>,
 Catalin Marinas <Catalin.Marinas@....com>,
 Will Deacon <will.deacon@....com>,
 Salvatore Mesoraca <s.mesoraca16@...il.com>,
 kernel-hardening@...ts.openwall.com, linux-hardening@...r.kernel.org
Subject: Re: BTI interaction between seccomp filters in systemd and glibc
 mprotect calls, causing service failures

On 22.10.2020 23.02, Kees Cook wrote:
> On Thu, Oct 22, 2020 at 01:39:07PM +0300, Topi Miettinen wrote:
>> But I think SELinux has a more complete solution (execmem) which can track
>> the pages better than is possible with seccomp solution which has a very
>> narrow field of view. Maybe this facility could be made available to
>> non-SELinux systems, for example with prctl()? Then the in-kernel MDWX could
>> allow mprotect(PROT_EXEC | PROT_BTI) in case the backing file hasn't been
>> modified, the source filesystem isn't writable for the calling process and
>> the file descriptor isn't created with memfd_create().
>
> Right. The problem here is that systemd is attempting to mediate a
> state change using only syscall details (i.e. with seccomp) instead of
> a stateful analysis. Using a MAC is likely the only sane way to do that.
> SELinux is a bit difficult to adjust "on the fly" the way systemd would
> like to do things, and the more dynamic approach seen with SARA isn't
> yet in the kernel.

SARA looks interesting.
What is missing is a prctl() to enable all W^X protections irrevocably
for the current process; then systemd could enable it for services with
MemoryDenyWriteExecute=yes.

I also didn't see specific measures against memfd_create() or file
system W&X, but perhaps those can be added later. Maybe pkey_mprotect()
is not handled either, unless it uses the same LSM hook as mprotect().

> Trying to enforce memory W^X protection correctly
> via seccomp isn't really going to work well, as far as I can see.

Not in general, but I think it can work well in the context of system
services. Then you can ensure that for a specific service,
memfd_create() is blocked by seccomp and the file systems are W^X
because of mount namespaces etc., so there should not be any means to
construct arbitrary executable pages.

-Topi
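
The difference between the two approaches discussed in this thread can be
illustrated with a small model. This is only a sketch under stated
assumptions, not the systemd filter or any kernel code: the names
stateless_denies(), stateful_denies() and Mapping are hypothetical, the
stateless rule assumes a seccomp-style check that rejects any mprotect()
requesting PROT_EXEC, and the stateful rule is one possible policy that
also knows the current protection of the mapping and whether it was ever
writable. The PROT_* values are the Linux ones, with PROT_BTI taken from
the arm64 headers.

```python
from dataclasses import dataclass

# Linux protection bits; PROT_BTI is arm64-specific (assumed value 0x10).
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4
PROT_BTI = 0x10


def stateless_denies(prot: int) -> bool:
    """Hypothetical seccomp-style rule: only the syscall arguments are
    visible, so any mprotect() that asks for PROT_EXEC is rejected."""
    return bool(prot & PROT_EXEC)


@dataclass
class Mapping:
    """Hypothetical per-mapping state that a stateful policy could track."""
    cur_prot: int        # protection bits currently set on the mapping
    ever_writable: bool  # whether the mapping has ever been writable


def stateful_denies(mapping: Mapping, prot: int) -> bool:
    """Hypothetical stateful rule: reject W+X requests, and reject gaining
    PROT_EXEC on a mapping that has been writable at some point."""
    if (prot & PROT_WRITE) and (prot & PROT_EXEC):
        return True
    gains_exec = bool(prot & PROT_EXEC) and not (mapping.cur_prot & PROT_EXEC)
    return gains_exec and mapping.ever_writable


# Already-executable, never-writable text segment getting BTI enabled.
text = Mapping(cur_prot=PROT_READ | PROT_EXEC, ever_writable=False)
bti_request = PROT_READ | PROT_EXEC | PROT_BTI

# Formerly writable anonymous memory being turned executable.
heap = Mapping(cur_prot=PROT_READ | PROT_WRITE, ever_writable=True)
exec_request = PROT_READ | PROT_EXEC

print(stateless_denies(bti_request))        # True: BTI re-protection is blocked
print(stateful_denies(text, bti_request))   # False: no new exec rights gained
print(stateful_denies(heap, exec_request))  # True: written pages stay non-exec
```

In this model the stateless rule cannot tell the harmless BTI case apart
from the dangerous one, because both requests carry PROT_EXEC and nothing
else about the mapping is visible to it.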