Message-ID: <20250529171556.GA9260@localhost.localdomain>
Date: Thu, 29 May 2025 17:17:08 +0000
From: Qualys Security Advisory <qsa@...lys.com>
To: "oss-security@...ts.openwall.com" <oss-security@...ts.openwall.com>
Subject: Local information disclosure in apport and systemd-coredump


Qualys Security Advisory

Local information disclosure in apport and systemd-coredump
(CVE-2025-5054 and CVE-2025-4598)


========================================================================
Contents
========================================================================

Summary
Mitigation
Local information disclosure in apport (CVE-2025-5054)
- Background
- Analysis
- Proof of concept
Local information disclosure in systemd-coredump (CVE-2025-4598)
- Background
- Analysis
- Proof of concept
Acknowledgments
Timeline


========================================================================
Summary
========================================================================

We discovered a vulnerability in apport (Ubuntu's core-dump handler),
and a similar vulnerability in systemd-coredump (which is the default
core-dump handler on Red Hat Enterprise Linux 9 and Fedora, for example):
a race condition that allows a local attacker to crash a SUID program
and gain read access to the resulting core dump (by quickly replacing
the crashed SUID process with another process, before its /proc/pid/
files are analyzed by the vulnerable core-dump handler).

We developed two proofs of concept for these vulnerabilities (one for
Ubuntu 24.04, and one for Fedora 40 and 41, but other distributions are
probably also vulnerable and exploitable): they allow a local attacker
to obtain the contents of /etc/shadow (password hashes) from the core
dump of a crashed unix_chkpwd process (unix_chkpwd is a SUID or SGID
program that is installed by default on most Linux distributions).

Last-minute update: while working on these vulnerabilities, we
eventually realized that systemd-coredump does not specify %d (the
kernel's per-process "dumpable" flag) in /proc/sys/kernel/core_pattern;
consequently a local attacker can crash (with kill(SIGSEGV) for example)
root daemons that fork() and setuid() to the attacker's uid, gain read
access to the resulting core dumps, and therefore to the root daemons'
memory. For example, we wrote a trivial proof of concept that dumps the
memory of OpenSSH's sshd-session, systemd's sd-pam, and the cron daemon,
and obtained secret information such as half of sshd's private ed25519
host key, password hashes from /etc/shadow, other users' crontabs, ASLR
addresses, stack canaries. This second attack (against root daemons) is
powerful, different from the first attack (against SUID programs), and
can certainly be further improved; and other secrets can certainly be
obtained from other daemons, but this is left as an exercise for the
interested reader.

The fix for these vulnerabilities is twofold:

- always take account of the kernel's per-process "dumpable" flag (the
  %d specifier), in every code path, to decide whether a non-root user
  should be given read access to a core dump or not;

- use the new %F specifier in /proc/sys/kernel/core_pattern (a pidfd to
  the crashed process), which was implemented during this coordinated
  vulnerability disclosure, to detect whether the crashed process was
  replaced or not with another process, before its analysis; for more
  information:

  https://lore.kernel.org/all/20250414-work-coredump-v2-0-685bf231f828@kernel.org/
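
As an illustration of the first point, here is a minimal sketch (not
taken from apport or systemd) that lists the %-specifiers present in a
core_pattern string, so that an administrator can see whether %d and %F
are passed to the handler. The function names and the example handler
path are hypothetical.

```python
def core_pattern_specifiers(pattern: str) -> set[str]:
    """Return the set of %-specifier letters used in a core_pattern string."""
    found = set()
    i = 0
    while i < len(pattern):
        if pattern[i] == "%" and i + 1 < len(pattern):
            letter = pattern[i + 1]
            if letter != "%":  # "%%" is a literal percent sign
                found.add(letter)
            i += 2
        else:
            i += 1
    return found


def read_core_pattern(path: str = "/proc/sys/kernel/core_pattern") -> str:
    """Read the live core_pattern (Linux only)."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Hypothetical handler line, for illustration only.
    example = "|/usr/lib/example-handler %P %u %g %s %t %d %F"
    specs = core_pattern_specifiers(example)
    print("passes dumpable flag (%d):", "d" in specs)
    print("passes pidfd (%F):", "F" in specs)
```

Seeing %d and %F in the pattern only shows that the kernel passes these
values; whether the handler actually honors them must be checked in the
handler itself.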


========================================================================
Mitigation
========================================================================

To mitigate these vulnerabilities, /proc/sys/fs/suid_dumpable can be set
to 0 (SUID_DUMP_DISABLE, "No setuid dumping"). The drawback is that no
SUID program, and no root daemon that drops privileges, is analyzed when
it crashes; but this setting can act as a temporary fix if the
vulnerable core-dump handler itself cannot be patched immediately.
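
A minimal sketch for checking this setting, assuming a Linux system
with /proc mounted; the helper names are hypothetical, and the three
modes are those documented for fs.suid_dumpable in the kernel's sysctl
documentation.

```python
SUID_DUMPABLE_MODES = {
    0: "SUID_DUMP_DISABLE: no core dump for processes that changed credentials",
    1: "SUID_DUMP_USER: all processes dump core when possible (debug)",
    2: "SUID_DUMP_ROOT: such processes dump core readable by root only (suidsafe)",
}


def describe_suid_dumpable(raw: str) -> str:
    """Translate the contents of /proc/sys/fs/suid_dumpable into text."""
    value = int(raw.strip())
    return SUID_DUMPABLE_MODES.get(value, f"unknown value {value}")


def is_mitigated(raw: str) -> bool:
    """True if the setting matches the mitigation described above (0)."""
    return int(raw.strip()) == 0


if __name__ == "__main__":
    try:
        with open("/proc/sys/fs/suid_dumpable") as f:
            raw = f.read()
        print(describe_suid_dumpable(raw))
        print("mitigation in place:", is_mitigated(raw))
    except OSError as exc:
        print("cannot read suid_dumpable:", exc)
```

Changing the value requires root, for example with
"sysctl -w fs.suid_dumpable=0".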


========================================================================
Local information disclosure in apport (CVE-2025-5054)
========================================================================

------------------------------------------------------------------------
Background
------------------------------------------------------------------------

After our discovery of three bypasses in Ubuntu's unprivileged user
namespace restrictions, we decided to look for a real-world example of a
vulnerability that requires a user namespace with full capabilities. One
perfectly obvious example would be a kernel vulnerability that requires
CAP_SYS_ADMIN or CAP_NET_ADMIN, but finding and exploiting such a kernel
vulnerability would most likely take us months, so we decided to look
for a simple userland vulnerability instead.

One target that immediately came to mind is apport, Ubuntu's core-dump
handler, because it suffered from multiple vulnerabilities related to
namespaces (containers) in the past; for example, the following
excellent write-ups by Tavis Ormandy and Sander Bos:

- CVE-2015-1318: https://www.openwall.com/lists/oss-security/2015/04/14/4
- CVE-2017-14180: https://bugs.launchpad.net/ubuntu/+source/apport/+bug/1726372
- CVE-2019-11483: https://bugs.launchpad.net/apport/+bug/1839420

But as soon as we started to read apport's source code, we realized that
it has been considerably hardened over the years:

- The most common attack vector against apport, which consisted of
  tricking apport into dumping an attacker-controlled core file into a
  root-owned directory such as /etc/sudoers.d/ or /etc/logrotate.d/, has
  been completely eradicated: apport now dumps all core files into a
  hard-coded directory (/var/lib/apport/coredump/ by default).

- The race condition that allows a local attacker to replace a crashed
  process with another process, before its /proc/pid/ files are analyzed
  but after apport has started, has been largely mitigated in apport (by
  thorough security checks in its consistency_checks() function).

To further detail this last point: perhaps surprisingly, a local
attacker can send a SIGKILL signal to an already-crashed process, thus
allowing the attacker to recycle the crashed process's pid (by creating
many new processes until the crashed-and-killed process's pid is reused)
and tricking apport into analyzing the /proc/pid/ files of the wrong
process. This race condition has been exploited several times in the
past; for example, the following outstanding write-ups by Philip
Pettersson, Kevin Backhouse, Ryota Shiga, and Itai Greenhut:

- CVE-2015-1325: https://www.openwall.com/lists/oss-security/2015/05/21/10
- CVE-2019-15790: https://github.blog/security/vulnerability-research/ubuntu-apport-pid-recycling-security-vulnerability-cve-2019-15790/
- CVE-2020-15702: https://flatt.tech/research/posts/race-condition-vulnerability-in-handling-of-pid-by-apport/
- CVE-2021-25684: https://alephsecurity.com/2021/02/16/apport-lpe/

But as mentioned earlier, this race condition has now been largely
mitigated in apport, by:

- immediately open()ing a file descriptor to the crashed process's
  /proc/pid/ directory and accessing all the files in this directory
  through this file descriptor and the *at() syscalls (openat() etc);

- checking that the starttime in /proc/pid/stat is earlier than the
  starttime of apport itself (i.e., ensuring that an attacker has not
  replaced the crashed process with another process, after apport has
  started);

- double-checking that the real Uid and Gid in /proc/pid/status still
  match the real uid and gid of the crashed process at the time of its
  crash.
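
The first two checks can be sketched as follows. This is a simplified
illustration, not apport's code: the function names are hypothetical,
and the only facts relied upon are that starttime is field 22 of
/proc/pid/stat (see proc(5)) and that the comm field (field 2) may
itself contain spaces and parentheses.

```python
import os


def read_stat_via_dirfd(pid: int) -> str:
    """Open /proc/<pid> once, then read "stat" relative to that directory fd."""
    dir_fd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    try:
        fd = os.open("stat", os.O_RDONLY, dir_fd=dir_fd)  # openat() under the hood
        with os.fdopen(fd) as f:
            return f.read()
    finally:
        os.close(dir_fd)


def parse_starttime(stat_line: str) -> int:
    """Extract starttime (field 22, in clock ticks since boot)."""
    # Split after the LAST ')' because comm can contain ')' and spaces.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so field 22 is rest[19].
    return int(rest[19])


def started_before(crashed_stat: str, handler_stat: str) -> bool:
    """True if the analyzed process is older than the handler process."""
    return parse_starttime(crashed_stat) < parse_starttime(handler_stat)


if __name__ == "__main__":
    try:
        print("own starttime:", parse_starttime(read_stat_via_dirfd(os.getpid())))
    except OSError as exc:
        print("cannot read /proc:", exc)
```

Holding the directory file descriptor ensures that every later read
refers to the same process entry, even if the pid is recycled
afterwards.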

Last-minute note: we eventually verified that the starttime check is in
fact useless, from a security point of view; an attacker can replace the
crashed process with another process even before apport starts, and with
the right timing can still give the kernel enough time to generate the
core dump of the originally crashed process.

------------------------------------------------------------------------
Analysis
------------------------------------------------------------------------

Unfortunately, while reading apport's code we noticed that the function
that handles crashes inside namespaces (_check_global_pid_and_forward(),
at line 769) is called before the aforementioned security checks are run
(in consistency_checks(), at line 951); in other words, an attacker can
trick apport's _check_global_pid_and_forward() into analyzing the wrong
process, while the kernel still sends the core dump of the originally
crashed process to apport (over its file descriptor 0, stdin):

------------------------------------------------------------------------
 750 def main(args: list[str]) -> int:
 ...
 769     if _check_global_pid_and_forward(options):
 770         return 0
 ...
 775         return process_crash_from_kernel(options)
------------------------------------------------------------------------
 921 def process_crash_from_kernel(options: argparse.Namespace) -> int:
 ...
 924             return process_crash_from_kernel_with_proc_pid(options, proc_pid)
------------------------------------------------------------------------
 941 def process_crash_from_kernel_with_proc_pid(
 942     options: argparse.Namespace, proc_pid: ProcPid
 943 ) -> int:
 ...
 951     if not consistency_checks(options, process_start, proc_pid, real_user):
 952         return 0
------------------------------------------------------------------------

And so an attack idea against apport began to form in our mind:

a/ first, we fork() a new process and execve() a SUID or SGID program,
and wait until it loads secret information into its memory (for example,
password hashes from /etc/shadow);

b/ second, we crash this process at the right time, by kill()ing it with
a core-dumping signal such as SIGSEGV or SIGSYS, thus causing the kernel
to create a new apport process to analyze this crash;

c/ then, after apport has started but before it analyzes the crashed
process's /proc/pid/ files, we SIGKILL the crashed process and quickly
replace it with another process that is not SUID or SGID, but that is
running inside a user, mount, and pid namespace (to pass the tests at
lines 726-727, below);

(note: naturally, we use one of our bypasses in Ubuntu's unprivileged
user namespace restrictions to create this namespace)

d/ as a result, apport connects to the Unix socket /run/apport.socket
inside our mount namespace (at lines 521-584, below) and sends us its
file descriptor 0, from where we can read the kernel-generated core dump
of the originally crashed process, and hence the secret information from
the memory of the SUID or SGID program (for example, password hashes).

------------------------------------------------------------------------
 712 def _check_global_pid_and_forward(options: argparse.Namespace) -> bool:
 ...
 726         if not is_same_ns(options.global_pid, "mnt"):
 727             if not is_same_ns(options.global_pid, "pid"):
 728                 forward_crash_to_container(options)
 729                 return True
------------------------------------------------------------------------
 509 def forward_crash_to_container(
 510     options: argparse.Namespace, coredump_fd: int = 0, has_cap_sys_admin: bool = True
 511 ) -> None:
 ...
 521     proc_host_pid_fd = os.open(
 522         f"/proc/{options.global_pid}", os.O_RDONLY | os.O_PATH | os.O_DIRECTORY
 523     )
 ...
 531         sock_fd = os.open(
 532             "root/run/apport.socket", os.O_RDONLY | os.O_PATH, dir_fd=proc_host_pid_fd
 533         )
 ...
 584             sock.connect(f"/proc/self/fd/{sock_fd}")
------------------------------------------------------------------------
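
For readers unfamiliar with namespace comparisons, the following is a
minimal sketch of what a check such as is_same_ns() could look like; it
is an assumption about the general technique (comparing the
/proc/pid/ns/ symlink targets), not a copy of apport's implementation,
and same_namespace() is a hypothetical name.

```python
import os


def same_namespace(pid: int, ns: str) -> bool:
    """Compare the namespace of `pid` with our own, for ns in {"mnt", "pid", ...}.

    Each /proc/<pid>/ns/<ns> entry is a symlink whose target looks like
    "mnt:[4026531840]"; two processes share a namespace if and only if
    the targets are identical.
    """
    try:
        mine = os.readlink(f"/proc/self/ns/{ns}")
        theirs = os.readlink(f"/proc/{pid}/ns/{ns}")
    except OSError:
        # Assumption of this sketch: if the namespace cannot be
        # determined, treat it as "same", so that nothing is forwarded.
        return True
    return mine == theirs


if __name__ == "__main__":
    for ns in ("mnt", "pid"):
        print(ns, same_namespace(os.getpid(), ns))
```

Such a comparison is only meaningful if /proc/pid/ still belongs to the
process that actually crashed, which is precisely what the race
condition described above undermines.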

To put this theoretical attack idea into practice, we must solve three
major problems:

1/ In step c/ we must SIGKILL the crashed process long before we can
read any information from the file descriptor 0 that apport sends to us.
This file descriptor 0 is the read end of a pipe whose internal 64KB
buffer is filled by the kernel with the beginning of the crashed
process's core dump, before we SIGKILL it.

The question, then, is: can we find a SUID or SGID program whose ELF
segments and heap fit into the pipe's internal 64KB buffer, and whose
heap contains secret information?

Luckily we found unix_chkpwd, a small (~31KB) SUID-root or SGID-shadow
program that is used by PAM to verify the password of a user, and which
therefore loads the contents of /etc/shadow (password hashes) into its
heap.

Last-minute note: while drafting this advisory, we realized that it
might be possible to use /proc/pid/coredump_filter to exclude the ELF
segments from the program's core dump, which might make it possible to
attack larger SUID programs such as su or sudo; this is left as an
exercise for the interested reader.
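
The 64KB figure in problem 1/ is the default capacity of a Linux pipe
(see pipe(7)). A minimal sketch that measures this capacity
empirically, by filling a non-blocking pipe until the kernel refuses
further writes; pipe_capacity() is a hypothetical helper name.

```python
import os


def pipe_capacity(chunk_size: int = 4096) -> int:
    """Return how many bytes fit into a fresh pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(write_fd, False)
        chunk = b"\0" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe's internal buffer is full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe capacity in bytes:", pipe_capacity())
```

On a default Linux configuration this is expected to report 65536
bytes, but the value can differ (for example if the per-user pipe
limits in /proc/sys/fs/ are reached).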

2/ In step b/ we must win a first race condition: we must crash the SUID
process "at the right time", with a SIGSEGV or SIGSYS for example. If we
crash it too early, then the password hashes from /etc/shadow are not
loaded into the heap yet; if we crash it too late, then the password
hashes in the heap are already overwritten with other information.

Ideally, to reliably win this race condition, we should add an
IN_CLOSE_NOWRITE watch on /etc/shadow, which would allow us to crash the
SUID process as soon as the contents of /etc/shadow are loaded into its
heap. Unfortunately, we cannot add such a watch, because /etc/shadow is
not readable by us. As a makeshift solution, we add an IN_CLOSE_NOWRITE
watch on /etc/passwd instead, which is opened and closed immediately
before /etc/shadow.

With our proof of concept, we almost always obtain some password hashes
from unix_chkpwd's heap, and from time to time we also obtain the entire
contents of /etc/shadow. In any case, we can simply re-execute our proof
of concept until we obtain the password hash that we are looking for,
and we believe that the reliability of this step b/ can still be
improved.

3/ In step c/ we must win a second race condition: we must SIGKILL the
crashed SUID process and "quickly" replace it with a non-SUID namespaced
process (before apport calls its _check_global_pid_and_forward()). If we
SIGKILL it too early, then the kernel does not have enough time to write
the beginning of the crashed process's core dump into apport's file
descriptor 0; if we SIGKILL it too late, then apport analyzes the
crashed process's /proc/pid/ (instead of our namespaced process's
/proc/pid/) and therefore does not send us its file descriptor 0.

In our experiments, and depending on the test machines, it takes between
1 and 4 minutes to replace the crashed SUID process with another process
(i.e., to recycle its pid) because /proc/sys/kernel/pid_max is 4M (2^22)
nowadays, not 32K (note: we call clone() with most of the CLONE_* flags
to create new processes, to minimize the work done by the kernel; it
would take 3 to 6 times longer if we were simply calling fork()).

Consequently, we cannot just "quickly" replace the crashed SUID process
in step c/; instead:

- in step b/ we do not immediately crash the SUID process (with SIGSEGV
  or SIGSYS), but we first SIGSTOP it, then create ~4M processes until
  their pids wrap around and almost reach the pid of the SUID process,
  and finally we crash (SIGSEGV or SIGSYS) and resume (SIGCONT) the SUID
  process;

- in step c/ we SIGKILL the crashed SUID process, and quickly create a
  mere handful of namespaced processes until their pids reach and reuse
  the pid of the crashed-and-killed SUID process.
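
The timing figures in problem 3/ can be checked with simple arithmetic.
The sketch below derives the implied process-creation rate from the
numbers given above (a pid space of 2^22, traversed in 1 to 4 minutes);
the comparison with a 32K pid space at the same rates is our own
extrapolation, not a measurement.

```python
PID_MAX_MODERN = 2 ** 22  # 4M, as stated above
PID_MAX_LEGACY = 2 ** 15  # 32K, the older default


def creation_rate(pid_space: int, seconds: float) -> float:
    """Processes per second needed to traverse the whole pid space."""
    return pid_space / seconds


def wrap_time(pid_space: int, rate: float) -> float:
    """Seconds needed to traverse the pid space at a given creation rate."""
    return pid_space / rate


if __name__ == "__main__":
    fast = creation_rate(PID_MAX_MODERN, 60)   # wrap-around in 1 minute
    slow = creation_rate(PID_MAX_MODERN, 240)  # wrap-around in 4 minutes
    print(f"implied rate: {slow:.0f} to {fast:.0f} processes/second")
    print(f"32K pid space at these rates: "
          f"{wrap_time(PID_MAX_LEGACY, fast):.2f} to "
          f"{wrap_time(PID_MAX_LEGACY, slow):.2f} seconds")
```

This is why the pid wrap-around is done in advance, while the SUID
process is stopped: only the last few pids remain to be consumed once
the process has crashed.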

Our proof of concept always wins this second race condition (the
"kill-and-replace" race condition): because apport is written in Python,
it loads numerous .pyc files during its initialization, so we simply add
an IN_OPEN watch on one of these files (on apt_dpkg.cpython-312.pyc for
example) and still have plenty of time (after apport has triggered our
watch point) to SIGKILL and replace the crashed SUID process with a
namespaced process (before apport analyzes its /proc/pid/ files).

------------------------------------------------------------------------
Proof of concept
------------------------------------------------------------------------

$ grep PRETTY_NAME= /etc/os-release
PRETTY_NAME="Ubuntu 24.04.2 LTS"

$ id
uid=1001(evey) gid=1001(evey) groups=1001(evey),100(users)

$ while true; do
    core="$(printf 'whatever\0' | ./CVE-2025-5054 /usr/sbin/unix_chkpwd "$USER" nullok)";
    if tr -c ' -~' '\n' < "$core" | grep '\$[0-9A-Za-z]\+\$[0-9A-Za-z./]'; then
        break;
    fi;
done

...
pid 1093
tid 1030
core will be dumped in /tmp/run.q5qBcg
signal 9
accept 4
args 1093 31 18446744073709551615 2
fd 5
died in child_userns: 151
status 1
died in main: 280
$y$j9T$KC0.pKjUYzrr3L8VVNQ8l/$11KufHkbNKHRxgolryPxDQDZ.Ox9kG4RIv0Pxe1FgxA
$y$j9T$KC0.pKjUYzrr3L8VVNQ8l/$11KufHkbNKHRxgolryPxDQDZ.Ox9kG4RIv0Pxe1FgxA
theadmin:$6$7Ag0AvjQl4XQvSO4$T1mMcQeC0K7FICHEj9pNV20XcUX4IW6Xqg45lyuORtia1vPCOy2ZrFlTa.ZEf0EAO6rpNRma1ucCjO3aL64KW0:20145:0:99999:7:::
evey:$y$j9T$KC0.pKjUYzrr3L8VVNQ8l/$11KufHkbNKHRxgolryPxDQDZ.Ox9kG4RIv0Pxe1FgxA:20145:0:99999:7:::


========================================================================
Local information disclosure in systemd-coredump (CVE-2025-4598)
========================================================================

------------------------------------------------------------------------
Background
------------------------------------------------------------------------

While working on Ubuntu's apport, we remembered that various other
distributions (Red Hat Enterprise Linux 9 and Fedora for example) use
systemd-coredump as a core-dump handler in /proc/sys/kernel/core_pattern
(instead of apport). We began to wonder: how does systemd-coredump solve
the kill-and-replace race condition that we exploited against apport?

Similarly to apport, systemd-coredump writes all core files into a
hard-coded directory, /var/lib/systemd/coredump/. Before December 2022,
systemd-coredump allowed users to read all of their core files (through
file ACLs), including the core files of SUID or SGID programs, which of
course allowed local attackers to read the contents of /etc/shadow by
simply crashing su for example; this vulnerability was CVE-2022-4415,
discovered and published by Matthias Gerstner:

  https://www.openwall.com/lists/oss-security/2022/12/21/3

This old vulnerability was patched by introducing a new function,
grant_user_access(), which decides whether a user should be allowed to
read a core file or not, by analyzing the /proc/pid/auxv of the crashed
process: if its AT_UID and AT_EUID match, and if its AT_GID and AT_EGID
match, and if its AT_SECURE flag is 0, then read access is allowed;
otherwise (if the crashed process is SUID or SGID), read access is
denied (only root can read the core file).
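
The decision described above can be sketched as follows. This is a
simplified illustration of the logic, not systemd's C implementation:
the function names are hypothetical, the AT_* constants are the
standard values from <elf.h>, a 64-bit auxv layout (pairs of 8-byte
words) is assumed, and denying access when an entry is missing is an
assumption of this sketch.

```python
import struct

AT_NULL, AT_UID, AT_EUID, AT_GID, AT_EGID, AT_SECURE = 0, 11, 12, 13, 14, 23


def parse_auxv(data: bytes) -> dict[int, int]:
    """Parse a 64-bit /proc/<pid>/auxv blob into {type: value}."""
    entry = struct.Struct("=QQ")  # (a_type, a_val), native byte order
    auxv = {}
    for offset in range(0, len(data) - entry.size + 1, entry.size):
        a_type, a_val = entry.unpack_from(data, offset)
        if a_type == AT_NULL:  # terminator
            break
        auxv[a_type] = a_val
    return auxv


def would_grant_access(auxv: dict[int, int]) -> bool:
    """Allow read access only if the process was neither SUID nor SGID."""
    required = (AT_UID, AT_EUID, AT_GID, AT_EGID, AT_SECURE)
    if any(key not in auxv for key in required):
        return False  # assumption: deny when information is missing
    return (
        auxv[AT_UID] == auxv[AT_EUID]
        and auxv[AT_GID] == auxv[AT_EGID]
        and auxv[AT_SECURE] == 0
    )


if __name__ == "__main__":
    with open("/proc/self/auxv", "rb") as f:
        print("access would be granted:", would_grant_access(parse_auxv(f.read())))
```

Note that such a decision is only as trustworthy as the /proc/pid/auxv
it reads: it must belong to the process that actually crashed.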

------------------------------------------------------------------------
Analysis
------------------------------------------------------------------------

Unfortunately, we soon realized that systemd-coredump does not provide
any protection at all against the kill-and-replace race condition that
we exploited in apport. In other words, an attacker can simply crash a
SUID process such as unix_chkpwd, SIGKILL and replace it with a non-SUID
process (before its /proc/pid/auxv is analyzed by systemd-coredump), and
therefore gain read access to the core file of the crashed SUID process,
and hence to the contents of /etc/shadow.

On the one hand, exploiting systemd-coredump is easier than exploiting
apport, because we do not need to replace the crashed SUID process with
a namespaced process: we can replace it with any non-SUID process, whose
AT_UID and AT_EUID match, whose AT_GID and AT_EGID match, and whose
AT_SECURE flag is 0.

On the other hand, winning the kill-and-replace race condition against
systemd-coredump is harder: unlike apport, systemd-coredump is written
in C, and its initialization takes little time. To widen the window of
the race condition, we pass an argv[0] of 128K '\177' characters to the
SUID process: this slows down the analysis of its /proc/pid/cmdline (by
systemd-coredump, before the analysis of its /proc/pid/auxv) and gives
us enough time to replace the crashed SUID process with a non-SUID
process.

------------------------------------------------------------------------
Proof of concept
------------------------------------------------------------------------

$ grep PRETTY_NAME= /etc/os-release
PRETTY_NAME="Fedora Linux 41 (Server Edition)"

$ id
uid=1001(evey) gid=1001(evey) groups=1001(evey) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

$ while true; do
    pid="$(printf 'whatever\0' | ./CVE-2025-4598 /usr/sbin/unix_chkpwd "$USER" nullok)";
    pidwait -f /usr/lib/systemd/systemd-coredump;
    if coredumpctl -1 dump "$pid" 2>/dev/null | strings -a | grep '\$[0-9A-Za-z]\+\$[0-9A-Za-z./]'; then
        break;
    fi;
done

...
pid 364536
tid 364521
tid 364540
died in main: 177
theadmin:$y$j9T$APKdqQO.brzhEbC2JFd.5zb7$Rz2q.0umBr8AmkwlozWr8/yphm/ckEHIOMo9vcj.Wj/::0:99999:7:::
evey:$y$j9T$QUW3HEErO9CYuGrRhiQjt.$.befySFW/nA48280u/Hk1XrcA2yDZ6Z1s7iRf91nJuA:20188:0:99999:7:::


========================================================================
Acknowledgments
========================================================================

We thank Ubuntu's security team and apport's developers (Octavio Galland
and Benjamin Drung in particular), and systemd's developers (Zbigniew
Jedrzejewski-Szmek and Luca Boccassi in particular), for their hard work
on this release. We also thank Red Hat Product Security (Marco Benatto
in particular), and the linux-distros@...nwall (Solar Designer, Seth
Arnold, Salvatore Bonaccorso, and David Fernandez Gonzalez in
particular), for their help with this release. Finally, we thank
Christian Brauner
for the %F/pidfd kernel feature.


========================================================================
Timeline
========================================================================

2025-03-21: We sent a draft of our advisory and a first proof of concept
(against unix_chkpwd) to Ubuntu's security team.

2025-04-10: We sent a draft of our advisory and a first proof of concept
(against unix_chkpwd) to systemd's developers.

2025-04-17: We sent a second proof of concept (against sshd) to
systemd's developers.

2025-04-22: We sent a second proof of concept (which defeats apport's
starttime check) to Ubuntu's security team and apport's developers.

2025-05-23: We sent a draft of our advisory to the
linux-distros@...nwall.

2025-05-29: Coordinated Release Date (16:00 UTC).
