Message-ID: <CAMrV8J7FfiB0ptMZFU+EKdRt1NPgtTe_YJWPFw7AQdB-vAQ75w@mail.gmail.com>
Date: Sun, 3 May 2026 07:00:06 -0400
From: Mohamed salem Eddah <medsalemeddah@...il.com>
To: security@...nel.org, oss-security@...ts.openwall.com,
"asml.Silence@...il.com" <asml.Silence@...il.com>, "axboe@...nel.dk" <axboe@...nel.dk>
Subject: CVE request: io_uring zcrx freelist OOB write
Hello,
I am reporting a security issue in the Linux kernel involving an
out-of-bounds heap write in io_uring/zcrx.c.
This issue appears to have been addressed in commit 770594e
(“io_uring/zcrx: warn on freelist violations”, April 21, 2026), however it
was not assigned a CVE and does not appear to have been included in a
formal security advisory. As a result, multiple stable and downstream
distribution kernels are still affected.
------------------------------
Vulnerability Summary
*File:* io_uring/zcrx.c
*Function:* io_zcrx_return_niov_freelist()
*Introduced:* Linux 6.12 (initial ZCRX merge)
*Fixed upstream:* 770594e (Apr 21, 2026)
*Status:* Fix not yet present in stable releases
------------------------------
Vulnerable Code
static void io_zcrx_return_niov_freelist(struct net_iov *niov)
{
	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

	spin_lock_bh(&area->freelist_lock);
	area->freelist[area->free_count++] = net_iov_idx(niov); /* no bounds check */
	spin_unlock_bh(&area->freelist_lock);
}
The freelist array is allocated with exactly area->nia.num_niovs elements:
	area->freelist = kvmalloc_array(nr_iovs, sizeof(area->freelist[0]), ...);
Because free_count is not validated against num_niovs, repeated return
operations can increment free_count beyond the allocated array size. This
results in a 4-byte out-of-bounds write into adjacent slab memory.
A double-return condition can occur through concurrent execution paths
involving io_pp_zc_release_netmem() and the user-triggered return flow.
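The effect of a double return can be sketched in userspace. The following is a minimal model, not kernel code: the struct, the function names, the 8-entry area size, and the canary value are all assumptions made for illustration. The array is deliberately one slot larger than the logical freelist so that the "out-of-bounds" write lands in a canary slot instead of in memory the program does not own.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_NIOVS 8u           /* hypothetical area size */
#define MODEL_CANARY    0xAABBCCDDu  /* stands in for a neighbouring object's field */

struct model_area {
	uint32_t free_count;
	/* MODEL_NUM_NIOVS real slots plus one canary slot that plays the
	 * role of whatever follows the freelist in memory. */
	uint32_t freelist[MODEL_NUM_NIOVS + 1];
};

static void model_init(struct model_area *a)
{
	a->free_count = 0;
	for (uint32_t i = 0; i < MODEL_NUM_NIOVS; i++)
		a->freelist[a->free_count++] = i;   /* every entry starts free */
	a->freelist[MODEL_NUM_NIOVS] = MODEL_CANARY;
}

static uint32_t model_alloc(struct model_area *a)
{
	assert(a->free_count > 0);
	return a->freelist[--a->free_count];
}

/* Same shape as the unpatched return path: free_count is never compared
 * against the number of entries. */
static void model_return_unchecked(struct model_area *a, uint32_t idx)
{
	assert(a->free_count <= MODEL_NUM_NIOVS);   /* keeps the model itself in bounds */
	a->freelist[a->free_count++] = idx;
}

/* Allocate one entry, return it twice, and report the canary slot. */
static uint32_t model_double_return(void)
{
	struct model_area a;

	model_init(&a);
	uint32_t idx = model_alloc(&a);     /* free_count: 8 -> 7 */
	model_return_unchecked(&a, idx);    /* free_count: 7 -> 8, list is full */
	model_return_unchecked(&a, idx);    /* writes freelist[8], the canary slot */
	return a.freelist[MODEL_NUM_NIOVS];
}
```

In this model, model_double_return() yields the returned index (7) rather than the canary, which mirrors how a second return of the same niov overwrites the 4 bytes following the real freelist.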
------------------------------
Confirmed Impact
Testing performed on Linux 6.19.11 (Kali kernel, CONFIG_IO_URING_ZCRX=y,
KASAN disabled):
1. *Out-of-bounds write confirmed*
   freelist[num_niovs] is written once free_count has reached num_niovs.
2. *Controlled value write observed*
   The written value is derived from net_iov_idx(niov), which can be
   influenced via nia.niovs configuration, allowing controlled u32 values
   to be written out of bounds.
3. *Adjacent slab corruption confirmed*
   Objects allocated adjacent in kmalloc-64 caches were corrupted, with
   field overwrite observed (e.g. 0xAABBCCDD → 0x00000007).
4. *Privilege impact demonstrated in test environment*
   Using a controlled kernel execution context, credential structures could
   be modified, resulting in UID transition from non-root to root. This was
   achieved using prepare_creds() followed by manual credential zeroing and
   commit_creds().
Note: prepare_kernel_cred(NULL) is hardened on modern kernels (6.2+), but
the issue remains exploitable through alternative credential manipulation
paths.
------------------------------
Requirements for Exploitation
Exploitation appears to require:
- CAP_NET_ADMIN (enforced at io_register_zcrx_ifq())
- A NIC supporting page pool-backed memory providers (e.g. mlx5, nfp)
- Kernel versions 6.12 through 6.19 with CONFIG_IO_URING_ZCRX=y
This makes the issue particularly relevant in container environments where
CAP_NET_ADMIN is commonly granted (e.g. Kubernetes networking plugins,
Docker containers with extended capabilities).
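For assessing exposure inside a container, one quick check is whether the process holds CAP_NET_ADMIN in its effective set. The sketch below reads the CapEff line from /proc/self/status and tests bit 12, which is the value of CAP_NET_ADMIN in linux/capability.h. The function names are hypothetical, and the result only describes the calling process's effective set; it does not model how the kernel evaluates the capability at registration time.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAP_NET_ADMIN_BIT 12   /* CAP_NET_ADMIN as defined in linux/capability.h */

/* Return 1 if the hex capability mask has CAP_NET_ADMIN set, else 0.
 * Leading whitespace (such as the tab after "CapEff:") is skipped by strtoull. */
static int capmask_has_net_admin(const char *hex)
{
	unsigned long long mask = strtoull(hex, NULL, 16);

	return (int)((mask >> CAP_NET_ADMIN_BIT) & 1ULL);
}

/* Return 1 or 0 for the calling process, or -1 if CapEff cannot be read. */
static int self_has_net_admin(void)
{
	char line[256];
	int result = -1;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "CapEff:", 7) == 0) {
			result = capmask_has_net_admin(line + 7);
			break;
		}
	}
	fclose(f);
	return result;
}
```

Calling self_has_net_admin() from a small main() inside the container gives a first indication of whether the capability requirement listed above is met there.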
------------------------------
Fix
The upstream fix adds a bounds check to prevent freelist overflow:
static void io_zcrx_return_niov_freelist(struct net_iov *niov)
{
	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

	guard(spinlock_bh)(&area->freelist_lock);
	if (WARN_ON_ONCE(area->free_count >= area->nia.num_niovs))
		return;
	area->freelist[area->free_count++] = net_iov_idx(niov);
}
This correctly prevents the out-of-bounds condition.
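The behaviour of the check can be illustrated with a small userspace model. This is a sketch under assumptions, not the kernel implementation: the struct, the names, the 8-entry size, and the canary slot are hypothetical, and the boolean return value stands in for the kernel's WARN_ON_ONCE() plus early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIXED_NUM_NIOVS 8u           /* hypothetical area size */
#define FIXED_CANARY    0xAABBCCDDu  /* stands in for a neighbouring object's field */

struct fixed_area {
	uint32_t free_count;
	/* One extra slot acts as the memory that follows the freelist. */
	uint32_t freelist[FIXED_NUM_NIOVS + 1];
};

/* Modelled on the patched return path: refuse the push when the list is full. */
static bool fixed_return_checked(struct fixed_area *a, uint32_t idx)
{
	if (a->free_count >= FIXED_NUM_NIOVS)
		return false;   /* the kernel warns once and returns here */
	a->freelist[a->free_count++] = idx;
	return true;
}

/* Fill the list, pop one entry, return it twice, and report the canary slot. */
static uint32_t fixed_double_return(bool *second_accepted)
{
	struct fixed_area a = { .free_count = 0 };

	for (uint32_t i = 0; i < FIXED_NUM_NIOVS; i++)
		a.freelist[a.free_count++] = i;
	a.freelist[FIXED_NUM_NIOVS] = FIXED_CANARY;

	uint32_t idx = a.freelist[--a.free_count];          /* allocate one entry */
	bool first = fixed_return_checked(&a, idx);         /* list becomes full again */
	*second_accepted = fixed_return_checked(&a, idx);   /* rejected by the check */
	assert(first);
	return a.freelist[FIXED_NUM_NIOVS];
}
```

With the check in place, the second return is rejected and the canary slot keeps its original value. The model says nothing about why a double return happens in the first place; it only shows that the write past the end of the array no longer occurs.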
------------------------------
Request
I would like to request:
1. CVE assignment for this issue
2. Backporting of commit 770594e to all affected stable branches (6.12.y
   through 6.15.y, and any other branches carrying CONFIG_IO_URING_ZCRX)
------------------------------
Attachments
1. dmesg_oob_confirmed.txt: kernel logs showing OOB write and memory
   corruption
2. zcrx_oob_kmod.c: minimal kernel PoC demonstrating missing bounds check
3. zcrx_escalate.c: controlled write and adjacency corruption demonstration
4. poc_zcrx_freelist_oob.c: userspace harness (requires page-pool NIC)
5. Makefile: build scripts for reproduction modules
------------------------------
Reported by: Mohamed salem eddah
Contact: medsalemeddah@...il.com
View attachment "dmesg_full_evidence.txt" of type "text/plain" (12946 bytes)
Download attachment "Makefile" of type "application/octet-stream" (185 bytes)
View attachment "dmesg_oob_confirmed.txt" of type "text/plain" (3459 bytes)
View attachment "zcrx_oob_kmod.c" of type "text/x-csrc" (9478 bytes)
View attachment "zcrx_escalate.c" of type "text/x-csrc" (16511 bytes)
View attachment "poc_zcrx_freelist_oob.c" of type "text/x-csrc" (15476 bytes)