Message-ID: <CAM=PXV5Nu8VdpFY7mYmA86ddwYR7tR0nyLL_nZEnUdrz83Y=Rg@mail.gmail.com>
Date: Sun, 3 May 2026 13:20:09 -0600
From: Greg Dahlman <dahlman@...il.com>
To: oss-security@...ts.openwall.com, linux-crypto@...r.kernel.org, 
	Linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: CVE-2026-31431: CopyFail: linux local privilege scalation

Note: re-adding the other lists so that people have an opportunity to
correct my errors.

"CAP_FOO in the init namespace" doesn't matter if "CAP_FOO" is the
gate in the default namespace.  Namespaces are a facade pattern and not
isolation; unix abstract sockets, af_inet, vsock, af_alg, etc. do not
currently use credentials at all, IIRC.


LD_PRELOAD, as a way to (transparently) replace this functionality
without user intervention, involves putting in an interposer to
directly intercept all socket() calls at system scale in this case,
when it is typically a per-process concept.

I think socket() hasn't been interposable for at least a decade (in
glibc), so you would weaken overall security by reintroducing the PLT
or...  Note that many people want to avoid adding an
`/etc/ld.so.preload`, because fighting dynamic linker hijacking is not
easy when Unix-like systems have zero security boundary between the
parent and child process.

The bigger problem is that the embedded users are not where most of
the friction is going to come from.  While the motivations are
similar, FIPS 140-3 validations, and the downstream vendors who relied
on distro validations and incorporated them into regulatory,
compliance, and governance processes, represent a large unidentified
user base.

Searching for "Kernel Crypto API" in the Module name on this site will
show some of the upstream validations.

   https://csrc.nist.gov/projects/cryptographic-module-validation-program/validated-modules/search


In the case of non-path-backed sockets, userns provides zero
protection and only adds to the attack surface.  The only credential
use for non-path-backed sockets currently is the restriction of ports
below 1024 on af_inet.
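
A rough way to observe that one credential check from an unprivileged
account is a bind probe.  This is only a sketch: `probe_bind` is a
hypothetical helper name, and 1024 is just the kernel default, which
the `net.ipv4.ip_unprivileged_port_start` sysctl can change.

```python
import errno
import socket


def probe_bind(port: int) -> str:
    """Try to bind a TCP socket to `port` on all local addresses.

    Returns "ok" on success, otherwise the symbolic errno name
    (for example "EACCES" when a privileged port is refused).
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("", port))
    except OSError as exc:
        # PermissionError and friends are OSError subclasses.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"
```

Without CAP_NET_BIND_SERVICE and with the default sysctl,
`probe_bind(80)` would typically be refused with "EACCES", while a
high port such as `probe_bind(8080)` is not gated by credentials at
all.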

Remember that namespace support is not implicit: all af_family calls
outside of the specific families that have namespace support stay in
the default namespace.

If you dig through the vendors' $distro openssl security documents,
found via the NIST link above, you will see why people liked the
contract that af_alg offered: they were depending on the kernel team's
stable API and reputation, and they could simplify their compliance,
because it is easier to ensure that no openssl installs exist at all
on a system than to try to maintain compliance, governance, and
regulatory obligations.

There are some use cases, like firmware images on some embedded
systems, where having the DMA pipe into a crypto engine avoided von
Neumann bottleneck issues, CPU usage, etc.  But no matter how flawed
it is to use af_alg, it provided a simple zero-dependency interface
that their tools already supported (socket) and reduced their
lifecycle costs.  The reduced performance of the hw crypto engines for
smaller data sizes was an acceptable trade-off; performance was not
the primary driver in many cases.
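
For readers who have not used it, the "just a socket" contract looks
roughly like this from user space.  This is a minimal sketch using
Python's standard socket module; `af_alg_sha256` and
`same_as_userspace` are hypothetical helper names, and the code
assumes a Linux kernel with AF_ALG and sha256 available.

```python
import hashlib
import socket
from typing import Optional


def af_alg_sha256(data: bytes) -> Optional[bytes]:
    """Hash a small `data` buffer with the kernel's sha256 via AF_ALG.

    Returns None when AF_ALG is unavailable (non-Linux, module
    missing, or blocked by policy).
    """
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return None
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))  # select algorithm type and name
            op, _ = alg.accept()          # operation socket for one request
            with op:
                # Small inputs only: data split across several sends
                # would need MSG_MORE to stay in a single hash.
                op.sendall(data)
                return op.recv(32)        # a sha256 digest is 32 bytes
    except OSError:
        return None


def same_as_userspace(data: bytes) -> bool:
    """Compare the kernel result with hashlib when AF_ALG is usable."""
    kernel = af_alg_sha256(data)
    return kernel is None or kernel == hashlib.sha256(data).digest()
```

Note that nothing in the call sequence involves a credential or a
path: any process that can call socket() can reach the interface
unless something else (seccomp, LSM policy, a missing module) blocks
it.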

I should be 100% clear: namespaces _are not_ a security feature, but
they can be leveraged to lower privileges and improve a security
posture.  When you have interfaces like sockets (other than unix
sockets), the main advantage of network namespaces is that they allow
you to constrain something that, for historical reasons, has almost
zero controls (except tcp ports < 1024).

But the default is for any new, legacy, or other subsystem to live
only in the default namespace.  The friction comes when ~4 out of the
40+ af_families are namespaced but the rest are not.

There is a very real problem with people overestimating the isolation
capabilities of namespaces in general, but paying attention to the
official documentation may help here:

https://www.kernel.org/doc/html/latest/admin-guide/namespaces/compatibility-list.html

     The same is true for the IPC namespaces being shared - two users
from different user namespaces should not access the same IPC objects
even having equal UIDs.
     But currently this is not so.

The "should not access" is a very different contract than most people expect.

The FIPS/ISO compliance issue mostly invalidates what I hoped was an
easy fix, and putting in a kernel call interposer via LD_PRELOAD will
still add friction that is likely to block the aspirations of
removing af_alg from the kernel.  I think that there is a path to do
so, and I think it would be best in the long run.  But the friction
here is not just from code changes, which are far easier to accomplish
than resolving the regulatory issues.

The compliance-based user base is one that is often far more challenging.

I do still think that both userland and kernel would benefit from some
mechanism that would make it easier for security teams, admins, and
users to run with lower privileges.  IMHO thinking about enabling that
control will also be critical to the kernel team's ability to remain
effective.  Different use cases will always conflict, and
non-namespace users would also benefit from ways to restrict access to
af_families.

IMHO if the team thinks af_alg is unfixable, it is maybe one of the
rare cases where breaking changes are necessary.  It may be more
productive to help compliance based users migrate than provide a
brittle shim that still invalidates all their authorizations anyway.

I am not an expert on FIPS/ISO compliance, but I do know that
providing guidance that helps users migrate would go a long way.  You
could, say, have a userland process that provides a socket-like
interface, with guidance on how to wrap it or create a their_socket()
to migrate.
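
To make the their_socket() idea concrete, here is a rough sketch of
what such migration guidance could look like.  Everything in it is
hypothetical: `UserlandHashSocket` and this `their_socket()` are
illustration names, only the "hash" type is emulated, hashlib stands
in for whatever user-space library a vendor would actually use, and it
says nothing about the validation status of that library.

```python
import hashlib
import socket

# AF_ALG is 38 on Linux; fall back to that where Python lacks the constant.
AF_ALG = getattr(socket, "AF_ALG", 38)


class UserlandHashSocket:
    """Hypothetical stand-in that mimics the AF_ALG hash call sequence."""

    def __init__(self, name=None):
        self._name = name
        self._hash = hashlib.new(name) if name else None

    def bind(self, address):
        alg_type, alg_name = address[0], address[1]
        if alg_type != "hash":
            raise OSError("only the 'hash' type is emulated in this sketch")
        hashlib.new(alg_name)  # raises ValueError for unknown names
        self._name = alg_name

    def accept(self):
        if self._name is None:
            raise OSError("bind() must be called before accept()")
        # Like the kernel interface, hand back a per-operation object.
        return UserlandHashSocket(self._name), None

    def sendall(self, data):
        self._hash.update(data)

    def recv(self, bufsize):
        return self._hash.digest()[:bufsize]

    def close(self):
        pass


def their_socket(family, type_, proto=0):
    """Return an emulation for AF_ALG and a real socket for anything else."""
    if family == AF_ALG:
        return UserlandHashSocket()
    return socket.socket(family, type_, proto)
```

A caller would swap socket.socket(...) for their_socket(...) and keep
the bind/accept/sendall/recv sequence unchanged.  Kernel and hashlib
algorithm names do not always match, so a real shim would need a name
mapping; "sha256" happens to be the same in both.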

I still think that for socket af families other than af_inet and
(file-backed) unix, there needs to be a credentials mechanism.  People
are building systems on top of vsock and other non unix/if based
systems that are just as vulnerable.  Like af_alg, vsock is known to
have serious issues and was designed for a trusted environment.
Without an effective way to limit exposure from either userland or the
kernel, enough remains simply unexplored that it will be expensive.

On Sun, May 3, 2026 at 5:00 AM Simon McVittie <smcv@...ian.org> wrote:
>
> On Sat, 02 May 2026 at 14:21:57 -0600, Greg Dahlman wrote:
> >LD_PRELOAD and capabilities
>
> These seem orthogonal, rather than being part of the same idea.
>
> LD_PRELOAD is discretionary (cooperative) so it would only be useful if
> used in a design something like this:
>
> - at the kernel level, AF_ALG just doesn't work (fails with a
>    permission-related error), at least for unprivileged processes
> - but in user-space, an opt-in LD_PRELOAD module intercepts the socket(),
>    etc. calls for AF_ALG, and emulates the behaviour of current kernels
>    by calling into a user-space crypto library
>
> It can't be a security boundary, but it can be a mitigation for the
> regressions that a new security boundary (or complete feature removal)
> would otherwise cause, similar to the way LD_PRELOADs like aoss and
> padsp mitigated the regressions for older binaries when distro kernels
> disabled OSS audio.
>
> Meanwhile capabilities are a way to let trusted, privileged processes
> have access to things that unprivileged processes do not, for example
> making AF_ALG available to a few system services that need it but not
> available to all of user-space.
>
> >You should expect any UID (even nobody) to be able to gain the
> >privileges in their bounding set
>
> The kernel can distinguish between "CAP_FOO in the init namespace" and
> "CAP_FOO in any other userns" if it wants to, and some kernel features
> are already gated by having a capability in the init namespace
> specifically. For example CAP_SYS_ADMIN in the init namespace allows
> mounting block-device-backed filesystems like ext4, but CAP_SYS_ADMIN in
> a different userns only allows a few "safe" mount operations
> (bind-mounts, overlayfs, FUSE).
>
>      smcv
