Message-ID: <20250710171105.GY1827@brightrain.aerifal.cx>
Date: Thu, 10 Jul 2025 13:11:05 -0400
From: Rich Felker <dalias@...c.org>
To: Nathan McSween <nwmcsween@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: unlink on NFS volume fails silently

On Thu, Jul 10, 2025 at 10:01:50AM -0700, Nathan McSween wrote:
> https://github.com/Azure/AKS/issues/1325#issuecomment-713372369, does the
> behavior happen with coreutils?

That thread looks like it has a lot of misinformation. In particular,
the comment you linked makes an erroneous claim:

    "If a file is removed from or added to the directory after the
    most recent call to opendir() or rewinddir(), whether a subsequent
    call to readdir() returns an entry for that file is unspecified.

    So the different filesystems are left free to choose their own
    behaviour when this happens. cifs.ko (the Linux SMB client) makes

The above first part is true. Implementations are free to choose
whether to show stale (already deleted) entries or do the extra work
to suppress them. However...

    sure that it's not returning stale data, at the cost of missing
    some entries for this particular use case"

...that does NOT give them license to break conformance by "missing
some entries". If your mitigation for showing stale file entries
involves failure to show some other non-stale ones, it's broken, and
needs to be removed.
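
The distinction can be checked with a small test. What follows is only
a sketch, not a confirmed reproducer for the NFS client bug: it uses
Python's os.scandir (which sits on top of opendir/readdir on POSIX
systems), and the names unlink_during_scan and run_check, the file
count, and the filename length are all illustrative assumptions. It
unlinks every entry during a single directory pass, tolerates stale
entries, and reports any names a fresh scan still finds afterwards.

```python
import os
import tempfile


def unlink_during_scan(path):
    """Unlink each entry seen in ONE directory pass; return names left behind."""
    with os.scandir(path) as it:
        for entry in it:
            try:
                os.unlink(entry.path)
            except FileNotFoundError:
                # A stale (already deleted) entry is permitted; ignore it.
                pass
    # Nothing else is creating files here, so a fresh scan should be empty
    # if the in-progress iteration skipped no live entries.
    return sorted(os.listdir(path))


def run_check(count=5000, name_len=40, base=None):
    """Create `count` empty files in a fresh directory under `base`, then
    run the single-pass unlink. `count` and `name_len` are arbitrary
    illustrative values, not known trigger conditions."""
    with tempfile.TemporaryDirectory(dir=base) as d:
        for i in range(count):
            name = f"{i:0{name_len}d}"  # zero-padded, fixed-length filename
            with open(os.path.join(d, name), "w"):
                pass
        return unlink_during_scan(d)


if __name__ == "__main__":
    leftover = run_check()
    print(f"{len(leftover)} entries skipped by the single pass")
```

To try it against an NFS mount, pass the mount point as base (the
default uses the system temporary directory). A non-empty result would
mean entries were skipped, which is the non-conforming behavior; an
empty result does not prove the bug is absent, since the original
report depended on particular filename sizes and counts.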

Rich


> On Thu, Jul 10, 2025, 8:44 AM Rich Felker <dalias@...c.org> wrote:
> 
> > On Thu, Jul 10, 2025 at 02:58:30PM +1000, Stephen Von Takach wrote:
> > > Yeah I see your point and this was closed as a kernel issue:
> > > https://gitlab.alpinelinux.org/alpine/aports/-/issues/10960
> >
> > OK, is your issue unlink falsely succeeding, or readdir skipping
> > entries? The latter is a known bug in the kernel NFS client. One of my
> > comments on the tracker suggests:
> >
> >   "The nordirplus option mentioned in one of those tracker threads
> >   might be a workaround."
> >
> > I'm not sure if this is the case, but it might be worth trying.
> >
> > Note that it's *expected* that an already-in-progress iteration of a
> > directory may return entries that were already deleted. The
> > unacceptable thing is the opposite: when it skips some entries that
> > have not been deleted as a consequence of other things being deleted.
> >
> > > We're running these two containers on the same kernel and seeing the same
> > > behaviour as that alpine issue.
> > > Happy to continue working around the issue by using debian userspace to
> > > build our service.
> > >
> > > It does seem crazy that there is clearly an issue, possibly a kernel issue,
> > > that is being handwaved away by all parties
> >
> > It's not "handwaved away" by us. We have determined that there is a
> > bug in a component we have no control over, and for which we have no
> > sound means of working around.
> >
> > I'm happy to work together on tracking down the cause to get it fixed,
> > but that requires cooperation from someone who's able to reproduce it,
> > documenting the exact circumstances under which it occurs (NFS server
> > vendor/version, NFS mount options) and either producing a minimal test
> > program to reproduce the issue under those conditions, or being
> > willing to run a proposed test by someone else.
> >
> > Even if using Debian/glibc *seems* to make things work for you, I
> > think it would be beneficial for you to try to get to the root cause
> > of the problem and get it fixed. What we previously found on the
> > above-linked ticket was that glibc is not doing anything special that
> > should rule out that bug, only that the particular filename
> > sizes/counts in the test didn't trigger the bug with glibc.
> >
> > Again, I don't know if this is the same bug you're hitting (this is
> > the first time in the thread you've mentioned readdir if I'm not
> > mistaken, as opposed to just unlink) or if there's a second bug in
> > play here. If you could at least clarify that, it would be a big help
> > to anyone investigating it in the future.
> >
> > Rich
> >
