|
|
Message-ID: <20251203154406.GC1827@brightrain.aerifal.cx>
Date: Wed, 3 Dec 2025 10:44:06 -0500
From: Rich Felker <dalias@...c.org>
To: Neeraj Sharma <neerajsharma.live@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: Re: Potential bug in musl in nftw implementation

On Wed, Dec 03, 2025 at 08:25:03PM +0530, Neeraj Sharma wrote:
> On Wed, Dec 3, 2025 at 7:08 PM Rich Felker <dalias@...c.org> wrote:
> >
> > Are you saying you'd deem it a better behavior to return with an error
> > when it can't traverse within the fd limit, instead of skipping
> > descent past what it can do within the limit?
> >
> > That's probably more correct. My understanding is that historically,
> > the limit was a depth limit, but POSIX repurposed it into a "fd limit"
> > with an expectation that implementations somehow do the traversal with
> > fewer fds than the depth if the limit is exceeded (or error out?).
> > Trying to actually do it seems error-prone (requires unbounded working
> > space that grows with size of a single directory, vs growth only with
> > depth for fds, and involves caching potentially-stale data) but maybe
> > just erroring so the caller knows to use a larger limit would be the
> > right thing to do here...?
>
> I would suggest aligning with common understanding across nix in this
> case. This was the main reason for my confusion in the beginning.
> Silently skipping or erroring out both seems unaligned with common
> understanding as in [1], [2], [3]. The documentation in linux [1] is
> more explicit about the functionality than IEEE [2] or opengroup [3]
> in this case.
>
> Quotes from [1] or linux man page ftw(3).
>
> "To avoid using up all of the calling process's file descriptors,
> nopenfd specifies the maximum number of directories that ftw() will
> hold open simultaneously. When the search depth exceeds this, ftw()
> will become slower because directories have to be closed and reopened.
> ftw() uses at most one file descriptor for each level in the directory
> tree."

This sounds like a documentation bug. "Slower because directories have
to be closed and reopened" is not an accurate description of what
happens if you close directory fds to stay under a limit. The reality
is that the working space requirements become unbounded, because you
can't just close and reopen a directory and continue where you left
off. The seek address is only valid until the directory is closed, so
the only way to continue where you left off is to continue reading all
the remaining entries and store them in memory before closing it.

That's not to say we shouldn't make a change here (tho actually
buffering remaining files rather than just erroring would be some new
work to implement), but if folks are calling it with a low fd_limit
argument based on the above advice, that should be fixed too. There's
absolutely no reason for a limit below a few hundred (it's not
documented as thread-safe, so you can't be doing it concurrently in
the same process, so all you need to avoid is hitting process fd
limit which defaults to 1024).

Rich
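
For illustration, here is a minimal sketch of a caller following the
advice above: passing a generous nopenfd to nftw() rather than a small
one. The callback, the value 512, and the FTW_PHYS flag are illustrative
choices for this sketch, not anything prescribed in the thread.

/* Sketch: walk the current directory tree with a generous fd limit.
 * A few hundred open directory fds is fine for a single-threaded walk,
 * since the per-process fd limit typically defaults to 1024. */
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static int visit(const char *path, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf)
{
	(void)sb; (void)typeflag;
	/* Print each entry indented by its depth in the tree. */
	printf("%*s%s\n", 2 * ftwbuf->level, "", path);
	return 0; /* nonzero would stop the walk */
}

int main(void)
{
	/* 512 open directories allowed: well above any realistic depth,
	 * well below the default process fd limit. */
	if (nftw(".", visit, 512, FTW_PHYS) < 0) {
		perror("nftw");
		return 1;
	}
	return 0;
}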