Date: Thu, 18 Jan 2024 10:54:44 -0500
From: Rich Felker <>
To: Leah Neukirchen <>
Cc:, enh <>
Subject: Re: preadv2/pwritev2

On Wed, Jan 17, 2024 at 03:08:29PM +0100, Leah Neukirchen wrote:
> Rich Felker <> wrote:
> > On Tue, Oct 19, 2021 at 07:24:26PM -0700, enh wrote:
> > > i've recently added preadv2(2) and pwritev2(2) wrappers to bionic, since we
> > > had our first real prospective user come along, and they're mildly annoying
> > > to use via syscall(3). unfortunately, this particular user also wants to be
> > > able to compile for the host, and our glibc is years out of date, and our
> > > current plan is to move to musl for the host[1].
> > > 
> > > anyway ... musl doesn't have preadv2/pwritev2. i couldn't see any
> > > discussion on the mailing list, so i thought i'd ask whether this is just
> > > because no-one's got round to it yet, or there's some policy[2] i'm not
> > > aware of, or what? happy to send a patch if it's just a case of "we haven't
> > > got round to/had a need for it yet".
> > > 
> > > ____
> > > 1. TL;DR: being able to statically link without worrying about licensing is
> > > very enticing, and gets us out of a lot of the compatibility issues we have
> > > that made our last glibc update more trouble than it was worth, and means i
> > > have no intention of getting us embroiled in another glibc update.
> > > 2. i've been maintaining bionic for years now, and don't think i've written
> > > down our policy explicitly. this was definitely a borderline case from the
> > > "number of users" perspective, but for me the "annoying to use with
> > > syscall(2)" tipped me over the edge into adding them. amusingly [or not,
> > > depending on how you feel about "bugs you get away with"], it also made me
> > > realize that our pread/pwrite implementations for LP64 were wrong in that
> > > they weren't zeroing the unused half of the register pair. so that was a
> > > bonus :-)
> > 
> > There is a high-level policy for the decision-making process for
> > inclusion/exclusion. For new syscalls that are "safe" to use directly
> > via syscall() it's not terribly urgent to take any action, but some
> > like these would benefit from being cancellation points, which makes
> > them somewhat compelling. If we do add them, I want to make sure we
> > don't conflict with glibc's way of exposing them to applications (if
> > they have one yet) -- things like the function signatures and how the
> > flags are exposed. None of this looks hard to get right though. So I
> > think it should be pretty straightforward to add these.
> 
> Bumping this, as bcachefs-tools now uses pwritev2.
> 
> glibc wraps the syscall with a cancellation point and also tries to
> fall back to pwritev/writev when flags is zero and the original call
> failed with ENOSYS.  Likewise for preadv2, with preadv/readv.
> 
> I didn't bother with the fallback, since the call has been available
> since Linux 4.6:
> 
> ssize_t pwritev2(int fd, const struct iovec *iov, int count, off_t ofs, int flags)
> {
> 	return syscall_cp(SYS_pwritev2, fd, iov, count,
> 		(long)(ofs), (long)(ofs>>32), flags);
> }

"Since Linux 4.6" isn't really a reason to skip the fallback.

Normally I'd say check !flags first and just use the old syscall, but
here I'm not really sure. Eventually it's going to be the other way
around -- pwrite() needs to be implemented in terms of SYS_pwritev2
once my patch "vfs: add RWF_NOAPPEND flag for pwritev2" is upstream,
because currently it's dangerously misbehaving. At that point I'm not
sure what the right thing to do with the Linux-specific pwritev()
would be, but my leaning would be that it should behave like
flags==RWF_NOAPPEND rather than flags==0.
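
For context on "dangerously misbehaving": the Linux pwrite(2) man page
documents that on a file opened with O_APPEND, pwrite() appends to the
end of the file regardless of the offset, whereas POSIX expects the
offset to be honored. A minimal sketch for observing this, assuming a
Linux host and a writable /tmp; the helper name
size_after_pwrite_on_append is made up for illustration:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write 4 bytes, then pwrite() 2 bytes at offset 0
 * through an O_APPEND descriptor, and return the resulting file size
 * (or -1 on any error). */
long size_after_pwrite_on_append(void)
{
	char path[] = "/tmp/append-demo-XXXXXX";
	int fd = mkstemp(path);
	if (fd < 0) return -1;
	close(fd);

	fd = open(path, O_WRONLY | O_APPEND);
	unlink(path); /* the file lives on until fd is closed */
	if (fd < 0) return -1;

	struct stat st;
	long size = -1;
	if (write(fd, "AAAA", 4) == 4 &&
	    pwrite(fd, "BB", 2, 0) == 2 &&
	    fstat(fd, &st) == 0)
		size = (long)st.st_size;

	close(fd);
	return size;
}
```

A result of 4 would mean the offset was honored and the first two bytes
were overwritten; a result of 6 would mean the pwrite() was turned into
an append, which is the behavior the man page describes for Linux.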

When that's done, plain pwrite() (and possibly pwritev?) will have to
probe for the O_APPEND flag in the fallback case if SYS_pwritev2 returns
-ENOSYS, and issue some kind of error in that case (which it really
should already be doing), so there's a mess of stuff to be done here.
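
The O_APPEND probe described above could look roughly like this. It is
only a sketch: reject_if_append and probe_demo are hypothetical names,
and EOPNOTSUPP is a placeholder, since nothing here settles which error
should be reported.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper for the fallback path: returns 0 if fd is not in
 * append mode, or -1 with errno set otherwise. EOPNOTSUPP is an
 * arbitrary placeholder error code. */
int reject_if_append(int fd)
{
	int fl = fcntl(fd, F_GETFL);
	if (fl < 0) return -1;	/* errno already set by fcntl, e.g. EBADF */
	if (fl & O_APPEND) {
		errno = EOPNOTSUPP;
		return -1;
	}
	return 0;
}

/* Small self-check: probe a temp file opened without and with O_APPEND.
 * Returns 0 if both probes behave as expected. */
int probe_demo(void)
{
	char path[] = "/tmp/probe-demo-XXXXXX";
	int fd = mkstemp(path);
	if (fd < 0) return -1;
	int plain = reject_if_append(fd);
	close(fd);

	fd = open(path, O_WRONLY | O_APPEND);
	unlink(path);
	if (fd < 0) return -1;
	int append = reject_if_append(fd);
	close(fd);

	return (plain == 0 && append == -1) ? 0 : 1;
}
```

Note that a probe like this is inherently racy if another thread can
change the file status flags with fcntl(F_SETFL) between the check and
the write.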

I'm not saying we have to solve all this now; this is just context. I
think it will be future-proof if we just use the raw SYS_pwritev syscall
in the !flags case either before or after attempting SYS_pwritev2.
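
A rough sketch of the "after" variant: try SYS_pwritev2 first, then fall
back to SYS_pwritev on ENOSYS when flags is zero. This is not musl code.
It uses the public syscall() function, which is not a cancellation
point, where the quoted patch uses musl's internal syscall_cp. The names
sketch_pwritev2 and roundtrip_demo are hypothetical; the writev() branch
for an offset of -1 follows the glibc fallback described in the quoted
text, and returning ENOSYS for nonzero flags is an assumption.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Offset halves are passed the same way as in the quoted patch. */
#define OFS_LO(o) ((long)(o))
#define OFS_HI(o) ((long)((unsigned long long)(o) >> 32))

/* Hypothetical wrapper: SYS_pwritev2 first, old syscalls on ENOSYS. */
ssize_t sketch_pwritev2(int fd, const struct iovec *iov, int count,
                        off_t ofs, int flags)
{
#ifdef SYS_pwritev2
	long r = syscall(SYS_pwritev2, fd, iov, count,
	                 OFS_LO(ofs), OFS_HI(ofs), flags);
	if (r >= 0 || errno != ENOSYS) return r;
#endif
	if (flags) {
		errno = ENOSYS;	/* assumption: no old syscall can honor flags */
		return -1;
	}
	/* pwritev2 treats offset -1 as "use the current file position" */
	if (ofs == -1) return writev(fd, iov, count);
	return syscall(SYS_pwritev, fd, iov, count, OFS_LO(ofs), OFS_HI(ofs));
}

/* Self-check: write two buffers at offset 0 and read them back.
 * Returns 0 on success. */
int roundtrip_demo(void)
{
	char path[] = "/tmp/pwritev2-demo-XXXXXX";
	int fd = mkstemp(path);
	if (fd < 0) return -1;
	unlink(path);

	char a[] = "hello ", b[] = "world";
	struct iovec iov[2] = { { a, 6 }, { b, 5 } };
	char buf[11];
	int ok = sketch_pwritev2(fd, iov, 2, 0, 0) == 11 &&
	         pread(fd, buf, sizeof buf, 0) == 11 &&
	         memcmp(buf, "hello world", 11) == 0;
	close(fd);
	return ok ? 0 : 1;
}
```

The "before" variant would instead test !flags up front and go straight
to SYS_pwritev, never issuing SYS_pwritev2 for flag-less calls.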

