Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Sun, 2 Feb 2014 01:50:02 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Discussion of partly-invasive changes needed for x32 port

Here's an update after further consideration and discussion:

On Mon, Jan 20, 2014 at 02:41:31AM -0500, Rich Felker wrote:
> 1. System calls take 64-bit arguments, whereas the existing syscall
> framework in musl (and on the kernel side for all other archs/abis)
> uses long arguments.

The only open question here is where the type used for syscall
arguments, especially in syscall_cp, comes from: whether it's defined
in syscall_arch.h with a default fallback in internal/syscall.h, or a
public type from the installed headers.

Despite a public type being convenient for use in other headers (see
the ipc structures, etc.) I'm leaning towards putting the syscall
argument type in the private internal syscall headers. This at least
avoids nasty cross-dependency of the resolution of two separate
issues, and makes it so we can choose to provide or not provide a
public type independent of how the internals are implemented.

> 2. The kernel interprets the tv_nsec field of struct timespec as a
> kernel long (64-bit) rather than a userspace long (32-bit). glibc
> handles this completely wrong (bug #16437), making it easy for them.
> Doing it correctly is much harder; for all syscalls that take timespec
> arguments, musl will have to copy the timespec to a temp structure
> with the padding correctly sign-extended before passing it to the
> kernel. The affected functions include at least: sigtimedwait, ppoll,
> nanosleep, clock_settime, mq_timedsend, mq_timedreceive, pselect,
> futimesns, utimesnsat, and everything using timed futex waits
> (thankfully these all go through __timedwait in musl).
> 
> Adding the workaround code at every syscall point is ugly, and
> possibly error-prone. An alternate possible solution is hacking up the
> syscall macros in syscall_arch.h to detect "const struct timespec *"
> arguments and auto-wrap them with compound literals that would fix-up
> the padding. However this probably requires either C11 or GNU C
> extensions. On the positive side, it would only affect x32; no changes
> at all would be made to the source files or other archs'
> syscall_arch.h logic.

I've since determined this to be inadequate, since at least some
syscalls (futimens and utimensat) take pointers to an array of two
timespecs or other such ugliness.

The new direction I'd like to consider it adding special-case code in
syscall_arch.h that catches all the syscalls which take timespec
arguments and converts them. This is much less fragile and it could
also be used for (hypothetical, at this point, I think) interfaces
where a single timespec is used as both input and output to a syscall.

> 3. A number of other structures in the public headers differ from
> their userspace C type layout on other archs because they're setup to
> match the 64-bit kernel structs. This requires either changing some
> member types to match the 64-bit kernel ones (this is usually a bad
> idea, and sometimes even non-conforming like with tv_nsec) or adding
> adjacent padding members for the unused "high" part of the 64-bit
> value. In many cases, this requires either moving structures from the
> shared headers to bits/*, or coming up with clever ways to avoid doing
> so.
> 
> Most of the issues are in the sysv ipc headers:
> 
> [...]
> 
> All of these issues raise the question of whether it would be
> beneficial to have available "syscall_long_t" or similar in the public
> headers. This would make it very possible to avoid moving anything to
> bits (and perhaps even helo move some things back) with some mild to
> moderately ugly tricks for computed padding. But the cons are that
> exposing the type invites abuse (application code trying to use it)
> and the actual struct layouts perhaps become less explicit...

My leaning at this point is to just move whatever needs to be in bits
to bits and be done with it. While these few changes are rather ugly,
they're isolated to sysvipc headers, which are independent of anything
else and already pretty ugly anyway. I think spreading more of them
into bits is a lesser offense than exposing an internal type and
putting weird (nonstandard, internal) types, macros, #ifdefs, or
similar into the public non-bits headers where it would become
non-explicit what the actual structure layout is on the arch that's in
use. (How frustrating is it in glibc's headers when you see an odd
__-prefixed type used somewhere and you have to go hunting for the
actual definition elsewhere?) Of course there were some places where
just using time_t for a reserved/padding fields would get the job done
without moving anything to bits; I'm moderately ok with the approach
for now, with the understanding that it might have to change again
later.

So, if the above directions are non-controversial, I think the
remaining path for integrating x32 is:

1. Make changes to internal/syscall.h to use __scc macro, possibly
defined by syscall_arch.h, instead of (long) casts, and to expose
syscall_arg_t.

2. Making cancel_*.c use syscall_arg_t instead of long.

3. Moving to bits any sysvipc structures/types that will have to
differ on x32 but that haven't differed on archs supported so far.

4. Designing and implementing the timespec rewriting code in
syscall_arch.h for x32 (see comments on quoted item 2 above); this is
basically a switch(n) in the inline __syscallN functions to copy the
appropriate timespec arg to a local kernel timespec structure, and it
will get completely optimized out as long as n is constant.

5. Merge all of the x32 files, which should at this point be possible
without touching any non-x32 files.

6. Activate x32 support (e.g. adding it to configure).

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.