Date: Mon, 25 Jan 2016 11:22:13 -0800
From: Dan Gohman <sunfish@...illa.com>
To: musl@...ts.openwall.com
Subject: Re: Bits deduplication: current situation

Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
For example, int64_t can be either "long" or "long long" on an LP64 target.
The difference usually doesn't matter, but there are things which end up
noticing, like C++ name mangling and C format-string checking.
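As a rough illustration of how the choice becomes visible even in plain C, here is a minimal sketch, assuming a C11 compiler. The helper names int64_underlying and format_i64 are hypothetical and not part of any library. _Generic selects on the type itself, so it tells "long" from "long long" even when both are 64 bits wide, and the PRId64 macro from inttypes.h is what keeps format-string checking quiet whichever one is chosen.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Both candidate types are 64 bits wide on LP64, so width alone says nothing. */
static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t must be exactly 64 bits");

/* Hypothetical helper: names the standard type that int64_t aliases here.
 * long and long long are distinct types, so _Generic can tell them apart. */
static const char *int64_underlying(void)
{
    return _Generic((int64_t)0,
        long: "long",
        long long: "long long",
        default: "other");
}

/* Hypothetical helper: formats v portably. Writing "%ld" or "%lld" directly
 * would trigger a format-string warning on targets where int64_t is the
 * other type; PRId64 expands to the matching conversion. */
static int format_i64(char *buf, size_t n, int64_t v)
{
    return snprintf(buf, n, "%" PRId64, v);
}
```

Calling int64_underlying() on two LP64 targets whose headers disagree would return different strings, which is the same distinction a C++ mangler encodes into symbol names.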

GCC >= 4.5 and clang predefine macros providing almost everything stdint.h
(and inttypes.h) needs. For example, see the attached file. Would you be
interested in a patch which refactors stdint.h to use this approach by
default, with a mechanism to support older compilers if needed?
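For readers without the attachment, the general shape of such a header might look like the sketch below. This is not the attached file. The sk_ prefix and SK_INT64_MAX are hypothetical names chosen to avoid clashing with a real stdint.h, and the fallback branch is only an assumption that holds on ILP32 and LP64 targets. The __INT32_TYPE__, __INT64_TYPE__, __INTPTR_TYPE__ and __INT64_MAX__ macros are the compiler-predefined ones referred to above.

```c
#include <assert.h>
#include <limits.h>

#if defined(__INT32_TYPE__) && defined(__INT64_TYPE__) && \
    defined(__INTPTR_TYPE__) && defined(__INT64_MAX__)
/* The compiler states its own choice, including long vs. long long. */
typedef __INT32_TYPE__  sk_int32_t;
typedef __INT64_TYPE__  sk_int64_t;
typedef __INTPTR_TYPE__ sk_intptr_t;
#define SK_INT64_MAX __INT64_MAX__
#else
/* Assumed fallback for older compilers; valid only for ILP32 and LP64. */
typedef int       sk_int32_t;
typedef long long sk_int64_t;
typedef long      sk_intptr_t;
#define SK_INT64_MAX 0x7fffffffffffffffLL
#endif

/* Compile-time checks that whichever branch was taken is sane. */
static_assert(sizeof(sk_int32_t) * CHAR_BIT == 32, "sk_int32_t must be 32 bits");
static_assert(sizeof(sk_int64_t) * CHAR_BIT == 64, "sk_int64_t must be 64 bits");
static_assert(sizeof(sk_intptr_t) == sizeof(void *), "sk_intptr_t must hold a pointer");
```

The point of the first branch is that no per-arch file is needed: the header never has to decide between "long" and "long long" itself, so it cannot disagree with the compiler's own idea of int64_t.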

Dan


On Sun, Jan 24, 2016 at 7:59 PM, Rich Felker <dalias@...c.org> wrote:

> I'm about to try starting the bits deduplication, but before getting
> started, I took a quick survey of the current bits headers we have:
>
>
> endian.h: We could have generic ones for little and big, but each arch
> that has subarchs with both endians needs its own custom version that
> tests the psABI-defined macro.
>
> errno.h: Almost all archs can share a generic errno.h. Those that
> don't might be able to share a subset (thus benefiting from a more
> elaborate bits-header-gen system) but only a couple of ugly archs are
> affected anyway.
>
> fcntl.h: Not sure how much these differ or how much they could share.
> Almost all archs' versions are unique now, but some may only have
> cosmetic differences.
>
> fenv.h: We can have a generic softfloat/no-fenv version, but each arch
> with hard float basically needs its own version.
>
> float.h: Only 3 generic versions should need to exist: ld64, ld80, and
> ld128(ieeequad).
>
> io.h: Most archs can use a generic empty file.
>
> ioctl.h: Varies highly but it may be possible to have generic versions
> (perhaps one 32-bit and one 64-bit) for the clean archs to share.
>
> ipc.h: Lots of trivial variations to account for kernel bugs in
> type/padding/etc. Not sure if they can be unified.
>
> limits.h: Varies by page size and 32/64-bit. Not sure if it makes
> sense to have generic versions; the logic to pick which one would be
> as large as the file. It would be nice to get the #ifdefs out of it
> though.
>
> mman.h: Seems to vary but differences may be mostly cosmetic; not
> sure.
>
> msg.h: Same deal as ipc.h.
>
> poll.h: Empty except for mips; generic definitions are in top-level
> poll.h now. With bits dedup we could move them to a generic bits file
> so that top-level doesn't have a nasty #ifndef.
>
> posix.h: Only 2 versions: ILP32 and LP64. They can be generic.
>
> reg.h: Completely arch-specific except in the case of multiple logical
> archs for the same ISA (x32).
>
> resource.h: Same deal as poll.h.
>
> sem.h: Same deal as ipc.h.
>
> setjmp.h: Arch-specific, same as reg.h.
>
> shm.h: Same deal as ipc.h.
>
> signal.h: Arch-specific, and currently omits siginfo_t which is
> gratuitously different on mips (and thus broken). Moving siginfo_t
> into it would add A LOT of duplication and maintenance burden unless
> we have an elaborate bits generation system that can piece these
> headers together from multiple parts so the siginfo_t part can be
> shared by all but mips.
>
> socket.h: The main difference is that workarounds for bogus kernel
> definitions of msghdr and cmsghdr are needed on 64-bit archs. A few
> archs also have their own definitions of some constants which override
> the top-level file's.
>
> stat.h: It varies a lot on current archs, but in principle there's a
> generic stat/stat64 that should be used for all new archs on the
> kernel side, so perhaps we could have a generic one for that.
>
> statfs.h: Mostly generic, but mips and x32 have quirks.
>
> stdarg.h: Not even used except with ancient/broken compilers. Same on
> all archs but i386 where the invalid legacy defs are provided.
> Probably should be dropped entirely.
>
> stdint.h: Purely a matter of 32 vs 64 bit, otherwise totally generic.
>
> syscall.h: Arch-specific, except that new kernel archs should use the
> kernel's generic syscall list, which we can provide as a generic file.
>
> termios.h: Generic except for wacky archs (mips and powerpc).
>
> user.h: Highly arch-specific.
>
>
> The good news is that there are not a lot of places where there's
> value in doing anything elaborate with the deduplication. Just having
> a fixed ordered list of include dirs to search while building, and
> installation rules to pick the first matching one and install it in
> $(includedir)/bits, would probably work.
>
> It's possible that we could eliminate some bits headers entirely by
> having features.h (via a new bits/features.h) expose some parameters
> like endianness, ILP32-vs-LP64, etc. which the top-level headers could
> then use to define things in a non-arch-specific way. I'm not sure
> whether I like doing that though. It simplifies porting and header
> maintenance work, but at the cost of some explicitness whereby you can
> just open the header file (or the bits header file) and see how
> something is defined right away.
>
> A possible compromise is to highly abstract these things at the musl
> source level, but generate flat bits files to install, or even flatten
> the headers completely to remove bits so that all definitions are
> inline and explicit in the top-level headers.
>
> Ideas/requests/preferences/etc.?
>
> Rich
>


View attachment "stdint-generic.h" of type "text/x-chdr" (2322 bytes)
