musl - Moving forward with sh2/nommu

Message-ID: <20150601151107.GA20759@brightrain.aerifal.cx>
Date: Mon, 1 Jun 2015 11:11:07 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Cc: rob@...dley.net
Subject: Moving forward with sh2/nommu

[resent to musl list]

Here's a summary of the issues we need to work through to get a modern
SH2/nommu-targeted musl/toolchain out of the proof-of-concept stage
and to the point where it's something people can use roughly 'out of
the box':

Kernel issues:

1. Kernel should support loading plain ELF directly, unmodified. Right
   now I'm writing 0x81 to byte 38 of the header to make a "non-FDPIC
   FDPIC ELF binary", which works, but I have to make a personality()
   syscall at startup to switch back (this matters to kernel signal
   handling) and it's just highly inconvenient/ugly.

   Despite plain ELF being suitable for NOMMU, the loader
   implementation in binfmt_elf.c depends pretty heavily on MMU. The
   one in binfmt_elf_fdpic.c can work on either. The easiest way
   forward is to make it so that binfmt_elf_fdpic.c does not insist on
   having the FDPIC flags in the ELF header on NOMMU targets (where it
   won't conflict with binfmt_elf.c since that loader isn't usable).

2. Kernel insists on having a stack size set in the PT_GNU_STACK
   program header; if it's 0 (the default ld produces) then execve
   fails. It should just provide a default, probably 128k (equal to
   MMU-ful Linux).

3. Kernel uses the stack for brk too, growing brk from the opposite
   end. This is horribly buggy/dangerous. Just dummying out brk to
   always-fail is what should be done, but if it can't be done on the
   kernel side musl can do it instead (that's what I'm doing now).
   Unfortunately I suspect fixing this might be controversial since
   there may be existing binaries using brk that can't fall back to
   mmap.

4. Syscall trap numbers differ on SH2 vs SH3/4. Presumably the reason
   is that these two SH2A hardware traps overlap with the syscall
   range used by SH3/4 ABI:

	#  define TRAP_DIVZERO_ERROR  17
	#  define TRAP_DIVOVF_ERROR   18

   The path forward I'd like to see is deprecating everything but trap
   numbers 22 and 38, which, as far as I can tell, are safe for both
   the SH2 and SH3/4 kernel to treat as syscalls. These numbers
   indicate "6 arguments"; there is no good reason to encode the
   number of arguments in the trap number, so we might as well just
   always use the "6 argument" code which is what the variadic
   syscall() has to use anyway. User code should then aim to use the
   correct value (22 or 38) for the model it's running on (SH3/4 or
   SH2) for compatibility with old kernels, but will still run safely
   on new kernels if it detects the wrong model.
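
To make the proposal in kernel issue 4 concrete, here is a minimal sketch of how user code might select the trap number. The enum and function names are hypothetical and appear nowhere in musl or the kernel. Only the values 22 and 38, and their pairing with SH3/4 and SH2 respectively, come from the text above. How the CPU model is actually detected at runtime is left open.

```c
/* Hypothetical model tag; a real implementation would set this from
 * whatever runtime detection mechanism ends up being used. */
enum sh_model { SH_MODEL_SH2, SH_MODEL_SH34 };

/* Return the "6 argument" syscall trap number for the given model:
 * 38 on SH2, 22 on SH3/4. Under the proposal above, a new kernel would
 * accept either value, so a wrong guess is only a problem on old
 * kernels. */
static int syscall_trap_number(enum sh_model model)
{
	return model == SH_MODEL_SH2 ? 38 : 22;
}
```

With only two values to choose from, the result of the detection can be computed once at startup and cached, rather than re-evaluated at every syscall site.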

Toolchain issues:

1. We need static-PIE (with or without TEXTRELs) which gcc does not
   support out of the box. I have complex command lines that produce
   static-PIE, and I have specfile based recipes to convert a normal
   toolchain to produce (either optionally or by default) static-PIE,
   but these recipes conflict with using the same toolchain to build
   the kernel. If static-PIE were integrated properly upstream that
   would not be an issue.

2. Neither binutils nor gcc accepts "sh2eb-linux" as a target. Trying
   to hack it in got me a little-endian toolchain. I'm currently just
   using "sheb" and -m2 to use sh2 instructions that aren't in sh1.

3. The complex math functions cause ICE in all gcc versions I've tried
   targeting SH2. For now we can just remove src/complex from musl,
   but that's a hack. The cause of this bug needs to be found and
   fixed in GCC.

musl issues:

1. We need runtime detection for the right trap number to use for
   syscalls. Right now I've got the trap numbers hard-coded for SH2 in
   my local tree.

2. We need additional runtime detection options for atomics: interrupt
   masking for plain SH2, and the new CAS instruction for SH2J.

3. We need sh/vfork.s since the default vfork.c just uses fork, which
   won't work. I have a version locally but it doesn't make sense to
   commit without runtime trap number selection.

4. As long as we're using the FDPIC ELF header flag to get
   binfmt_elf_fdpic.c to load binaries, the startup code needs to call
   the personality() syscall to switch back. I have a local hack for
   doing this in rcrt1.o which is probably not worth upstreaming if we
   can just make the kernel do it right.

5. The brk workaround I'm doing now can't be upstreamed without a
   reliable runtime way to distinguish nommu. To put it in malloc.c
   this would have to be a cross-arch solution. What might make more
   sense is putting it in syscall_arch.h for sh, where we already
   have to check for SH2 to determine the right trap number; the
   inline syscall code can just do if (nr==SYS_brk&&IS_SH2) return 0;
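
The check in musl issue 5 could look roughly like the sketch below. Everything here is a stand-in: `is_sh2` plays the role of the `IS_SH2` test, `SKETCH_SYS_brk` stands for `SYS_brk` (45 is assumed for illustration), and `real_syscall1` is a placeholder for the actual trap-based syscall, which is not reproduced here. The only part taken from the message is the short-circuit itself, which returns 0 for brk on SH2.

```c
/* Assumed value for illustration; real code would use SYS_brk from the
 * arch syscall headers. */
#define SKETCH_SYS_brk 45

/* Hypothetical flag, to be set by runtime SH2 detection. */
static int is_sh2;

/* Placeholder for the real inline trap-based syscall. It returns a
 * fixed negative value so the sketch can be exercised without a
 * kernel. */
static long real_syscall1(long nr, long a)
{
	(void)nr;
	(void)a;
	return -38;
}

/* One-argument syscall wrapper with the brk short-circuit: on SH2, brk
 * never reaches the kernel and always returns 0. */
static long sketch_syscall1(long nr, long a)
{
	if (nr == SKETCH_SYS_brk && is_sh2) return 0;
	return real_syscall1(nr, a);
}
```

Keeping the check in the arch-specific syscall layer means malloc.c and other cross-arch code need no nommu-specific knowledge, which is the motivation given above.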