Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 27 Mar 2016 22:18:56 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 2/2] add powerpc64 port

On Sun, Mar 27, 2016 at 07:32:20PM -0500, Bobby Bingham wrote:
> On Sun, Mar 27, 2016 at 07:37:09PM -0400, Rich Felker wrote:
> > On Sun, Mar 27, 2016 at 04:20:19PM -0500, Bobby Bingham wrote:
> > > diff --git a/arch/powerpc64/bits/endian.h b/arch/powerpc64/bits/endian.h
> > > new file mode 100644
> > > index 0000000..2016cb2
> > > --- /dev/null
> > > +++ b/arch/powerpc64/bits/endian.h
> > > @@ -0,0 +1,5 @@
> > > +#if __BIG_ENDIAN__
> > > +#define __BYTE_ORDER __BIG_ENDIAN
> > > +#else
> > > +#define __BYTE_ORDER __LITTLE_ENDIAN
> > > +#endif
> > 
> > This should probably use whatever the official psABI macro for PPC BE
> > is rather than __BIG_ENDIAN__ (which I think is a confusingly-named
> > GCC thing; it's not clear whether it means "is big endian" or "value
> > that indicates big endian").
> 
> The ABI spec is at https://members.openpowerfoundation.org/document/dl/576.
> Section 5.1.4 documents __BIG_ENDIAN__ and __LITTLE_ENDIAN__ for this
> purpose.

OK, __BIG_ENDIAN__ it is then.

> > > +} mcontext_t;
> > > +
> > > +#else
> > > +
> > > +typedef struct {
> > > +	long __regs[4+4+48+33+1+34+34+32+1];
> > > +} mcontext_t;
> > > +
> > > +#endif
> > 
> > Did you check that the layout of the namespace-safe and full mcontex_t
> > match?
> 
> Yes, see the breakdown above.

OK, looks good. I might mildly prefer using int and void * for the
actual slots of those types, but from a practical standpoint it
doesn't matter.

> > > +#define SA_NOCLDSTOP  1U
> > > +#define SA_NOCLDWAIT  2U
> > > +#define SA_SIGINFO    4U
> > > +#define SA_ONSTACK    0x08000000U
> > > +#define SA_RESTART    0x10000000U
> > > +#define SA_NODEFER    0x40000000U
> > > +#define SA_RESETHAND  0x80000000U
> > > +#define SA_RESTORER   0x04000000U
> > 
> > Is there a reason for making these unsigned? It's different from other
> > archs at least, I think.
> 
> It's the same as the ppc32 port.

OK. Maybe it should be changed but that's a separate issue.

> > > diff --git a/arch/powerpc64/crt_arch.h b/arch/powerpc64/crt_arch.h
> > > new file mode 100644
> > > index 0000000..0605511
> > > --- /dev/null
> > > +++ b/arch/powerpc64/crt_arch.h
> > > @@ -0,0 +1,21 @@
> > > +__asm__(
> > > +".text \n"
> > > +".global " START " \n"
> > > +".type   " START ", %function \n"
> > > +START ": \n"
> > > +"	addis 2, 12, .TOC.-" START "@ha \n"
> > > +"	addi  2,  2, .TOC.-" START "@l \n"
> > 
> > What does this do? It looks like canonical function prologue of some
> > sort, but _start is not a C function and unless r12 is part of the ELFv2
> > entry point ABI that I'm not aware of, I don't think it's meaningful.
> > AFAICT r2 is not subsequently used.
> 
> In the ABI, C functions have two entry points, the "global" entry point,
> and the "local" entry point.  The local entry point expects the "table of
> contents" pointer in r2.  The global entry point expects the address of
> global entry point itself in r12, uses the prologue I used here to
> calculate the TOC pointer from it, and falls through to the local entry
> point.  See section 2.3.2.1 of the ABI spec linked above.
> 
> Calls within the same module are resolved to the local entry point, so
> we need to ensure r2 is set up with the TOC pointer here to call any C
> code.
> 
> The ABI spec does specify that r12 contains the address of _start,
> specifically so this prologue can work (section 4.1.2.1).

OK. In that case the subsequent code to compute the address of
_DYNAMIC via PC-relative addressing using the return address is
unnecessary, though; you already have the initial PC and can just
compute _DYNAMIC relative to it.

Also, for dynamic-linked programs, the main program's entry point is
reached via CRTJMP() from the dynamic linker, and your definition is
not setting up r12. In order to meet this part of the ELF ABI it needs
to set r12.

One thing that's not clear to me is whether the jumps/calls from asm,
as you've written them, go to the local entry point or the global
entry point of the callee. Since you're not loading r12 I assume the
intent is to go to the local entry point, but that only works if the
callee is hidden. Or does the PLT take care of converting that for you
(as long as r2 is set)?

One place that looks wrong to me is sigsetjmp. I would expect r2 to be
lost/clobbered when setjmp is called, but maybe not since it looks
like setjmp saves it in the jmp_buf, despite it not being a call-saved
register. Is that your expectation?

> > > diff --git a/arch/powerpc64/reloc.h b/arch/powerpc64/reloc.h
> > > new file mode 100644
> > > index 0000000..8e60b31
> > > --- /dev/null
> > > +++ b/arch/powerpc64/reloc.h
> > > @@ -0,0 +1,32 @@
> > > +#include <endian.h>
> > > +
> > > +#if __BYTE_ORDER == __LITTLE_ENDIAN
> > > +#define ENDIAN_SUFFIX "le"
> > > +#else
> > > +#define ENDIAN_SUFFIX ""
> > > +#endif
> > > +
> > > +#define LDSO_ARCH "powerpc64" ENDIAN_SUFFIX
> > 
> > Is it intentional that the "default" subarch variant be suffixed with
> > "le" and a non-default/unused one be the bare "powerpc64"? I don't
> > object to that but it's contrary to usual conventions that the bare
> > arch be the canonical ABI, and it might be contrary to what GCC is
> > doing now (haven't checked) for the dynamic linker name.
> 
> I'll double check, but I skimmed the gcc code here, and it looked like
> it uses an "le" suffix.  Admittedly, I didn't read it closely enough to
> be absolutely sure yet.

OK, sounds good. Also note that CRTJMP here needs fixing.

> > > diff --git a/arch/powerpc64/syscall_arch.h b/arch/powerpc64/syscall_arch.h
> > > new file mode 100644
> > > index 0000000..aee11eb
> > > --- /dev/null
> > > +++ b/arch/powerpc64/syscall_arch.h
> > > @@ -0,0 +1,5 @@
> > > +#define __SYSCALL_LL_E(x) (x)
> > > +#define __SYSCALL_LL_O(x) (x)
> > > +
> > > +#undef SYSCALL_NO_INLINE
> > > +#define SYSCALL_NO_INLINE
> > 
> > Is this just unfinished or is there a reason inline syscalls don't
> > work well?
> 
> Just unfinished.  ppc32 is the same here.  I can add these.

OK.

> > > diff --git a/src/signal/powerpc64/sigsetjmp.s b/src/signal/powerpc64/sigsetjmp.s
> > > new file mode 100644
> > > index 0000000..ce59b60
> > > --- /dev/null
> > > +++ b/src/signal/powerpc64/sigsetjmp.s
> > > @@ -0,0 +1,30 @@
> > > +	.global sigsetjmp
> > > +	.global __sigsetjmp
> > > +	.type sigsetjmp,%function
> > > +	.type __sigsetjmp,%function
> > > +	.hidden ___setjmp
> > > +sigsetjmp:
> > > +__sigsetjmp:
> > > +	addis 2, 12, .TOC.-__sigsetjmp@ha
> > > +	addi  2,  2, .TOC.-__sigsetjmp@l
> > > +	.localentry sigsetjmp,.-sigsetjmp
> > > +	.localentry __sigsetjmp,.-__sigsetjmp
> > 
> > Again I don't see what the purpose of these insns is; if the resulting
> > value is needed, are you aware of how that interacts with ___setjmp
> > returning twice?
> 
> This sets up r2 with the TOC pointer, as is required by the ABI in order
> to call setjmp's local entry point.  Since setjmp is also written in asm,
> we could do away with this here.
> 
> I don't think the fact that setjmp returns twice matters for this.

When setjmp returns the second time, all registers it did not save
have been clobbered (by arbitrary code that ran after the first return
from setjmp). However despite not being a call-saved register
(AFAICT), r2 is saved by setjmp, so it's probably okay.

> > It looks to me like the assumption is that r12 contains the address of
> > the callee at the time of a function call, but I don't see how that's
> > satisfied in the calls I've seen.
> 
> The instructions before the .localentry directive are the global entry
> point.  The ABI requires that r12 be the address of the callee whenever
> the global entry point of a function is called.

*nod*

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.