Date: Tue, 7 Dec 2021 08:33:44 +0000
From: "Quesada Gonzalez, Elena" <elena.quesada_gonzalez@...mens.com>
To: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: List-Unsubscribe

-----Original Message-----
From: David Edelsohn <dje.gcc@...il.com>
Sent: Tuesday, December 7, 2021 2:45
To: Rich Felker <dalias@...c.org>
CC: musl@...ts.openwall.com; Florian Weimer <fweimer@...hat.com>; Stijn Tintel <stijn@...ux-ipv6.be>
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Mon, Dec 6, 2021 at 8:39 PM Rich Felker <dalias@...c.org> wrote:
>
> On Mon, Dec 06, 2021 at 08:15:48PM -0500, David Edelsohn wrote:
> > On Mon, Dec 6, 2021 at 7:59 PM Rich Felker <dalias@...c.org> wrote:
> > >
> > > On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> > > > * Stijn Tintel:
> > > >
> > > > > diff --git a/src/setjmp/powerpc64/setjmp.s
> > > > > b/src/setjmp/powerpc64/setjmp.s index 37683fda..32853693
> > > > > 100644
> > > > > --- a/src/setjmp/powerpc64/setjmp.s
> > > > > +++ b/src/setjmp/powerpc64/setjmp.s
> > > > > @@ -69,7 +69,17 @@ __setjmp_toc:
> > > > >       stfd 30, 38*8(3)
> > > > >       stfd 31, 39*8(3)
> > > > >
> > > > > -     # 5) store vector registers v20-v31
> > > > > +     # 5) store vector registers v20-v31 if hardware supports AltiVec
> > > > > +     mflr 0
> > > > > +     bl 1f
> > > > > +     .hidden __hwcap
> > > > > +     .long __hwcap-.
> > > > > +1:   mflr 4
> > > >
> > > > This de-balances the return stack and probably has quite severe
> > > > performance impact. The ISA manual says to use
> > > >
> > > >   bcl 20,31,$+4
> > > >
> > > > and you'll have to store the __hwcap offset somewhere else.
> > >
> > > To begin with, let's change the .s files to .S files and put the
> > > whole branch logic inside #ifndef __ALTIVEC__ so that it does not
> > > impact normal builds with an ISA level where Altivec can be
> > > assumed to be present.
> > >
> > > I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> > > works, but if there's a less expensive solution along those lines
> > > that's compatible with all ISA levels, by all means let's use it.
> > > The same could be done for powerpc-sf (32-bit) and its SPE branches, too.
> >
> > bl = branch and link
> > bcl = branch conditional and link
> >
> > link means place the next instruction address in the link register.
> > Normally a branch and link would be used for a matching "return"
> > instruction, but in this case it is being used to compute a position
> > independent code address. As Florian correctly points out, the "bl"
> > will corrupt the link stack in the processor used to predict return
> > addresses and the recommended sequence is the one that he suggests.
> >
> > bcl 20,31,addr
> >
> > which means branch always and, because the condition register bits
> > are irrelevant, a special value that instructs the processor to not
> > push the address onto the link stack so that the "calls" and "returns"
> > remain matched.
>
> Thanks. Am I correct in understanding then that we don't need $+4, but
> can instead use the 1f just as now, with inline .long __hwcap-. -- in
> other words that "bcl 20,31," is a drop-in replacement for "bl"
> without the link stack impact?

It should work, but it's slightly preferred to use $+4 because one
explicitly wants the address of the next instruction and labels of the
form "1f" are not supported by all assemblers.

Thanks, David