Date: Wed, 8 Dec 2021 10:43:05 +0200
From: Stijn Tintel <stijn@...ux-ipv6.be>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On 7/12/2021 02:59, Rich Felker wrote:
> On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
>> * Stijn Tintel:
>>
>>> diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
>>> index 37683fda..32853693 100644
>>> --- a/src/setjmp/powerpc64/setjmp.s
>>> +++ b/src/setjmp/powerpc64/setjmp.s
>>> @@ -69,7 +69,17 @@ __setjmp_toc:
>>>      stfd 30, 38*8(3)
>>>      stfd 31, 39*8(3)
>>>
>>> -    # 5) store vector registers v20-v31
>>> +    # 5) store vector registers v20-v31 if hardware supports AltiVec
>>> +    mflr 0
>>> +    bl 1f
>>> +    .hidden __hwcap
>>> +    .long __hwcap-.
>>> +1:  mflr 4
>> This de-balances the return stack and probably has quite severe
>> performance impact.  The ISA manual says to use
>>
>>   bcl 20,31,$+4
>>
>> and you'll have to store the __hwcap offset somewhere else.
> To begin with, let's change the .s files to .S files and put the whole
> branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> normal builds with an ISA level where Altivec can be assumed to be
> present.
>
> I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> works, but if there's a less expensive solution along those lines
> that's compatible with all ISA levels, by all means let's use it. The
> same could be done for powerpc-sf (32-bit) and its SPE branches, too.
>
> Also the add and lwz can be fused into lwzx (indexed load).
>
The code for ppc64 uses ld after add, not lwz. This is required to make
it work on both big and little endian systems. We therefore cannot use
lwzx, but have to use ldx.

Stijn
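
To illustrate the endianness point, here is a minimal C sketch (not part
of the patch). It assumes __hwcap is a 64-bit object on ppc64 and that
the AltiVec bit is 0x10000000, the value of PPC_FEATURE_HAS_ALTIVEC in
the Linux headers. The helper names are hypothetical. A 4-byte load from
the base address stands in for lwz/lwzx, and a full 8-byte load stands in
for ld/ldx.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed AltiVec bit in the hwcap word (PPC_FEATURE_HAS_ALTIVEC). */
#define HWCAP_ALTIVEC UINT64_C(0x10000000)

/* Stand-in for lwz/lwzx: read only the first 4 bytes of the 64-bit object. */
uint32_t load_word_at_base(const uint64_t *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Stand-in for ld/ldx: read all 8 bytes. */
uint64_t load_doubleword(const uint64_t *p)
{
    return *p;
}

/* Returns 1 if the lowest-addressed byte holds the least significant bits. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 when the 4-byte load at the base address still sees the bit. */
int word_load_sees_altivec(void)
{
    const uint64_t hwcap = HWCAP_ALTIVEC;
    /* The 64-bit load sees the bit regardless of byte order. */
    assert(load_doubleword(&hwcap) & HWCAP_ALTIVEC);
    return (load_word_at_base(&hwcap) & HWCAP_ALTIVEC) != 0;
}
```

On a little-endian host the first four bytes are the low word, so
word_load_sees_altivec() returns 1. On a big-endian host they are the
high word, which is zero here, so the bit would be missed. The 64-bit
load avoids the difference.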