Date: Tue, 7 Dec 2021 15:29:21 -0500
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: Florian Weimer <fweimer@...hat.com>, musl@...ts.openwall.com
Subject: Re: [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Tue, Dec 07, 2021 at 09:15:05PM +0100, Markus Wichmann wrote:
> On Tue, Dec 07, 2021 at 08:28:28PM +0100, Florian Weimer wrote:
> > We do have source code for one implementation.
> >
> > | -- bcl 20,31,$+4 is special case. not a subroutine call, used to get next instruction address, should not be placed on link stack.
> > | iu4_bo_d( 6 to 10) <= iu3_instr_pri( 6 to 10);
> > | iu4_bi_d(11 to 15) <= iu3_instr_pri(11 to 15);
> > |
> > | iu4_getNIA <= iu4_opcode_q(0 to 5) = "010000" and
> > |               iu4_bo_q(6 to 10) = "10100" and
> > |               iu4_bi_q(11 to 15) = "11111" and
> > |               iu4_bd(EFF_IFAR'left to 61) = 1 and
> > |               iu4_aa_q = '0' and
> > |               iu4_lk_q = '1' ;
> >
> > <https://github.com/openpower-cores/a2i/blob/96299300abca65a074c635204a163e10569ee9b7/rel/src/vhdl/work/iuq_bp.vhdl#L880>
> >
> > I suspect “iu4_bd(EFF_IFAR'left to 61) = 1” matches 4 exactly (the
> > lowest four bits of the offset are not encoded in the instruction
> > because they are always zero). But I don't know any VHDL.
> >
>
> Me neither but I do recognize a few of those words. The opcode obviously
> refers to the most significant six bits, encoding the primary opcode,
> and "bo", "bi", "bd", "aa", and "lk" are what the PPC books call the
> various fields of this particular instruction (that being "bc", branch
> conditional). So this matches exactly the "+4" form of the instruction
> discussed so far.

Thanks for digging this up!

> BTW, musl's PPC code contains a few more instances of getting NIA with
> "bl", in the CRT code and in GETFUNCSYM() at least. So if we're
> spending this much time finding out the optimal way to get the NIA, we
> should probably do the same there, for consistency if nothing else.

In general I would prefer the "obvious what it's doing" form over the
"special cased for performance" form in places where performance can't
matter -- for example, the ones you cited that execute once per
program invocation. But if it's easy to read either way, fine -- and
it probably can be made so.

Note that if the __hwcap-. constant is moved out of line, I think it's
possible to avoid any added cost. Something along the lines of the
following:

	bcl 20,31,1f
1:	mflr 4
	lwz 5,2f-1b(4)
	lwzx 4,4,5
	...
2:	.long __hwcap-1b

Does this look right?

Rich
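
The field match quoted from the A2I source can be checked against the actual encoding of "bcl 20,31,$+4". What follows is a minimal sketch in C, not code from musl or from the A2I core: the names bform, decode_bform, bform_offset and is_get_nia are made up for illustration. It assumes the standard B-form layout (primary opcode in the top six bits, then BO, BI, a 14-bit BD, AA and LK) and that the 14-bit BD field holds the byte displacement with its two low zero bits dropped.

```c
#include <stdint.h>

/* Fields of a PowerPC B-form (bc) instruction, in IBM bit numbering
 * where bit 0 is the most significant bit of the 32-bit word.
 * All names here are hypothetical, chosen for this sketch only. */
struct bform {
    unsigned opcode; /* bits 0-5: primary opcode, 16 for bc */
    unsigned bo;     /* bits 6-10 */
    unsigned bi;     /* bits 11-15 */
    unsigned bd;     /* bits 16-29: displacement in words */
    unsigned aa;     /* bit 30: absolute address flag */
    unsigned lk;     /* bit 31: update the link register */
};

static struct bform decode_bform(uint32_t insn)
{
    struct bform f;
    f.opcode = insn >> 26;
    f.bo     = (insn >> 21) & 0x1f;
    f.bi     = (insn >> 16) & 0x1f;
    f.bd     = (insn >> 2) & 0x3fff;
    f.aa     = (insn >> 1) & 1;
    f.lk     = insn & 1;
    return f;
}

/* Byte displacement: BD shifted left by two, sign-extended from 16 bits. */
static int32_t bform_offset(struct bform f)
{
    return (int32_t)(((f.bd << 2) ^ 0x8000u) - 0x8000u);
}

/* Same conditions as the quoted iu4_getNIA expression:
 * opcode "010000", BO "10100", BI "11111", BD = 1, AA = 0, LK = 1. */
static int is_get_nia(uint32_t insn)
{
    struct bform f = decode_bform(insn);
    return f.opcode == 16 && f.bo == 20 && f.bi == 31 &&
           f.bd == 1 && f.aa == 0 && f.lk == 1;
}
```

With these definitions, is_get_nia(0x429f0005) is true (that word is the encoding of bcl 20,31,$+4, with a BD field of 1 and therefore a byte offset of 4), while a plain "bl .+4" (0x48000005) fails the opcode test and "bcl 20,31,$+8" (0x429f0009) fails the BD test.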