Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 7 Dec 2021 15:29:21 -0500
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: Florian Weimer <fweimer@...hat.com>, musl@...ts.openwall.com
Subject: Re: [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Tue, Dec 07, 2021 at 09:15:05PM +0100, Markus Wichmann wrote:
> On Tue, Dec 07, 2021 at 08:28:28PM +0100, Florian Weimer wrote:
> > We do have source code for one implementation.
> >
> > | -- bcl 20,31,$+4 is special case.  not a subroutine call, used to get next instruction address, should not be placed on link stack.
> > | iu4_bo_d( 6 to 10)              <= iu3_instr_pri( 6 to 10);
> > | iu4_bi_d(11 to 15)              <= iu3_instr_pri(11 to 15);
> > |
> > | iu4_getNIA                      <= iu4_opcode_q(0 to 5)         = "010000"      and
> > |                                    iu4_bo_q(6 to 10)            = "10100"       and
> > |                                    iu4_bi_q(11 to 15)           = "11111"       and
> > |                                    iu4_bd(EFF_IFAR'left to 61)  = 1             and
> > |                                    iu4_aa_q                     = '0'           and
> > |                                    iu4_lk_q                     = '1'           ;
> >
> > <https://github.com/openpower-cores/a2i/blob/96299300abca65a074c635204a163e10569ee9b7/rel/src/vhdl/work/iuq_bp.vhdl#L880>
> >
> > I suspect “iu4_bd(EFF_IFAR'left to 61) = 1” matches 4 exactly (the
> > lowest four bits of the offset are not encoded in the instruction
> > because they are always zero).  But I don't know any VHDL.
> >
> 
> Me neither but I do recognize a few of those words. The opcode obviously
> refers to the most significant six bits, encoding the primary opcode,
> and "bo", "bi", "bd", "aa", and "lk" are what the PPC books call the
> various fields of this particular instruction (that being "bc", branch
> conditional). So this matches exactly the "+4" form of the instruction
> discussed so far.

Thanks for digging this up!

> BTW, musl's PPC code contains a few more instances of getting NIA with
> "bl", in the CRT code and in GETFUNCSYM() at least.  So if we're
> spending this much time finding out the optimal way to get the NIA, we
> should probably do the same there, for consistency if nothing else.

In general I would prefer the "obvious what it's doing" form over the
"special cased for performance" form in places where performance can't
matter -- for example, the ones you cited that execute once per
program invocation. But if it's easy to read either way, fine -- and
it probably can be made so.

Note that if the __hwcap-. constant is moved out of line, I think it's
possible to avoid any added cost. Something along the lines of the
following:

	bcl 20,31,1f
1:	mflr 4
	lwz 5,2f-1b(4)
	lwzx 4,4,5
	...
2:	.long __hwcap-1b

Does this look right?

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.