Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 24 Jan 2020 08:44:02 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Considering x86-64 fenv.s to C

On Fri, Jan 24, 2020 at 05:08:54PM +1100, Damian McGuckin wrote:
> On Thu, 23 Jan 2020, Rich Felker wrote:
> 
> >Note that it's not necessary to write that logic in asm just because
> >it's in the arch-defined part. It could simply be C that fixes up the
> >value before/after the asm.
> 
> My rule was to have no logic in the embedded assembler. Just get or
> set the register and quit. Your skill levels may let you do more. I
> tried to keep these routine super simple.  Here they are for the
> PowerPC.
> 
> 	static inline double get_fpscr_f(void)
> 	{
> 		double fpscr;
> 
> 		__asm__ __volatile__ ("mffs %0" : "=d"(fpscr));
> 		return fpscr;
> 	}
> 	static inline unsigned int get_csr(void)
> 	{
> 		return (unsigned int) (union {double f; long i;}) {get_fpscr_f()}.i;
> 	}
> 	static inline void set_fpscr_f(double fpscr)
> 	{
> 		__asm__ __volatile__ ("mtfsf 255, %0" : : "d"(fpscr));
> 	}
> 	static inline void set_csr(unsigned int fpscr)
> 	{
> 		set_fpscr_f((union {long i; double f;}) {(long) fpscr}.f);
> 	}

I would drop the _f functions; they're not useful separately and
without them you can do something like:

	static inline unsigned get_fpscr_f(void)
	{
		union { double f; long i; } fpscr;
		__asm__ __volatile__ ("mffs %0" : "=d"(fpscr.f));
		return fpscr.i;
	}

that seems both shorter and easier to follow.

> >But indeed the following may be nicer anyway:
> .......
> >>	if (excepts & FE_INVALID)
> >>		excepts |= FE_ALL_INVALID;
> >>and
> >>	if (excepts & FE_INVALID)
> >>		excepts |= FE_INVALID_SOFTWARE;
> >>
> >>An optimize should see the OR with zero for most architectures and
> >>then ignore that whole 'if' statement because the action was a
> >>NO-OP.
> >
> >Yes, this looks nice (although I would probably include FE_INVALID
> >in FE_ALL_INVALID just out of principle of least surprise; it
> >should generate the same code either way.
> 
> Do you mean
> 
>  	if (excepts & FE_INVALID)
>  		excepts |= FE_ALL_INVALID | FE_INVALID;
> 
> If so, will it really be optimized away if FE_ALL_INVALID is zero (0)?

No, I meant so that FE_ALL_INVALID is defined as FE_INVALID rather
than as 0 by default (when the arch doesn't define it). I don't think
it makes any difference to the code emitted, but the value 0 is rather
confusing with the name being "ALL". Without knowing to expect
otherwise, I read:

#ifndef FE_ALL_INVALID
#define FE_ALL_INVALID 0
#endif

as assuming there are no invalid flags at all (not even FE_INVALID) if
the arch didn't define FE_ALL_INVALID. Whereas I read:

#ifndef FE_ALL_INVALID
#define FE_ALL_INVALID FE_INVALID
#endif

as assuming FE_INVALID is the only invalid flag if if the arch didn't
define otherwise.

> >feclearexcept would do:
> >
> >	soft_fpsr = (soft_fpsr | get_sr()) & ~except;
> >	clear_sr();
> >
> >At that point, FE_INVALID is clearable independent of the
> >specific-reason flags and doesn't seem to need any special treatment.
> 
> ??? - I need to wrap my head around that one over the weekend.

The underlying idea is that it makes no difference whether the flag
bits are in the hard or soft fpsr. The logical value is
soft_sr|get_sr(). What the above does is move the entire logical value
to soft_fpsr so that the value in hard_fpsr can be cleared.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.