musl - Re: Considering x86-64 fenv.s to C

Message-ID: <20200118052941.GW30412@brightrain.aerifal.cx>
Date: Sat, 18 Jan 2020 00:29:41 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Considering x86-64 fenv.s to C

On Sat, Jan 18, 2020 at 03:45:27PM +1100, Damian McGuckin wrote:
> >C specifies it as:
> >
> >   "The feraiseexcept function returns zero if the excepts argument
> >   is zero or if all the specified exceptions were successfully
> >   raised. Otherwise, it returns a nonzero value."
> 
> Pretty vague. This is not why the M68K routines return (-1).
> 
> No routine currently checks that the exceptions were successfully
> raised. They assume that a write to the status register works. If we
> are going to check each instruction such as storing into a register
> works, we have a lot of work to do.

If the ISA has instructions to set the status word, then we can assume
they actually work. The point of allowing failure is to admit
different implementation choices: rather than declining to define
exceptions and rounding modes on softfloat ABI variants, we could have
defined them but made the functions that set them always report
failure. I think this would have been a worse choice, which is why it
wasn't done.

In the future, if softfloat implementations add fenv but it's only
conditionally available at runtime depending on [something], then it
would make sense to allow for failure at runtime.

> >>	double to a union and then extract the data as a long.
> >>
> >>		return (union {double f; long i;}) {get_fpscr_f()}.i;
> >>
> >>	Is this style of coding universally accepted within MUSL? From my
> >>	reading of other routines, it is normally done as
> >>
> >>		union {double f; long i;} f = { get_fpscr_f() };
> >>
> >>		return f.i;
> >>
> >>	Just curious.
> >
> >Yes, the compound literal form is preferred since it avoids a
> >gratuitous named variable.
> 
> I would humbly suggest initially using the longer form, and then
> once the architecture of the routine is complete, we revert to the
> compound literal form where the code is simple enough to make it
> possible.

I don't follow why this is better.
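
For reference, the two forms quoted above should produce the same
result. The sketch below puts them side by side. Assumptions: the
get_fpscr_f here is only a stand-in that returns a constant (the real
one would read the FP status/control register), and uint64_t replaces
long so the integer member is 64 bits wide on every target.

```c
#include <stdint.h>

/* Stand-in for the get_fpscr_f() mentioned in the thread. This version
 * is an assumption for illustration and just returns a constant. */
static double get_fpscr_f(void)
{
	return 1.0;
}

/* Compound literal form: no named temporary is introduced. */
static uint64_t fpscr_bits_literal(void)
{
	return (union {double f; uint64_t i;}){get_fpscr_f()}.i;
}

/* Named-variable form: same representation read back, one extra name. */
static uint64_t fpscr_bits_named(void)
{
	union {double f; uint64_t i;} u = { get_fpscr_f() };
	return u.i;
}
```

Both functions reinterpret the bytes of the double through the union,
so the only difference is whether a named object appears in the source.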

> >>x86_64
> >>
> >>*	In assembler
> >>
> >>*	Why does 'feclearexcept' disrespect the flags by clearing ALL x86 bits?
> >
> >As you said above, updating x87 status register is expensive because
> >the only way to write it is to read-modify-write the whole fenv. But
> >since we know on x86_64 we have sse registers, we can just move all
> >the flags to the sse register, then use fnclex to clear the x87 one
> >inexpensively, and the effective set of raised flags remains the same.
> 
> Thanks for the explanation. Neat.
> 
> >I think we could do this on i386 too with a couple tricks:
> >
> >1. Do the same thing if sse is available (hwcap check).
> >
> >2. If sse is not available, clear all flags then re-raise the desired
> >set via arithmetic operations.
> 
> Simple. I like it. But more code.
> 
> Also, playing devil's advocate for a minute ....
> 
> Are we, or should we, be aiming to have
> 
> 	fetestexcept(int excepts)
> 
> and (even also)
> 
> 	feraiseexcept(int excepts)
> 
> being expanded inline so their use does not compromise optimization?

Only if LTO is in use, and then, as you mention, you may run into bugs
where the compiler does not correctly account for the side effects of
floating point operations, so LTO isn't really practical to use until
compilers are fixed. Otherwise we don't put implementations of stuff
like this in public headers.

Rich