Date: Wed, 24 Apr 2019 22:01:08 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] x86: optimize fp_arch.h

On Thu, Apr 25, 2019 at 01:51:06AM +0200, Szabolcs Nagy wrote:
> tested on x86_64 and i386
>
> >From 5f97370ff3e94bea812ec123a31d7482965a3b1b Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@...t70.net>
> Date: Wed, 24 Apr 2019 23:29:05 +0000
> Subject: [PATCH] x86: optimize fp_arch.h
>
> Use fp register constraint instead of volatile store when sse2 math is
> available, and use memory constraint when only x87 fpu is available.
> ---
>  arch/i386/fp_arch.h   | 31 +++++++++++++++++++++++++++++++
>  arch/x32/fp_arch.h    | 25 +++++++++++++++++++++++++
>  arch/x86_64/fp_arch.h | 25 +++++++++++++++++++++++++
>  3 files changed, 81 insertions(+)
>  create mode 100644 arch/i386/fp_arch.h
>  create mode 100644 arch/x32/fp_arch.h
>  create mode 100644 arch/x86_64/fp_arch.h
>
> diff --git a/arch/i386/fp_arch.h b/arch/i386/fp_arch.h
> new file mode 100644
> index 00000000..b4019de2
> --- /dev/null
> +++ b/arch/i386/fp_arch.h
> @@ -0,0 +1,31 @@
> +#ifdef __SSE2_MATH__
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+x"(x))
> +#else
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+m"(x))
> +#endif

I guess for float and double you need the "m" constraint to ensure
that a broken compiler doesn't skip dropping of precision (although I
still wish we didn't bother with complexity to support that, and just
relied on cast working correctly), but at least for long double
couldn't we use an x87 register constraint to avoid the spill to
memory?

Rich
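
For readers following the thread, the suggestion above could look roughly like the sketch below. It is not taken from the patch: the macro names FP_BARRIER_LD and FP_BARRIER_MEM and the function fp_barrier_demo are hypothetical, and the sketch assumes GCC-style inline asm, where "t" is the x86 constraint for the top of the x87 register stack. The reasoning is that a long double held in an x87 register is already in its own format, so a register barrier has no excess precision to drop, while float and double on x87-only targets still go through memory so that the store rounds to the declared type.

```c
/* Hypothetical names for illustration only; not part of the quoted patch. */
#if defined(__i386__) || defined(__x86_64__)
/* "+t": keep the value in st(0), the top of the x87 stack (GCC constraint). */
#define FP_BARRIER_LD(x) __asm__ __volatile__ ("" : "+t"(x))
#else
/* Assumption: on other targets, fall back to the memory form. */
#define FP_BARRIER_LD(x) __asm__ __volatile__ ("" : "+m"(x))
#endif

/* "+m": force the value through memory, as in the patch's x87-only branch. */
#define FP_BARRIER_MEM(x) __asm__ __volatile__ ("" : "+m"(x))

/* The barrier only hides the value from the optimizer; it must not change it. */
long double barrier_ld(long double x)
{
	FP_BARRIER_LD(x);
	return x;
}

double barrier_mem(double x)
{
	FP_BARRIER_MEM(x);
	return x;
}

/* Returns 1 if both barriers leave exactly representable values unchanged. */
int fp_barrier_demo(void)
{
	long double ld = 1.5L;
	double d = 0.25;
	FP_BARRIER_LD(ld);
	FP_BARRIER_MEM(d);
	return ld == 1.5L && d == 0.25;
}
```

Whether the register form actually avoids a spill depends on the compiler and on the surrounding code, so this only shows the shape of the constraint, not a measured improvement.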