Date: Wed, 24 Apr 2019 22:01:08 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] x86: optimize fp_arch.h

On Thu, Apr 25, 2019 at 01:51:06AM +0200, Szabolcs Nagy wrote:
> tested on x86_64 and i386

> From 5f97370ff3e94bea812ec123a31d7482965a3b1b Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@...t70.net>
> Date: Wed, 24 Apr 2019 23:29:05 +0000
> Subject: [PATCH] x86: optimize fp_arch.h
> 
> Use fp register constraint instead of volatile store when sse2 math is
> available, and use memory constraint when only x87 fpu is available.
> ---
>  arch/i386/fp_arch.h   | 31 +++++++++++++++++++++++++++++++
>  arch/x32/fp_arch.h    | 25 +++++++++++++++++++++++++
>  arch/x86_64/fp_arch.h | 25 +++++++++++++++++++++++++
>  3 files changed, 81 insertions(+)
>  create mode 100644 arch/i386/fp_arch.h
>  create mode 100644 arch/x32/fp_arch.h
>  create mode 100644 arch/x86_64/fp_arch.h
> 
> diff --git a/arch/i386/fp_arch.h b/arch/i386/fp_arch.h
> new file mode 100644
> index 00000000..b4019de2
> --- /dev/null
> +++ b/arch/i386/fp_arch.h
> @@ -0,0 +1,31 @@
> +#ifdef __SSE2_MATH__
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+x"(x))
> +#else
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+m"(x))
> +#endif

I guess for float and double you need the "m" constraint to ensure
that a broken compiler doesn't skip dropping of precision (although I
still wish we didn't bother with complexity to support that, and just
relied on cast working correctly), but at least for long double
couldn't we use an x87 register constraint to avoid the spill to
memory?

Rich
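
For illustration, the long double suggestion above could be sketched roughly as follows. This is only a sketch, not code from the patch: the helper name fp_barrierl_sketch is hypothetical, and it assumes a GCC-compatible compiler where the "t" constraint places the operand in st(0), the top of the x87 stack. A volatile store is kept as a fallback for non-x86 targets.

```c
/* Hypothetical helper, not part of the quoted patch: an optimization
 * barrier for long double that keeps the value in an x87 register
 * instead of spilling it to memory. */
static inline long double fp_barrierl_sketch(long double x)
{
#if defined(__i386__) || defined(__x86_64__)
	/* "+t": read-write operand held in st(0). The empty template emits
	 * no instructions; it only stops the compiler from reasoning about
	 * the value across this point. */
	__asm__ __volatile__ ("" : "+t"(x));
#else
	/* Assumed fallback for other targets: force a store and a reload. */
	volatile long double y = x;
	x = y;
#endif
	return x;
}
```

Because x87 registers already hold long double at full width, no precision needs to be dropped here, which is why a register constraint should suffice for this type while float and double keep the "m" constraint.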
