Message-ID: <5C60D05C95724A36B3DB9942D06CFE5F@H270>
Date: Wed, 11 Aug 2021 00:53:37 +0200
From: "Stefan Kanthak" <stefan.kanthak@...go.de>
To: "Szabolcs Nagy" <nsz@...t70.net>
Cc: <musl@...ts.openwall.com>
Subject: Re: [PATCH] Properly simplified nextafter()

Szabolcs Nagy <nsz@...t70.net> wrote:
>* Stefan Kanthak <stefan.kanthak@...go.de> [2021-08-10 08:23:46 +0200]:
>> <https://git.musl-libc.org/cgit/musl/plain/src/math/nextafter.c>
>> has quite some superfluous statements:
>>
>> 1. there's absolutely no need for 2 uint64_t holding |x| and |y|;
>> 2. IEEE-754 specifies -0.0 == +0.0, so (x == y) is equivalent to
>> (ax == 0) && (ay == 0): the latter 2 tests can be removed;
>
> you replaced 4 int cmps with 4 float cmps (among other things).
and hinted that the result of the second pair of comparisons is
already known from the first pair.
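
A minimal sketch of point 2, for illustration only: +0.0 and -0.0 differ in their bit patterns but compare equal as doubles, so for non-NaN inputs a single (x == y) covers both the bit-identical case and the both-zero case that the integer code tests separately. The helper names bits_of, equal_by_bits, equal_by_value and check_signed_zero are hypothetical and are not part of musl.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a double. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Integer-style test as in the unpatched code:
 * bit-identical, or both magnitudes are zero. */
static int equal_by_bits(double x, double y)
{
    uint64_t ix = bits_of(x), iy = bits_of(y);
    uint64_t ax = ix & -1ULL/2, ay = iy & -1ULL/2; /* sign bits cleared */
    return ix == iy || (ax == 0 && ay == 0);
}

/* Floating-point test as in the patch. */
static int equal_by_value(double x, double y)
{
    return x == y;
}

/* Both tests agree on a few non-NaN samples, including signed zeros. */
static void check_signed_zero(void)
{
    static const double v[] = { -1.0, -0.0, 0.0, 1.0, 2.0 };
    size_t n = sizeof v / sizeof v[0];
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            assert(equal_by_bits(v[i], v[j]) == equal_by_value(v[i], v[j]));
}
```

Calling check_signed_zero() from a test program exercises the signed-zero pairs among others. NaNs are deliberately left out, since both versions of nextafter() handle them before reaching this test.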
> it's target dependent if float compares are fast or not.
It's also target dependent whether the floating-point registers
can be accessed by integer instructions, or need to be copied:
some win, some lose!
Just let the compiler/optimizer do its job!
> (the i386 machine where i originally tested this preferred int
> cmp and float cmp was very slow in the subnormal range and
> iirc it also raises the non-standard input denormal exception,
> which is fine i guess.
This exception, or rather its (sticky) flag, is explicitly raised/set
in the part following the patch.
> of course soft float abis much prefer int cmp so your code is
> likely much slower and bigger there).
0. Doesn't musl provide target-specific routines for targets with
soft FP?
1. If not: the compiler knows the target ABI and SHOULD generate
the proper integer comparisons there.
> but i'm not against the change, it is likely better on modern
> machines. did you try to benchmark it? or check the code size?
I STILL don't run a system supported by musl.
The code is of course smaller ... but not as small and fast as a
proper i386 or AMD64 assembly implementation ... which I can
post upon request.
regards
Stefan
>> 3. there's absolutely no need to compare the signs of x and y
>> with the sign of the direction: it's sufficient to test that
>> direction and sign of x match;
>> 4. a proper compiler/optimizer should be able to reuse the results
>> of the comparison (x == y) for (x < y) or (x > y) and
>> (x == 0.0) for (x < 0.0) or (x > 0.0).
>>
>> JFTR: if ((x < 0.0) == (x < y)) is equivalent to
>> if ((x > 0.0) == (x > y))
>>
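
The JFTR equivalence above can be checked with a small sketch. It assumes the preconditions that hold at that point in the patched function: neither argument is NaN, x != y and x != 0.0. Under those conditions (x > 0.0) is the negation of (x < 0.0) and (x > y) is the negation of (x < y), so both forms give the same answer. The names direction_tests_agree and check_direction_tests are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* 1 if both spellings of the direction test give the same result. */
static int direction_tests_agree(double x, double y)
{
    int by_less    = (x < 0.0) == (x < y);
    int by_greater = (x > 0.0) == (x > y);
    return by_less == by_greater;
}

/* Try all pairs of a few samples, skipping the cases that the
 * patched function has already handled (x == y, x == 0.0). */
static void check_direction_tests(void)
{
    static const double v[] = {
        -INFINITY, -1.0, -0x1p-1074, -0.0, 0.0, 0x1p-1074, 1.0, INFINITY
    };
    size_t n = sizeof v / sizeof v[0];
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double x = v[i], y = v[j];
            if (x == y || x == 0.0)
                continue;
            assert(direction_tests_agree(x, y));
        }
}
```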
>> --- -/src/math/nextafter.c
>> +++ +/src/math/nextafter.c
>> @@ -3,20 +3,15 @@
>> double nextafter(double x, double y)
>> {
>> union {double f; uint64_t i;} ux={x}, uy={y};
>> - uint64_t ax, ay;
>> int e;
>>
>> if (isnan(x) || isnan(y))
>> return x + y;
>> - if (ux.i == uy.i)
>> + if (x == y)
>> return y;
>> - ax = ux.i & -1ULL/2;
>> - ay = uy.i & -1ULL/2;
>> - if (ax == 0) {
>> - if (ay == 0)
>> - return y;
>> + if (x == 0.0)
>> ux.i = (uy.i & 1ULL<<63) | 1;
>> - } else if (ax > ay || ((ux.i ^ uy.i) & 1ULL<<63))
>> + else if ((x < 0.0) == (x < y))
>> ux.i--;
>> else
>> ux.i++;
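
For readers who want to try the patched logic in isolation, here is a minimal standalone sketch. It reproduces only the branches shown in the diff and deliberately omits the part of the real function that follows them (the exception flag handling mentioned earlier in this message), so it is not a drop-in replacement. The names nextafter_sketch and check_nextafter_sketch are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>

/* Branch logic of the patched nextafter(); no exception flags raised. */
static double nextafter_sketch(double x, double y)
{
    union {double f; uint64_t i;} ux = {x}, uy = {y};

    if (isnan(x) || isnan(y))
        return x + y;
    if (x == y)                         /* also covers +0.0 vs -0.0 */
        return y;
    if (x == 0.0)                       /* smallest subnormal, sign of y */
        ux.i = (uy.i & 1ULL<<63) | 1;
    else if ((x < 0.0) == (x < y))      /* moving towards zero */
        ux.i--;
    else                                /* moving away from zero */
        ux.i++;
    return ux.f;
}

/* Spot checks against values that follow from the binary64 format. */
static void check_nextafter_sketch(void)
{
    assert(nextafter_sketch(0.0, 1.0) == 0x1p-1074);
    assert(nextafter_sketch(-0.0, 1.0) == 0x1p-1074);
    assert(nextafter_sketch(0.0, -1.0) == -0x1p-1074);
    assert(nextafter_sketch(1.0, 2.0) == 1.0 + DBL_EPSILON);
    assert(nextafter_sketch(1.0, 0.0) == 1.0 - DBL_EPSILON / 2);
    assert(nextafter_sketch(-1.0, 0.0) == -(1.0 - DBL_EPSILON / 2));
    assert(nextafter_sketch(-1.0, -2.0) == -(1.0 + DBL_EPSILON));
    assert(nextafter_sketch(DBL_MAX, INFINITY) == INFINITY);
    assert(nextafter_sketch(INFINITY, 0.0) == DBL_MAX);
    assert(signbit(nextafter_sketch(0.0, -0.0)));  /* returns y = -0.0 */
    assert(isnan(nextafter_sketch(NAN, 1.0)));
}
```

The spot checks avoid calling the C library's nextafter() so that the sketch does not depend on how libm is linked; comparing against it over a larger set of inputs would be a natural next step.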