Date: Wed, 3 Apr 2019 12:13:18 +1100 (AEDT)
From: Damian McGuckin <damianm@....com.au>
To: musl@...ts.openwall.com
Subject: Floating Point Accuracy

Hi,

On Sat, 2 Feb 2019, Szabolcs Nagy wrote:

> correct rounding is not practical for the libc. (i think it can have
> similar average performance to an almost-cr implementation, but the
> worst case will be 100x slower and it will require many big tables.
> glibc used to be an exception: it had several double precision functions
> implemented with correct rounding in nearest rounding mode, but they were
> hard to maintain and nobody liked the >10000x latency spikes, so it was
> decided long ago that cr should not be the goal of glibc.

While I understand how hard it is to do CR, what are the current accuracy
goals for LIBC? Are they documented? Specifically, does it guarantee
under 1.5*ULP or under 2*ULP? I have read some things which say ALWAYS
under 1*ULP, which I used to believe. But that is just wrong.

Is this documented anywhere for MUSL please? Or is there a test suite to
check for this?

> meanwhile the ts 18661-4 spec introduced separate cr prefixed names for
> correctly rounded math functions so if glibc ever gets cr
> implementations again they will use different names.)

Yes. I need to keep on top of these experimental C libraries. Is MUSL on
the formal committee for this at all? Just curious.

> my new implementations are generic c code, and it's possible to get
> portable results with the fp semantics of c across targets which
> implement annex F with FLT_EVAL_METHOD==0: you have to turn fp
> contraction off, disable builtins and other compiler smartness (musl
> already does these) and remove any target specific ifdefs or asm that
> may introduce non-portable results (in the new code only the
> __FP_FAST_FMA checks).

Sometimes I curse FLT_EVAL_METHOD and sometimes I adore it. How is GLIBC
built in this context, or does this vary from Linux to Linux and AIX and
Solaris? How do most people build/use MUSL?
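
For anyone wanting to see which case a given build falls into, here is a
minimal C sketch. It assumes a C99 compiler that defines FLT_EVAL_METHOD
in <float.h> and the default round-to-nearest mode. The helper names
eval_method and double_sum_is_rounded_to_double are made up for
illustration and are not part of musl or glibc.

```c
#include <assert.h>
#include <float.h>

/* Report the evaluation method the compiler advertises for this build:
   0 = operations are evaluated in the type of their operands,
   1 = float and double are evaluated as double,
   2 = everything is evaluated as long double,
   negative = indeterminable or implementation-defined.
   (Hypothetical helper, for illustration only.) */
static int eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -1;
#endif
}

/* Probe whether a double addition is actually rounded to double.
   In round-to-nearest, 1 + 2^-53 is a tie that rounds to 1.0 in double,
   so the difference below is exactly 0. If the sum is kept in a wider
   format (excess precision), the 2^-53 survives and the result is
   nonzero. The volatile operands stop the compiler from folding the
   expression at compile time. (Hypothetical helper.) */
static int double_sum_is_rounded_to_double(void)
{
    volatile double one = 1.0;
    volatile double tiny = DBL_EPSILON / 2; /* 2^-53 */

    assert(FLT_RADIX == 2); /* the probe assumes a binary format */
    return (one + tiny) - one == 0.0;
}
```

On a build with FLT_EVAL_METHOD==0 the probe should return 1. A build
that evaluates in extended precision would be expected to return 0.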

As somebody who often uses libraries from, say, Chapel or even Fortran,
anything other than the use of FLT_EVAL_METHOD==0 can lead to interesting
results.

> that said, in practice there may be many reasons why a libc does not
> provide completely portable behaviour:
>
> - long double cannot be made portable since the format is different
>   across targets.

Besides the new Power9 CPUs and IBM's big iron, what other CPUs support
128-bit IEEE754 arithmetic and integers in hardware? Also, the same
question about support at the assembler level in software, such as on
the Sparc CPUs. Any plan for that on ARMs that you can share?

You mention,

> - int rounding instructions can also make a significant difference
>   in performance or quality if non-nearest rounding modes need to be
>   supported (e.g. i currently don't have a satisfactory solution for
>   >= double precision trigonometric argument reduction on targets
>   that have no rounding mode independent nearest toint instruction).
>   the fdlibm code is broken in non-nearest rounding modes, the fix i
>   have is expensive and breaks useful properties,

Is this documented in any of the MUSL routines? I would be interested to
know the details.

> such fix is not necessary on some targets if we allow different results.
>
> - i don't know other instructions that make a big difference across
>   targets, but new ones may come up (e.g. 1/x and 1/sqrt estimate
>   instructions may be useful in some algorithms,

Until they are universal, they just complicate things. Do we need portable
table lookups to provide these solutions to the masses?

> or target specific cutoff values depending on faster int vs fp cmp)

Playing with 'remquo' a while ago, I was amazed at the speed of an integer
comparison of 2 significands. It was so much faster than an FP equivalent.

> and then someone may want to provide target specific optimizations and
> the libc has to evaluate the maintenance cost vs benefits.
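
To make the integer-versus-FP comparison point concrete, here is a
minimal C sketch of comparing the magnitudes of two doubles using only
integer operations. It is not the musl remquo code. It assumes double is
IEEE 754 binary64 and that integers and doubles share the same byte
order, and the names bits_of and cmp_magnitude are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of a double into a 64-bit integer.
   memcpy avoids the aliasing problems of a pointer cast.
   (Hypothetical helper; assumes IEEE 754 binary64.) */
static uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Compare |x| with |y| using integer operations only.
   Returns -1, 0 or 1. Once the sign bit is cleared, the exponent and
   significand fields of a non-NaN binary64 value order the same way as
   the values themselves, so one unsigned compare is enough.
   (Hypothetical helper, for illustration.) */
static int cmp_magnitude(double x, double y)
{
    assert(x == x && y == y); /* NaNs are not ordered; reject them */

    uint64_t ux = bits_of(x) & (UINT64_MAX >> 1); /* clear sign bit */
    uint64_t uy = bits_of(y) & (UINT64_MAX >> 1);

    return (ux > uy) - (ux < uy);
}
```

For example, cmp_magnitude(1.0, -2.0) gives -1 and cmp_magnitude(0.0, -0.0)
gives 0. Whether this beats an FP compare is target specific, which is the
point of the cutoff remark quoted above.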

> there are no universal tests for math functions, i still maintain
> detailed special case checks in libc-test,

Where is that located please?

Is MUSL used much on hardware like the TI Signal Processing chips, which
have only 32-bit IEEE? That chip does have a non-IEEE754 extended float
which is 40-bit, which is just 8 extra bits in the significand.

Thanks again - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer