Date: Mon, 5 Sep 2022 18:38:30 +0200
From: Szabolcs Nagy <>
To: Paul Zimmermann <>
Subject: Re: Re: integration of CORE-MATH routines into Musl?

* Paul Zimmermann <> [2022-09-05 16:39:02 +0200]:

>        Dear Szabolcs,
> > when i worked on exp and log i noticed that for single prec it is
> > easy to do correct rounding with only minor overhead, but it required
> > either a bit bigger lookup table or a bit bigger polynomial vs going
> > for < 1 ulp error only.
> please have a look at no big lookup table, degree 5 only.

"a bit bigger".

in this case the polynomial is bigger: order 5 instead of 3.
(order 3 is enough for < 1 ulp error).
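
to give a rough feel for the order 3 vs order 5 difference, here is a minimal sketch. it is not the musl or core-math code: it uses plain taylor coefficients of 2^r instead of tuned ones, and the reduced range |r| <= 1/64 as well as the names exp2_taylor and abs_err are assumptions made only for this illustration.

```c
#include <assert.h>

/* ln(2) rounded to double */
static const double LN2 = 0x1.62e42fefa39efp-1;

/* taylor polynomial of 2^r = exp(r*ln2) up to the given order,
   evaluated with horner's scheme:
   1 + z/1*(1 + z/2*(1 + ... (1 + z/order))).
   taylor coefficients are an assumption for illustration; real
   implementations use coefficients tuned for the reduced interval. */
double exp2_taylor(double r, int order)
{
    assert(order >= 0);
    double z = r * LN2;
    double p = 1.0;
    for (int k = order; k >= 1; k--)
        p = 1.0 + p * z / k;
    return p;
}

/* absolute error against an order 12 polynomial, which is used here
   as a stand-in reference for small |r| (hypothetical helper) */
double abs_err(double r, int order)
{
    double d = exp2_taylor(r, order) - exp2_taylor(r, 12);
    return d < 0 ? -d : d;
}
```

at r = 1/64 the order 3 truncation error is roughly 6e-10, well below one binary32 ulp near 1 (2^-23, about 1.2e-7), while order 5 gets to roughly 2e-15. the extra accuracy is what leaves room for a rounding check, at the cost of two more multiply-adds.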

the code size is also bigger:

core-math: size -G (x86_64 -O3):

      text       data        bss      total filename
       464        352          0        816 exp2/exp2f.o
       398        348          0        746 exp/expf.o

musl: size -G (data is shared between expf, exp2f and powf):

      text       data        bss      total filename
         0        328          0        328 exp2f_data.o
       202         12          0        214 exp2f.o
       211         16          0        227 expf.o

i'd expect a cr (correctly rounded) function to have at least a bit of
overhead over a <1 ulp one (but not significant overhead in case of
binary32). so where core-math is faster, it should be possible to write
an even faster version that only aims for <1 ulp (but the perf diff
will not be huge).

in case of binary64: i'd expect one can turn a close to 0.5 ulp
implementation into a cr one with small overhead by testing for near
halfway cases at the end and having a slow path for those. but the
slow path will be much slower and bigger (and harder to test).
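
the near halfway test could look something like the sketch below. it assumes the fast path returns its result as an unevaluated sum hi + lo together with an absolute error bound err, and that double arithmetic is evaluated in plain binary64 with round-to-nearest (no x87 excess precision, no -ffast-math). the name round_is_safe and this interface are hypothetical, not taken from musl or core-math.

```c
#include <assert.h>

/* rounding test: the true result is assumed to lie in
   [hi + lo - err, hi + lo + err]. if both ends of that interval
   round to the same double, that double is the correctly rounded
   result. otherwise the value is too close to a rounding boundary
   and the caller has to fall back to a slower, more accurate path.
   returns 1 and stores the result in *out when rounding is safe,
   returns 0 otherwise. */
int round_is_safe(double hi, double lo, double err, double *out)
{
    assert(err >= 0.0);
    double lower = hi + (lo - err);
    double upper = hi + (lo + err);
    if (lower == upper) {
        *out = lower;
        return 1;
    }
    return 0;
}
```

a caller would do something like `if (!round_is_safe(hi, lo, err, &y)) y = slow_path(x);` where slow_path is a placeholder for the bigger and slower code. the check itself is only a few additions and one compare, which is where the small overhead on the fast path comes from.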
