Date: Sat, 25 Nov 2017 19:49:19 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Do not use 64 bit division if possible

On Sun, Nov 26, 2017 at 01:10:15PM +1300, Michael Clark wrote:
>
>
> > On 26/11/2017, at 12:53 PM, Rich Felker <dalias@...c.org> wrote:
> >
> > On Sun, Nov 26, 2017 at 12:46:56AM +0100, David Guillen Fandos wrote:
> >> Thanks for your response.
> >> Please note that PAGE_SIZE is not a constant but an alias to
> >> libc.page_size which is a variable of type size_t (signed).
> >> That's why at O1+ gcc doesn't generate a shift.
> >
> > Indeed; this varies by arch.
>
> Oh, I wasn’t aware of that.
>
> >> I also created a patch to include libc.page_shift, but as far as I
> >> can see no other functions would benefit from it, since there's no
> >> other divides there (only negations, additions and subtractions).
> >
> > Adding infrastructure complexity except in cases where it makes a
> > significant improvement to size or performance is generally not
> > desirable. mmap() is one other place where, in principle, division by
> > PAGE_SIZE might take place, but in practice the size is constant 4096
> > or 8192 on all archs.
> >
> >> And yeah I agree, a_ctz_l is not exactly inexpensive but I guess it
> >> is better than full 64 bit signed division (that's why I cast
> >> unsigned otherwise the shift right is not trivial due to the sign).
> >
> > The cost here is more a matter of adding a reading complexity
> > dependency on musl internals (a_*) where it's not needed. I wonder if
> > GCC could optimize it if we instead of /PAGE_SIZE wrote
> > /(PAGE_SIZE&-PAGE_SIZE). Or if we did something like define PAGE_SIZE
> > as ((libc.page_size&-libc.page_size)==libc.page_size ? libc.page_size
> > : 1/0) so that "PAGE_SIZE is not a power of 2" would become an
> > unreachable case.
>
> Interesting. It seems GCC figures out the division by zero is
> unreachable but the (n&-n) expression leads to a power of two, not to
> a log2 n so the ctz is still required.
>
> - https://cx.rv8.io/g/eHf2Ah
>
> One could do so once at initialisation time and add PAGE_SHIFT and on
> architectures with variable page sizes do this:
>
> #define PAGE_SHIFT libc.page_shift
>
> diff --git a/src/env/__libc_start_main.c b/src/env/__libc_start_main.c
> index 2d758af..f24d10a 100644
> --- a/src/env/__libc_start_main.c
> +++ b/src/env/__libc_start_main.c
> @@ -29,6 +29,7 @@ void __init_libc(char **envp, char *pn)
>         __hwcap = aux[AT_HWCAP];
>         __sysinfo = aux[AT_SYSINFO];
>         libc.page_size = aux[AT_PAGESZ];
> +       libc.page_shift = a_ctz_l(libc.page_size);
>
>         if (!pn) pn = (void*)aux[AT_EXECFN];
>         if (!pn) pn = "";
>
> That isolates the a_ctz_l to one place.

Is there a reason it makes a difference? The operation involves a
syscall so the cost of a division is going to be dominated by the
syscall. If you're calling this repeatedly/in a loop, your program is
going to be super slow with or without the division.

Rich
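
For readers following the thread, here is a minimal sketch of the two ideas under discussion: that (n & -n) yields a power of two rather than its log2, and that a shift computed once at start-up gives the same result as dividing by a power-of-two page size. It is not musl code. The struct page_info and all function names are hypothetical, and the GCC/Clang builtin __builtin_ctzl is assumed as a stand-in for musl's internal a_ctz_l.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of musl's libc struct;
 * this is not the real layout. */
struct page_info {
    size_t page_size;    /* e.g. the value read from aux[AT_PAGESZ] */
    unsigned page_shift; /* log2(page_size), computed once */
};

/* n & -n isolates the lowest set bit of n, so it equals n only when n
 * is a power of two. Note that it yields the power itself, not log2. */
static int is_power_of_two(size_t n)
{
    return n != 0 && (n & -n) == n;
}

/* Compute the shift once at initialisation. __builtin_ctzl (a GCC/Clang
 * builtin) is assumed here in place of musl's a_ctz_l. */
static struct page_info page_info_init(size_t page_size)
{
    assert(is_power_of_two(page_size));
    struct page_info p = { page_size, (unsigned)__builtin_ctzl(page_size) };
    return p;
}

/* Division by a run-time variable: the compiler cannot assume a power
 * of two, so it emits a real divide. */
static size_t pages_by_division(const struct page_info *p, size_t len)
{
    return len / p->page_size;
}

/* Equivalent result for unsigned operands, using the cached shift. */
static size_t pages_by_shift(const struct page_info *p, size_t len)
{
    return len >> p->page_shift;
}
```

With page_info_init(4096), page_shift is 12 and both helpers return the same page count for any unsigned length. Whether the saved divide matters next to a syscall is the question raised in the reply above.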