Date: Thu, 25 Jun 2020 16:50:24 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Release prep for 1.2.1, and afterwards

On Thu, Jun 25, 2020 at 07:31:25PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@...c.org> [2020-06-25 11:39:36 -0400]:
> 
> > On Thu, Jun 25, 2020 at 10:15:04AM +0200, Szabolcs Nagy wrote:
> > > * Rich Felker <dalias@...c.org> [2020-06-24 16:42:44 -0400]:
> > > 
> > > > I'm about to do last work of merging mallocng, followed soon by
> > > > release. Is there anything in the way of overlooked bug reports or
> > > > patches that should still be addressed in this release cycle?
> > > > 
> > > > Things I'm aware of:
> > > > 
> > > > - "Proposal to match behaviour of gethostbyname to glibc". Latest
> > > >   patch is probably ok, but could be deferred to after release.
> > > > 
> > > > - nsz's new sqrt{,f,l}. I'm hesitant to do all three right away
> > > >   without time to test, but replacing sqrtl.c could be appropriate
> > > >   since the current one is badly broken on archs with ld wider than
> > > >   double. However it would need to accept ld80 in order not to be
> > > >   build-breaking on m68k, or m68k would need an alternative.
> > > 
> > > that's still under work
> > 
> > Won't it work just to make it decode/encode the ldshape, and otherwise
> > use exactly the same code? Or are there double-rounding issues if the
> > quad code is used with ld80?
> 
> i think the same code may work for ld80 too,
> but i'm still testing the single/double/quad
> code, it's not ready for inclusion.

OK. I had in mind possibly adding just sqrtl.c since it can't really
be worse than what we have now. But I'm ok with waiting too.

One alternative to getting it working for ld80 right away would be
just adding an asm version of sqrtl for m68k. However we have users
who've indicated an interest in disabling asm optimizations (see
thread "build: allow forcing generic implementations of library
functions") so in the long term I think we should aim for all generic
math functions to work with every ld format and FLT_EVAL_METHOD value,
rather than just assuming they get replaced on i386/x86_64 and m68k.
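
For readers unfamiliar with the terms: the ld format and the evaluation
method can both be read from <float.h> at compile time. The following is
a minimal sketch, not code from musl; the function names ld_format_name
and has_excess_precision are hypothetical and exist only for
illustration.

```c
#include <float.h>

/* Hypothetical helper: name the long double format from <float.h> limits. */
const char *ld_format_name(void)
{
#if LDBL_MANT_DIG == 53 && LDBL_MAX_EXP == 1024
	return "ld64";    /* long double has the same format as double */
#elif LDBL_MANT_DIG == 64 && LDBL_MAX_EXP == 16384
	return "ld80";    /* x87 and m68k extended precision */
#elif LDBL_MANT_DIG == 113 && LDBL_MAX_EXP == 16384
	return "ld128";   /* IEEE binary128 (quad) */
#else
	return "unknown"; /* anything else, e.g. double-double */
#endif
}

/* Hypothetical helper: nonzero when float/double expressions may be
 * evaluated in a wider format (FLT_EVAL_METHOD 1 or 2) or when the
 * method is indeterminate (-1). */
int has_excess_precision(void)
{
	return FLT_EVAL_METHOD != 0;
}
```

A generic sqrtl that is meant to build everywhere would need a working
path for each of the three named formats, and generic float/double code
would need to stay correct when has_excess_precision() is nonzero.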

> > > but it would be nice if we could get the aarch64
> > > memcpy patch in (the c implementation is really
> > > slow and i've seen ppl compare aarch64 vs x86
> > > server performance with some benchmark on alpine..)
> > 
> > OK, I'll look again.
> 
> thanks.
> 
> (there are more aarch64 string functions in the
> optimized-routines github repo but i think they
> are not as important as memcpy/memmove/memset)

I found the code. Can you comment on performance and whether memset is
needed? (The C memset should be rather good already, more so than
memcpy.)

As noted in the past, I'd like to get rid of having high-level flow
logic in the arch asm and instead have the arch provide string asm
fragments, if desired, to copy blocks, which could then be used in a
shared C skeleton. However, as you noted, this has been a practical
performance problem for a long time and I don't think it's fair to
just keep putting it off for a better solution.
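
To illustrate the "shared C skeleton" idea, here is a rough sketch under
stated assumptions; it is not musl code and not a proposed interface.
ARCH_COPY_BLOCK, ARCH_BLOCK_SIZE, generic_copy_block and skeleton_memcpy
are all hypothetical names. An arch would supply only the block-copy
fragment (for example as inline asm), while the flow logic stays in C.

```c
#include <stddef.h>

/* Hypothetical hook: an arch could define both ARCH_COPY_BLOCK and
 * ARCH_BLOCK_SIZE, e.g. with inline asm. This fallback is plain C. */
#ifndef ARCH_COPY_BLOCK
#define ARCH_BLOCK_SIZE 16
static inline void generic_copy_block(unsigned char *d, const unsigned char *s)
{
	for (size_t i = 0; i < ARCH_BLOCK_SIZE; i++) d[i] = s[i];
}
#define ARCH_COPY_BLOCK(d, s) generic_copy_block((d), (s))
#endif

/* Shared skeleton: all high-level flow logic lives here, in C. */
void *skeleton_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* Bulk: copy whole blocks through the arch fragment. */
	for (; n >= ARCH_BLOCK_SIZE; n -= ARCH_BLOCK_SIZE) {
		ARCH_COPY_BLOCK(d, s);
		d += ARCH_BLOCK_SIZE;
		s += ARCH_BLOCK_SIZE;
	}
	/* Tail: copy the remaining bytes one at a time. */
	while (n--) *d++ = *s++;
	return dest;
}
```

A real version would also have to deal with alignment and small-size
fast paths, which is where most of the tuning in the existing asm goes;
the sketch only shows where the boundary between C and asm could sit.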

Rich
