Date: Sat, 10 Jun 2017 22:20:44 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] s390x: Add single instruction math functions

On Sat, Jun 10, 2017 at 05:48:05PM -0400, David Edelsohn wrote:
> On Sat, Jun 10, 2017 at 5:44 PM, David Edelsohn <dje.gcc@...il.com> wrote:
> > On Sat, Jun 10, 2017 at 5:28 PM, Szabolcs Nagy <nsz@...t70.net> wrote:
> >> * David Edelsohn <dje.gcc@...il.com> [2017-06-10 16:22:24 -0400]:
> >>> On Sat, Jun 10, 2017 at 3:48 PM, Rich Felker <dalias@...c.org> wrote:
> >>> > On Sat, Jun 10, 2017 at 02:53:14PM -0400, David Edelsohn wrote:
> >>> >> On Sat, Jun 10, 2017 at 2:29 PM, Szabolcs Nagy <nsz@...t70.net> wrote:
> >>> >> > * David Edelsohn <dje.gcc@...il.com> [2017-06-10 13:25:00 -0400]:
> >>> >> >> On Sat, Jun 10, 2017 at 11:36 AM, Szabolcs Nagy <nsz@...t70.net> wrote:
> >>> >> >> > * David Edelsohn <dje.gcc@...il.com> [2017-06-09 10:51:25 -0400]:
> >>> >> >> >> The following patch is a start at single instruction math functions
> >>> >> >> >> for s390x architecture to increase performance.
> >>> >> >> >
> >>> >> >> > looks good, i wonder why gcc does not have builtins support for
> >>> >> >> > ceil, floor, nearbyint, round and trunc
> >>> >> >> >
> >>> >> >> > (on aarch64 the builtins expand to single instruction with
> >>> >> >> > -fno-math-errno, but on s390x they remain libc calls)
> >>> >> >>
> >>> >> >> Both the functions and builtins are converted to single instructions
> >>> >> >> for me.  What architecture level is your GCC assuming?
> >>> >> >>
> >>> >> >
> >>> >> > i think it's the default s390x config
> >>> >> >
> >>> >> > $ s390x-linux-musl-gcc -v
> >>> >> > Using built-in specs.
> >>> >> > COLLECT_GCC=s390x-linux-musl-gcc
> >>> >> > COLLECT_LTO_WRAPPER=/home/nsz/w/mcm/output/bin/../libexec/gcc/s390x-linux-musl/6.3.0/lto-wrapper
> >>> >> > Target: s390x-linux-musl
> >>> >> > Configured with: ../src_toolchain/configure --enable-languages=c,c++ CFLAGS='-g0 -Os' CXXFLAGS='-g0 -Os' LDFLAGS=-s --disable-nls --with-debug-prefix-map=/home/nsz/w/mcm/build-s390x-linux-musl= --enable-languages=c,c++ --disable-libquadmath --disable-libquadmath-support --disable-decimal-float --disable-multilib --disable-libcilkrts --disable-libvtv --disable-libgomp --disable-libitm --disable-werror --target=s390x-linux-musl --prefix= --libdir=/lib --disable-multilib --with-sysroot=/s390x-linux-musl --enable-tls --disable-libmudflap --disable-libsanitizer --disable-gnu-indirect-function --disable-libmpx --enable-libstdcxx-time --with-build-sysroot=/home/nsz/w/mcm/build-s390x-linux-musl/obj_sysroot
> >>> >> > Thread model: posix
> >>> >> > gcc version 6.3.0 (GCC)
> >>> >> > $ cat a.c
> >>> >> > double f(double x)
> >>> >> > {
> >>> >> >         return __builtin_ceil(x);
> >>> >> > }
> >>> >> > $ s390x-linux-musl-gcc -O3 -fno-math-errno -S a.c -o -
> >>> >> >         .machinemode zarch
> >>> >> >         .machine "z900"
> >>> >>
> >>> >> Note the default architecture is z900 from 2005-2006.  The FP
> >>> >> instructions were added with the z196 processors in 2010.
> >>> >
> >>> > In that case the patch should probably have the code inside something
> >>> > like:
> >>> >
> >>> > #ifdef __Z196__ // or whatever the predef macro for the ISA level is
> >>> > // your code here
> >>> > #else
> >>> > #include "../foo.c"
> >>> > #endif
> >>> >
> >>> > See src/math/arm/sqrt.c for a similar example.
> >>> >
> >>> >> s390x-linux-musl probably should default to a much newer processor
> >>> >> level, such as at least z196 or zEC12
> >>> >
> >>> > musl's policy is to just follow whatever ISA level the compiler is
> >>> > configured for; you can set this at musl build time with CFLAGS or use
> >>> > a default built into the toolchain at toolchain build time
> >>> > (--with-arch, I think).
> >>>
> >>> Musl already defaults to the later ISA in the rest of the s390x port.
> >>
> >> would it be hard to support all s390x isa levels?
> >
> > It's a waste of effort and will hurt performance on newer processors.

I don't think this is accurate. There is no asm in any
performance-relevant code paths now. If there were, it could be
conditionally built for the compiler's ISA level (-march) based on
predefined macros.
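
A minimal sketch of that kind of conditional build, under stated
assumptions: the guard tests __HTM__ and __ARCH__, which newer GCC
versions are believed to predefine on s390x (the thread itself leaves
the macro name open), and the name ceil_isa is hypothetical. In a real
tree the fallback branch would simply #include the generic C file; the
simplified inline fallback here only keeps the example self-contained
and does not preserve the sign of zero.

```c
/* Pick an implementation at compile time from the compiler's ISA level. */
#if defined(__s390x__) && (defined(__HTM__) || __ARCH__ >= 9)

/* z196 or later: a single instruction rounds toward +infinity. */
double ceil_isa(double x)
{
	__asm__ ("fidbra %0, 6, %1, 4" : "=f"(x) : "f"(x));
	return x;
}

#else

/* Older ISA level or another architecture: portable C stand-in. */
double ceil_isa(double x)
{
	/* NaN, infinities and |x| >= 2^52 are returned unchanged. */
	if (!(x > -4503599627370496.0 && x < 4503599627370496.0))
		return x;
	double t = (double)(long long)x; /* truncates toward zero */
	return t < x ? t + 1.0 : t;
}

#endif
```

Built with -march=z196 the first branch would be selected; with the
z900 default shown earlier in the thread, the second.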

> > No user of Musl and Alpine is going to -- or even /can/ -- run it on
> > older processors.  All of the Docker containers and underlying Linux
> > distributions don't support the older processors.
> 
> When I worked with Bobby Bingham to create the s390x port of Musl, I
> said that he could assume newer processors.  Also, I don't believe
> that LLVM supports the earlier processors.  I believe that he assumed
> some more recent instructions in other parts of the code.

That seems doubtful; the amount of asm in musl is minimal and unlikely
to benefit from later ISA levels; all the instructions I see look like
very basic stuff that would always have been available.

Now, what likely is accurate is your claim that nobody is using musl
on lower ISA levels, so maybe it doesn't matter.

Rich
