Date: Thu, 31 Oct 2019 09:19:26 -0400
From: Rich Felker <dalias@...c.org>
To: Ruinland ChuanTzu Tsai <ruinland@...estech.com>
Cc: musl@...ts.openwall.com, alankao@...estech.com
Subject: Re: Cross-compiling test flow for libc-test

On Thu, Oct 31, 2019 at 02:04:06PM +0800, Ruinland ChuanTzu Tsai wrote:
> Hi,
> sorry for sending this email out of the blue.
> 
> I'm wondering whether there are any (official) guides on how to do
> cross-compilation testing for libc-test?
> 
> If I understand the Makefile of libc-test correctly, it compiles the
> test units and then executes them on the _host_ platform right away.
> 
> However, while I was cross-testing glibc, there was a script,
> cross-test-ssh.sh, which could be used with the `test-wrapper` hook
> to execute the freshly compiled test units on a hetero-architecture
> platform over ssh.
> 
> If there's no such mechanism for libc-test at the moment, I'm
> willing to develop one.
> 
> That being said, I'm curious what attitude the maintainers take
> toward this kind of testing flow.
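
(For reference, the glibc flow mentioned above is invoked roughly like
this; a sketch, assuming a cross-built glibc tree and a reachable ssh
target -- the exact variable name and script path may differ by glibc
version:)

    # From the glibc build directory: run each freshly built test
    # binary on the remote target over ssh instead of locally.
    make test-wrapper='/path/to/glibc-src/scripts/cross-test-ssh.sh root@target' check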

I don't have any sort of automated CI setup at present. Generally,
when changes potentially affecting particular archs have been made, I
run the tests on them or ask others for results; I do this myself on
x86 and sometimes a few other archs, especially before releases where
any invasive change has been made.

The ability to auto-deploy and run tests across a number of archs
would be nice. I don't think anyone has a sufficient set of physical
machines set up for this, and that's a big barrier to entry, so a
better setup might be launching the tests via qemu system-level
emulation, setting up a virtfs root to pass into each guest and from
which to read the output.
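
Roughly along these lines (an untested sketch; the arch, machine type,
kernel image, and rootfs are placeholders to adjust per target):

    # Host side: boot a guest for the target arch, sharing the
    # cross-built libc-test tree over virtfs/9p.
    qemu-system-riscv64 -machine virt -nographic \
        -kernel Image -append "root=/dev/vda rw console=ttyS0" \
        -drive file=rootfs.img,format=raw,if=virtio \
        -virtfs local,path="$PWD"/libc-test,mount_tag=tests,security_model=none,id=tests

    # Guest side: mount the share and run the tests; the results
    # land in the shared tree, readable from the host afterwards.
    mount -t 9p -o trans=virtio tests /mnt
    make -C /mnt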

Note: qemu user-level emulation will necessarily fail lots of the
tests due to problems with how qemu emulates signals (or rather
doesn't; it just passes them through) and how that interacts with
thread cancellation. But system-level emulation should be good, and
could even let you test different setups that exercise different code
paths in musl, like enabling/disabling vdso, or old kernels lacking
new syscalls.

One thing I would encourage for any automated/CI type testing is
layering. Rather than trying to make libc-test into an
infrastructure-dependent framework itself, use it as a tool that gets
pulled and used.
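
For example, the CI layer could be a thin driver that pulls libc-test
and harvests its report; a sketch (the repo URL and REPORT location
are from memory, so double-check them):

    #!/bin/sh -e
    # Thin CI driver: all infrastructure knowledge stays here,
    # outside the test suite itself.
    [ -d libc-test ] || git clone https://repo.or.cz/libc-test.git
    make -C libc-test          # a cross setup would deploy/run remotely here
    cat libc-test/src/REPORT   # hand the results to the layer above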

> Aside from cross-testing, I also wonder about the status of test
> reports for releases on the currently supported CPU architectures.
> As I was running libc-test on x86_64, some of the functional and
> regression tests failed.

Here's a list of what I expect to fail. It should probably be turned
into a page on the wiki or documented somewhere more visible than
this mailing list post:

api/main fails due to some confstr and pathconf item macros we don't
yet define (we were waiting on glibc to assign numbers so we could
align them).

functional/strptime fails due to unimplemented functionality.

functional/utime may fail due to time_t being 32-bit or the kernel
lacking time64 support.

math/* may fail due to very minor status flag or precision issues.

musl/pleval fails in the dynamic-linked version only because it
references an internal-use symbol which is hidden in modern musl; the
static version successfully gets the symbol.

regression/malloc-brk-fail fails conditionally on kernel behavior; the
failure is not a problem in musl or the kernel, but rather in the test
code's ability to set up the VM space state needed to perform the
actual test, which is hard to do.

Otherwise, any failures are unexpected, I think.
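
Mechanically, a local gate could filter the report against that list;
a sketch, assuming failures show up as FAIL lines in libc-test's
REPORT (adjust the matching to whatever format your version emits):

    # Known-acceptable failures, per the list above.
    cat > expected-failures <<'EOF'
    api/main
    functional/strptime
    functional/utime
    math/
    musl/pleval
    regression/malloc-brk-fail
    EOF
    # Fail the gate only if an unexpected FAIL line remains.
    if grep '^FAIL' src/REPORT | grep -v -f expected-failures; then
        echo "unexpected failures above" >&2
        exit 1
    fi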

> Is there a validating rule (e.g. functional/xxx and regression/yyy
> must pass) for check-ins which I can enforce locally before
> submitting patches here?

You could turn the above into a rule (e.g. with the filter sketched
above), but for most things the coverage is not sufficient to tell you
that a change is likely valid. Since musl source is not highly
coupled, generally you'll at best get an indication of a problem from
test files that test the specific component you modified, or where the
test setup itself depends on functionality you modified. At present I
think the tests are most valuable for:

1. preparing ports to new archs, where errors in bits headers or other
   arch-specific files are often caught by something not working.

2. documenting conformance subtleties and ways to detect them, to
   avoid regressions if the relevant component is modified and to
   expose related bugs in other libc implementations and get them
   fixed.

But if you're working on (modifying, or just reading) a component that
doesn't seem to have test coverage, writing and submitting tests would
be very helpful.

Rich
