Date: Wed, 28 Oct 2020 16:50:29 -0700
From: Michael Forney <mforney@...rney.org>
To: musl@...ts.openwall.com
Subject: qemu-user and musl

Hi,

I'm trying to get various musl compatibility issues in qemu fixed
upstream. Here are the issues I've encountered:

1. To implement the timer_create syscall, qemu just translates
between host and target sigevent, then calls the libc timer_create.
For SIGEV_THREAD, both glibc and musl implement this using the
Linux-specific SIGEV_THREAD_ID. However, musl's timer_create does
not support SIGEV_THREAD_ID from the application, so it fails with
EINVAL. This means that any application using timers with SIGEV_THREAD
does not work with qemu-user running on musl.

This issue appears as a build time error due to the use of
glibc-internal _sigev_un._tid member during sigevent translation.
Most distributions patch this by introducing a `host_sigevent`
matching the glibc layout and translating to that instead of libc's
sigevent. For the reasons mentioned above, this does not actually
fix the problem.

I see there is a patch on the mailing list adding support for
SIGEV_THREAD_ID which looks ready to merge. Rich, since you're
working towards getting ready for the next musl release, do you
think this might make it in? That way, we could just patch qemu to
add

#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

and replace _sigev_un._tid with sigev_notify_thread_id.
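
To illustrate, here is a minimal sketch of how the translation could look once the fallback define is in place. The struct sketch_target_sigevent type and the sketch_target_to_host_sigevent function are hypothetical simplifications for this example, not qemu's actual code, and the sketch assumes the libc provides either sigev_notify_thread_id or the glibc-style _sigev_un._tid member.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Fallback for libcs that only expose the glibc-internal member name. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Hypothetical, simplified stand-in for the target's sigevent layout. */
struct sketch_target_sigevent {
    int notify; /* SIGEV_* value, assumed already in host encoding */
    int signo;  /* signal number, assumed already translated */
    int tid;    /* kernel thread id used with SIGEV_THREAD_ID */
};

/* Fill a host sigevent using only the public accessor name. */
static void sketch_target_to_host_sigevent(struct sigevent *host,
                                           const struct sketch_target_sigevent *target)
{
    assert(host != NULL && target != NULL);
    memset(host, 0, sizeof(*host));
    host->sigev_notify = target->notify;
    host->sigev_signo = target->signo;
    host->sigev_notify_thread_id = target->tid;
}
```

Written this way, the translation code never names a libc-internal member directly, so it should build against any libc that defines the accessor macro.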

2. qemu uses the long-obsolete F_SHLCK and F_EXLCK in the translation
of struct flock between host and target, which musl does not define.
Most musl distributions patch qemu to define these constants itself
if they are missing, but seeing as these lock types are unsupported
by Linux since 2.2, I sent a patch to just drop them (you'll get
EINVAL either way).
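
As a rough sketch of what the lock type translation might look like with the obsolete types dropped: the SKETCH_TARGET_F_* constants and the function name below are assumptions made up for this example (the values are the ones common on most Linux architectures), not qemu's real definitions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical target-side lock type values, for illustration only. */
#define SKETCH_TARGET_F_RDLCK 0
#define SKETCH_TARGET_F_WRLCK 1
#define SKETCH_TARGET_F_UNLCK 2

/*
 * Translate a target lock type to the host value.
 * Returns 0 on success, or -EINVAL for any other type,
 * which includes the old F_SHLCK and F_EXLCK values.
 */
static int sketch_target_to_host_flock_type(int target_type, short *host_type)
{
    assert(host_type != NULL);
    switch (target_type) {
    case SKETCH_TARGET_F_RDLCK:
        *host_type = F_RDLCK;
        return 0;
    case SKETCH_TARGET_F_WRLCK:
        *host_type = F_WRLCK;
        return 0;
    case SKETCH_TARGET_F_UNLCK:
        *host_type = F_UNLCK;
        return 0;
    default:
        return -EINVAL;
    }
}
```

Since only the three POSIX lock types are referenced, nothing here depends on constants that musl does not define.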

3. qemu uses the following configure test to check for clock_adjtime:

#include <time.h>

int main(void)
{
	return clock_adjtime(0, 0);
}

However, musl declares clock_adjtime in sys/timex.h, as indicated
by the Linux man page, while glibc only declares it in time.h (i.e.
it is not just a case of glibc implicitly including other headers).
So, including one or the other is not enough. A real application
will need both anyway for the clockid_t values and struct timex,
but this mismatch seems a bit strange and I'm not sure if it was
deliberate.

(of course, this is easy to fix by just adding an include of
<sys/timex.h> to the test, but I thought I'd mention it in case
there was anything actionable here)
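
For illustration, a small sketch of a caller that includes both headers, so it should compile whichever of the two declares clock_adjtime. The helper name sketch_read_clock_state is hypothetical. It assumes a Linux libc that provides clock_adjtime, and defines _GNU_SOURCE because glibc guards the declaration behind it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <string.h>
#include <sys/timex.h> /* struct timex; musl declares clock_adjtime here */
#include <time.h>      /* clockid_t values; glibc declares clock_adjtime here */

/*
 * Hypothetical helper: query the realtime clock state.
 * With modes == 0 the call only reads the current parameters.
 */
static int sketch_read_clock_state(struct timex *tx)
{
    assert(tx != NULL);
    memset(tx, 0, sizeof(*tx));
    return clock_adjtime(CLOCK_REALTIME, tx);
}
```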

4. Until recently qemu used the __SIGRTMIN and __SIGRTMAX macros
in its signal translation code. It looks like this was recently
refactored, and all that remains is a static assert:

QEMU_BUILD_BUG_ON(__SIGRTMAX + 1 != _NSIG);

I think the intention is to ensure that any possible value of
SIGRTMAX is smaller than _NSIG and will not index past the end
of the host-to-target translation table. However, I think this is
a safe assumption and the assert can be dropped (or at least, wrapped
in #ifdef __SIGRTMAX).
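
A minimal sketch of the wrapped variant, using C11 _Static_assert as a stand-in for QEMU_BUILD_BUG_ON. The table and the sketch_host_to_target helper are hypothetical, meant only to show the bounds the assert is protecting.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>

/* Only check the glibc-internal macro where the libc provides it. */
#ifdef __SIGRTMAX
_Static_assert(__SIGRTMAX + 1 == _NSIG, "__SIGRTMAX + 1 must equal _NSIG");
#endif

/*
 * Hypothetical host-to-target table with one slot per signal number
 * below _NSIG. It is zero-initialized here; a real table would be
 * filled in at startup.
 */
static unsigned char sketch_host_to_target_signal[_NSIG];

/* Returns the table entry, or -1 if sig cannot index the table. */
static int sketch_host_to_target(int sig)
{
    if (sig < 1 || sig >= _NSIG)
        return -1;
    return sketch_host_to_target_signal[sig];
}
```

On a libc without __SIGRTMAX the compile-time check simply disappears, and the runtime bounds check still keeps lookups inside the table.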

Any suggestions for any of these proposed fixes are welcome.

-Michael
