Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 05 May 2020 18:37:41 -0500
From: sidneym@...eaurora.org
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Hexagon DSP support

On 2020-04-30 18:51, Rich Felker wrote:
> On Thu, Apr 30, 2020 at 05:44:07PM -0500, sidneym@...eaurora.org wrote:
>> > -----Original Message-----
>> > From: 'Rich Felker' <dalias@...c.org>
>> > Sent: Wednesday, April 15, 2020 2:30 PM
>> > To: sidneym@...eaurora.org
>> > Cc: musl@...ts.openwall.com
>> > Subject: Re: [musl] Hexagon DSP support
>> >
>> > On Wed, Apr 15, 2020 at 02:12:12PM -0500, sidneym@...eaurora.org
> wrote:
>> > > src/api/ftw.c:44:1: error: switch condition has boolean value
>> > > [-Werror,-Wswitch-bool]
>> >
>> > Compiling libc-test with -Werror is invalid. It necessarily must
> produce
>> some
>> > warnings, and needs to be able to distinguish actual errors from them.
>> >
>> > > 8 errors generated.
>> > > src/functional/dlopen.c:39: dlsym main failed: Symbol not found:
> main
>> > > FAIL src/functional/dlopen.exe [status 1]
>> > > clang-11: warning: argument unused during compilation: '-rdynamic'
>> > > [-Wunused-command-line-argument]
>> >
>> > This is a clang deficiency that's preventing the test from working.
>> >
>> > > src/functional/ipc_msg.c:62: qid_ds.msg_lspid == 0 failed: got 100,
>> > > want 0
>> > > src/functional/ipc_msg.c:64: (long long)qid_ds.msg_stime == 0
> failed:
>> > > got 6815993655711498240, want 0
>> > > src/functional/ipc_msg.c:67: qid_ds.msg_ctime >= t failed: got
>> > > 70368744177664, want >= 154502684032321054
>> > > src/functional/ipc_msg.c:71: qid_ds.msg_qbytes > 0 failed: got 0,
> want
>> > > > 0
>> > > src/functional/ipc_msg.c:76: qid_ds.msg_qnum == 1 failed: got 0,
> want
>> > > 1
>> > > src/functional/ipc_msg.c:77: qid_ds.msg_lspid == getpid() failed:
> got
>> > > 100, want 185819
>> > > src/functional/ipc_msg.c:81: msg_stime is 6815993655711498240 want
> <=
>> > > 154502684032321059
>> > > src/functional/ipc_msg.c:128: child exit status: 256 FAIL
>> > > src/functional/ipc_msg-static.exe [status 1]
>> >
>> > This is the sysvipc bits headers issue I mentioned. Same with the
> sem/shm
>> > ones.
>> 
>> Passing with updated headers and qemu.
> 
> Looking at
> https://github.com/quic/musl/commit/b710ae08dfa8c0841abed44d55eee5e042321b
> d6
> there are new problems:
> 
> - Member mode in struct ipc_perm has wrong type. It must be mode_t.
>   This can be fixed just by removing the padding. But is there a
>   reason you're not using the generic arch-agnostic definition on both
>   the kernel and libc side?

Updated that type

> 
> - The definitions in bits/msg.h, bits/sem.h, and bits/shm.h are not
>   time64-compatible. In fact bits/sem.h is a regression; you had it
>   right before and broke it.
As you point out below the core of the problem was the missing time64
calls.  I've updated the structures to use what I hope or the correct 
types.

> 
> - The definition in bits/stat.h is using the wrong types (you need to
>   use libc type names for all the members; this is a policy
>   requirement to ensure that none inadvertently mismatch). Note that
>   there's not even really a need to use an arch-specific header here
>   anymore; the kernel definition comes from arch/hexagon/kstat.h and
>   the bits/stat.h can just be copied from aarch64 or something else
>   with a clean layout.

I used OpenRISC's stat.h.

> 
> Also math asm like sqrt.S is not acceptable for upstream.
> Arch-specific math files should only be for use of fpu instructions
> that perform the operation without the need for any high level flow
> control.
This is what I expected and I removed the file.

> 
>> > > src/functional/pthread_mutex.c:145: PTHREAD_MUTEX_ERRORCHECK relock
>> > > did not return EDEADLK, got deadlock
>> > > src/functional/pthread_mutex.c:148: PTHREAD_MUTEX_RECURSIVE relock
>> > did
>> > > not succed, got deadlock FAIL
> src/functional/pthread_mutex-static.exe
>> > > [status 1]
>> > > src/functional/pthread_mutex.c:145: PTHREAD_MUTEX_ERRORCHECK relock
>> > > did not return EDEADLK, got
>> > > src/functional/pthread_mutex.c:148: PTHREAD_MUTEX_RECURSIVE relock
>> > did
>> > > not succed, got deadlock FAIL src/functional/pthread_mutex.exe
> [status
>> > > 1]
>> >
>> > This suggests broken atomics. It doesn't look like anything qemu-user
>> could
>> > cause (unless it's just failing to emulate the atomics at all).
>> >
>> Updated QEMU to include time64 changes.  Passing now.
> 
> It should not be necessary to update qemu for time64 in order for this
> to work. Any change you made to qemu to make this work is almost
> certainly wrong. According to my reading of the upstream kernel, there
> is an existing stable user-kernel syscall ABI for hexagon, so you
> can't change any of the existing types or behavior of any of the
> existing syscalls.
> 
> Looking at your port, I think I've found the real core problem. In:
> 
> https://github.com/quic/musl/blob/b710ae08dfa8c0841abed44d55eee5e042321bd6
> /arch/hexagon/bits/syscall.h.in
> 
> I don't see any of the time64 syscalls. Having them defined is
> absolutely mandatory for an arch where the unadorned syscalls do not
> use 64-bit time_t. If you add them and revert all the incorrect
> changes, I think everything will work.

Adding those calls helped and cleared up a lot of confusion on my part.


> 
>> > Looks like something is wrong here, possibly wrong struct kstat, or it
>> could be a
>> > qemu bug.
>> 
>> It seems like QEMU's TARGET_NR_utimensat needs to changes similar to
> those
>> done for clock_get/settime to account for 64bit time_t.   I made some
>> changes but the last 2 checks at 65 and 66 still fail.
> 
> No. It does not need any change, and any change to it is wrong. It is
> implementing an existing ABI that's immutable and that's fine. What
> needs to change is that your syscall.h.in needs to define the values
> for the time64 syscalls. This will inform musl that the "plain"
> syscalls use 32-bit time_t and get you a correct build. Right now you
> just have a badly broken build where you've balanced brokenness in one
> place with corresponding brokenness in other places until the right
> thing comes out.

In QEMU I have had to update hexagon/target_structs.h (target_shmid_ds 
and target_semid64_ds) to reflect the size of time_t from abi_ulong 
types to abi_ullong types.

I've attached a new REPORT and the code changes are here:
https://github.com/quic/musl/compare/hexagon

Thanks for your help.

> 
>> > > src/math/ucb/sqrt.h:47: bad fp exception: RN
>> > > sqrt(0x1.fffffffffffffp+1023)=0x1.fffffffffffffp+511, want INEXACT
> got
>> > > 0 [...]
>> > > src/math/special/sqrt.h:74: bad fp exception: RN
>> > > sqrtl(0x1.9b294f88p-1015)=0x1.cad197e28e85bp-508, want INEXACT got 0
>> > > FAIL src/math/sqrtl.exe [status 1]
>> >
>> > These are likely all general (not arch-specific) clang floating point
>> bugs. We
>> > should see if they also happen compiling other archs without asm
> versions
>> of
>> > sqrt.
>> 
>> Assembly version of sqrt does pass.
> 
> Yes but writing a large complex function in asm is not reviewable or
> maintainable.
> 
> Rich

View attachment "REPORT" of type "text/plain" (253799 bytes)

Powered by blists - more mailing lists

Your e-mail address:

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.