Date: Tue, 21 Apr 2020 12:00:31 -0300 From: Adhemerval Zanella <adhemerval.zanella@...aro.org> To: Rich Felker <dalias@...c.org>, David Laight <David.Laight@...LAB.COM> Cc: 'Nicholas Piggin' <npiggin@...il.com>, "libc-dev@...ts.llvm.org" <libc-dev@...ts.llvm.org>, "libc-alpha@...rceware.org" <libc-alpha@...rceware.org>, "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>, "musl@...ts.openwall.com" <musl@...ts.openwall.com> Subject: Re: Powerpc Linux 'scv' system call ABI proposal take 2 On 21/04/2020 11:39, Rich Felker wrote: > On Tue, Apr 21, 2020 at 12:28:25PM +0000, David Laight wrote: >> From: Nicholas Piggin >>> Sent: 20 April 2020 02:10 >> ... >>>>> Yes, but does it really matter to optimize this specific usage case >>>>> for size? glibc, for instance, tries to leverage the syscall mechanism >>>>> by adding some complex pre-processor asm directives. It optimizes >>>>> the syscall code size in most cases. For instance, kill in static case >>>>> generates on x86_64: >>>>> >>>>> 0000000000000000 <__kill>: >>>>> 0: b8 3e 00 00 00 mov $0x3e,%eax >>>>> 5: 0f 05 syscall >>>>> 7: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax >>>>> d: 0f 83 00 00 00 00 jae 13 <__kill+0x13> >> >> Hmmm... that cmp + jae is unnecessary here. > > It's not.. Rather the objdump was just mistakenly done without -r so > it looks like a nop jump rather than a conditional tail call to the > function that sets errno. > Indeed, the output with -r is: 0000000000000000 <__kill>: 0: b8 3e 00 00 00 mov $0x3e,%eax 5: 0f 05 syscall 7: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax d: 0f 83 00 00 00 00 jae 13 <__kill+0x13> f: R_X86_64_PLT32 __syscall_error-0x4 13: c3 retq And for x86_64 __syscall_error is defined as: 0000000000000000 <__syscall_error>: 0: 48 f7 d8 neg %rax 0000000000000003 <__syscall_error_1>: 3: 64 89 04 25 00 00 00 mov %eax,%fs:0x0 a: 00 7: R_X86_64_TPOFF32 errno b: 48 83 c8 ff or $0xffffffffffffffff,%rax f: c3 retq Different than musl, each architecture defines its own error handling mechanism (some embedded errno setting in syscall itself, other branches to a __syscall_error like function as x86_64). This is due most likely from the glibc long history. One of my long term plan is to just simplify, get rid of the assembly pre-processor, implement all syscall in C code, and set error handling mechanism in a platform neutral way using a tail call (most likely you do on musl). >> It is also a 32bit offset jump. >> I also suspect it gets predicted very badly. > > I doubt that. This is a very standard idiom and the size of the offset > (which is necessarily 32-bit because it has a relocation on it) is > orthogonal to the condition on the jump. > > FWIW a syscall like kill takes global kernel-side locks to be able to > address a target process by pid, and the rate of meaningful calls you > can make to it is very low (since it's bounded by time for target > process to act on the signal). Trying to optimize it for speed is > pointless, and even size isn't important locally (although in > aggregate, lots of wasted small size can add up to more pages = more > TLB entries = ...). I agree and I would prefer to focus on code simplicity to have a platform neutral way to handle error and let the compiler optimize it than messy with assembly macros to squeeze this kind of micro-optimizations. > >>>>> 13: c3 retq >>>>> >>>>> While on musl: >>>>> >>>>> 0000000000000000 <kill>: >>>>> 0: 48 83 ec 08 sub $0x8,%rsp >>>>> 4: 48 63 ff movslq %edi,%rdi >>>>> 7: 48 63 f6 movslq %esi,%rsi >>>>> a: b8 3e 00 00 00 mov $0x3e,%eax >>>>> f: 0f 05 syscall >>>>> 11: 48 89 c7 mov %rax,%rdi >>>>> 14: e8 00 00 00 00 callq 19 <kill+0x19> >>>>> 19: 5a pop %rdx >>>>> 1a: c3 retq >>>> >>>> Wow that's some extraordinarily bad codegen going on by gcc... The >>>> sign-extension is semantically needed and I don't see a good way >>>> around it (glibc's asm is kinda a hack taking advantage of kernel not >>>> looking at high bits, I think), but the gratuitous stack adjustment >>>> and refusal to generate a tail call isn't. I'll see if we can track >>>> down what's going on and get it fixed. >> >> A suitable cast might get rid of the sign extension. >> Possibly just (unsigned int). > > No, it won't. The problem is that there is no representation of the > fact that the kernel is only going to inspect the low 32 bits (by > declaring the kernel-side function as taking an int argument). The > external kill function receives arguments by the ABI, where the upper > bits of int args can contain junk, and the asm register constraints > for syscalls use longs (or rather an abstract syscall-arg type). It > wouldn't even work to have macro magic detect that the expressions > passed are ints and use hacks to avoid that, since it's perfectly > valid to pass an int to a syscall that expects a long argument (e.g. > offset to mmap), in which case it needs to be sign-extended. > > The only way to avoid this is encoding somewhere the syscall-specific > knowledge of what arg size the kernel function expects. That's way too > much redundant effort and too error-prone for the incredibly miniscule > size benefit you'd get out of it. > > Rich >
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.