Date: Wed, 26 Feb 2020 21:00:51 -0800
From: Fangrui Song <i@...kray.me>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] Add REL_COPY size change detection

On 2020-02-26, Rich Felker wrote:
>On Wed, Feb 26, 2020 at 08:53:03PM +0100, Florian Weimer wrote:
>> * Rich Felker:
>>
>> > On Wed, Feb 26, 2020 at 07:38:31PM +0100, Florian Weimer wrote:
>> >> * Rich Felker:
>> >>
>> >> > At the very least I think we ought to catch and error on the case
>> >> > where def.sym->st_size>sym->st_size, since we can't honor it and
>> >> > failure to honor it can produce silent memory corruption. I'm less
>> >> > sure about what to do if def.sym->st_size<sym->st-size; this case
>> >> > seems safe and might be desirable not to break (I vaguely recall an
>> >> > intent that it be ok), but if you think there are reasons it's
>> >> > dangerous I'm ok with disallowing it too. I'm having a hard time now
>> >> > thinking of a reason it would really help to support that, anyway.
>> >>
>> >> Unfortunately the Mozilla NSS people disagree that size mismatches for
>> >> global symbols are an ABI break. I don't know if this is relevant in
>> >> the musl context, but it means that for glibc, we probably can't make
>> >> it a hard error.
>> >>
>> >> I want to have better diagnostics for this in glibc, but the current
>> >> warning (which is poorly worded at that) is in the
>> >> architecture-specific code, and I got side-tracked when I tried to
>> >> clean this up the last time.
>> >
>> > Thanks for the feedback. Do you have a source where we could read more
>> > about this? What non-broken behavior do they expect to get when sizes
>> > don't match?

+1 for a `def.sym->st_size!=sym->st_size` diagnostic.

>> There's an NSS bug report:
>>
>>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1201900>
>>
>> It seems that the NSS situation is better than what I remembered.
>
>Good to know.

There may be instances where the code takes the address of a global
variable but does not actually care about the contents (st_size does not
matter).

>
>> > As an aside, I think we should be encouraging distros that are using
>> > PIE to get rid of copy relocations by passing whatever options are
>> > needed (or building gcc with whatever options are needed) to avoid
>> > emitting them in PIE. IIRC I looked this up once but I can't remember
>> > what I found.
>>
>> If I recall correctly, the optimization was a factor when rolling out
>> PIE-by-default in Fedora. I do not know if we can revert it without
>> switching back to fixed-address builds.
>
>I think this is almost surely premature optimization. In almost all
>cases, if there's software where the performance impact makes a
>difference it can be avoided by giving the affected global data
>objects visibility of hidden (if it's not used outside the main
>program anyway) or protected (if it needs to be externally visible).
>But on x86_64 and aarch64, and to some extent on 32-bit arm as well,
>the performance difference of accessing globals via the got vs
>pc-relative is negligible.

clang has an option -mpie-copy-relocations (the name could be improved),
which enables direct access (usually PC-relative) for -fPIE.

   // -target aarch64 -fPIE -O3
   adrp    x8, :got:var
   ldr     x8, [x8, :got_lo12:var]
   ldr     w0, [x8]
   ret

   // -target aarch64 -fPIE -O3 -mpie-copy-relocations
   adrp    x8, var
   ldr     w0, [x8, :lo12:var]
   ret

On x86, with R_X86_64_[REX_]GOTPCRELX, the option is still beneficial.

   // -O3 -fPIE a.c -Wa,--mrelax-relocations=yes
      0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <foo+0x7>
                           3: R_X86_64_REX_GOTPCRELX       var-0x4
      7:   8b 00                   mov    (%rax),%eax
      9:   c3                      retq

   // -O3 -fPIE a.c -mpie-copy-relocations
      0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <foo+0x6>
                           2: R_X86_64_PC32        var-0x4
      6:   c3                      retq

Users of -mpie-copy-relocations may compile their applications in a
mostly statically linked mode, with only a few definitions in DSOs.
If some variables are unfortunately really external, R_*_COPY will be
produced by the linker.

>> Even if we did that, the ABI incompatibility will still be there.
>
>Yes. But fixing it would avoid any bugs from fallout of the full
>object not being copyable at runtime in any newly build programs.
>
>> There is also a similar truncation issue for TLS variables, I think.
>
>TLS variables never use copy relocations, except for a short period of
>time on riscv64, where thankfully it was realized to be a mistake and
>reverted. So I don't think this issue applies to TLS.
>
>Rich

https://sourceware.org/bugzilla/show_bug.cgi?id=23825
(GCC 10 riscv should have fixed this.)

(glibc/sysdeps/riscv/dl-machine.h has a hack supporting R_RISCV_COPY on
STT_TLS. No other known ld.so supports it.)
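
For the size diagnostic discussed in this thread, the check could be
sketched roughly as follows. This is only an illustration, not musl's or
glibc's actual code: the names copy_check, classify_copy and
do_copy_reloc are hypothetical, ref_size stands for the st_size the main
program was linked against (sym->st_size), and def_size stands for the
st_size of the definition found in the DSO at run time
(def.sym->st_size). Whether a mismatch should be a warning or a hard
error is the policy question raised above and is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of comparing the two symbol sizes. */
enum copy_check {
    COPY_OK,     /* sizes match */
    COPY_SHRUNK, /* definition is smaller than the reserved space */
    COPY_GREW    /* definition is larger: a full copy cannot be honored */
};

/* ref_size: size reserved in the executable for the copied object.
 * def_size: size of the defining object in the shared library. */
static enum copy_check classify_copy(size_t ref_size, size_t def_size)
{
    if (def_size > ref_size) return COPY_GREW;
    if (def_size < ref_size) return COPY_SHRUNK;
    return COPY_OK;
}

/* Copy no more bytes than were reserved, and report the classification
 * so the caller can print a diagnostic or refuse to continue. */
static enum copy_check do_copy_reloc(void *dst, size_t ref_size,
                                     const void *src, size_t def_size)
{
    assert(dst != NULL && src != NULL);
    size_t n = def_size < ref_size ? def_size : ref_size;
    memcpy(dst, src, n);
    return classify_copy(ref_size, def_size);
}
```

In this sketch a COPY_GREW result corresponds to the case
def.sym->st_size>sym->st_size, where the object is truncated in the
executable's copy; COPY_SHRUNK corresponds to the case described above as
seemingly safe.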