Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 2 Oct 2019 07:40:45 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: armv7-m segv when throwing c++ exception

On Wed, Oct 02, 2019 at 11:22:33AM +0200, Szabolcs Nagy wrote:
> * Patrick Oppenlander <patrick.oppenlander@...il.com> [2019-10-02 12:00:19 +1000]:
> > Hi all,
> > 
> > I'm running into an issue with a gcc-8.3.0/musl-1.1.23 toolchain built
> > using musl-cross-make for armv7-m.
> > 
> > % cat test.cpp
> > int main()
> > {
> >         throw 0;
> > }
> > % armv7m-linux-musleabihf-g++ -ggdb -static test.cpp
> > % qemu-arm ./a.out
> > qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> > [2]    198246 segmentation fault (core dumped)  qemu-arm ./a.out
> > 
> > We should be seeing something like "terminate called after throwing an
> > instance of 'int'".
> 
> there is at least one known issue in this area:
> libgcc uses pthread apis via weak refs so with static
> linking they don't get linked in, however i think that
> should only affect unwinding in multi-thread processes,
> so this is something else
> 
> > I have an ODROID-C2 here which is a convenient place to run gdb:
> > 
> > $ gdb -q ./a.out
> > Reading symbols from ./a.out...
> > (gdb) run
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xf77cf9f4 in __cxxabiv1::__cxa_throw (obj=0xf77f5fa0,
> > tinfo=0xf77efb20 <typeinfo for int>, dest=0x0)
> >     at ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc:81
> > 81 ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc: No such
> > file or directory.
> > (gdb) bt
> > #0  0xf77cf9f4 in __cxxabiv1::__cxa_throw (obj=0xf77f5fa0,
> > tinfo=0xf77efb20 <typeinfo for int>, dest=0x0)
> >     at ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc:81
> > #1  0xf77cf902 in main () at test.cpp:3
> > (gdb) disas
> > Dump of assembler code for function __cxxabiv1::__cxa_throw(void*,
> > std::type_info*, void (*)(void*)):
> >    0xf77cf9e0 <+0>: push {r3, r4, r5, r6, r7, lr}
> >    0xf77cf9e2 <+2>: mov r5, r0
> >    0xf77cf9e4 <+4>: mov r7, r1
> >    0xf77cf9e6 <+6>: mov r6, r2
> >    0xf77cf9e8 <+8>: bl 0xf77d01c4 <__cxxabiv1::__cxa_get_globals()>
> >    0xf77cf9ec <+12>: mov r4, r0
> >    0xf77cf9ee <+14>: mov r1, r7
> >    0xf77cf9f0 <+16>: mov r2, r6
> >    0xf77cf9f2 <+18>: mov r0, r5
> > => 0xf77cf9f4 <+20>: ldr r3, [r4, #4]
> >    0xf77cf9f6 <+22>: adds r3, #1
> >    0xf77cf9f8 <+24>: str r3, [r4, #4]
> >    0xf77cf9fa <+26>: bl 0xf77cf984
> > <__cxxabiv1::__cxa_init_primary_exception(void*, std::type_info*, void
> > (*)(void*))>
> >    0xf77cf9fe <+30>: movs r3, #1
> >    0xf77cfa00 <+32>: mov r4, r0
> >    0xf77cfa02 <+34>: str.w r3, [r4], #40
> >    0xf77cfa06 <+38>: mov r0, r4
> >    0xf77cfa08 <+40>: bl 0xf77d88a8 <___Unwind_RaiseException>
> >    0xf77cfa0c <+44>: mov r0, r4
> >    0xf77cfa0e <+46>: bl 0xf77d01e0 <__cxxabiv1::__cxa_begin_catch(void*)>
> >    0xf77cfa12 <+50>: bl 0xf77cfaf0 <std::terminate()>
> > End of assembler dump.
> > (gdb) p/x $r4
> > $1 = 0x1
> > 
> > That doesn't look good.
> > 
> > (gdb) disas __cxa_get_globals
> > Dump of assembler code for function __cxxabiv1::__cxa_get_globals():
> >    0xf77d01c4 <+0>: ldr r0, [pc, #12] ; (0xf77d01d4
> > <__cxxabiv1::__cxa_get_globals()+16>)
> >    0xf77d01c6 <+2>: push {r3, lr}
> >    0xf77d01c8 <+4>: add r0, pc
> >    0xf77d01ca <+6>: bl 0xf77da940 <__tls_get_addr>
> >    0xf77d01ce <+10>: ldr r3, [pc, #8] ; (0xf77d01d8
> > <__cxxabiv1::__cxa_get_globals()+20>)
> >    0xf77d01d0 <+12>: add r0, r3
> >    0xf77d01d2 <+14>: pop {r3, pc}
> >    0xf77d01d4 <+16>: ; <UNDEFINED> instruction: 0001fe40
> >    0xf77d01d8 <+20>: ; <UNDEFINED> instruction: 00000000
> > End of assembler dump.
> 
> interesting, i thought the linker would relax __tls_get_addr
> calls in static linked executables to local-exec access model,
> but maybe that's not implemented on arm.
> 
> can you look at the readelf -aW a.out and see if the linker
> left any dynamic tls relocs in the exe?
> 
> > 
> > OK, so what did __tls_get_addr return?
> > 
> > (gdb) b *(__cxa_get_globals + 10)
> > Breakpoint 3 at 0xf77d01ce: file
> > ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc, line 62.
> > (gdb) run
> > Breakpoint 3, 0xf77d01ce in __cxxabiv1::__cxa_get_globals () at
> > ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc:62
> > 62 ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc: No such
> > file or directory.
> > (gdb) p/x $r0
> > $3 = 0x1
> > 
> > That's doesn't look good either. Let's look at __tls_get_addr.
> > 
> > void *__tls_get_addr(tls_mod_off_t *v)
> > {
> >         pthread_t self = __pthread_self();
> >         return (void *)(self->dtv[v[0]] + v[1]);
> > }
> > 
> > (gdb) disas __tls_get_addr (annotations added by me)
> > Dump of assembler code for function __tls_get_addr:
> > r2 = v[0]
> >    0x0000c940 <+0>: ldr r2, [r0, #0]
> > r3 = tls
> >    0x0000c942 <+2>: ; <UNDEFINED> instruction: 0xee1d3f70
> > self = r3 - 120
> > r3 = *(r3 - 116) = self->dtv
> >    0x0000c946 <+6>: ldr.w r3, [r3, #-116]
> > r0 = v[1]
> >    0x0000c94a <+10>: ldr r0, [r0, #4]
> > r2 = *(r3 + r2 * 4) = dtv[r2]
> >    0x0000c94c <+12>: ldr.w r2, [r3, r2, lsl #2]
> > r0 = r0 + r2
> >    0x0000c950 <+16>: add r0, r2
> >    0x0000c952 <+18>: bx lr
> > End of assembler dump.
> > (gdb) b __tls_get_addr
> > 
> > So after pulling out the values I end up with:
> > v = 0xf77f000c
> > 0xf77f000c: 0x00000000 0x00000000 0xf77f072c 0xf77eff1c
> > v[0] = 0
> > v[1] = 0
> > self = <builtin_tls> 0xf77f064c
> > 0xf77f064c <builtin_tls>: 0xf77f064c 0xf77f06dc 0xf77f064c 0xf77f064c
> > self->dtv = <builtin_tls+144> 0xf77f06dc
> > 0xf77f06dc <builtin_tls+144>: 0x00000001 0xf77f06cc 0x00000000 0x00000000
> > return = dtv[v[0]] + v[1] = 1 + 0 = 1
> > 
> > At this point I'm out of my depth.
> 
> dtv[0] is special and just stores the length of the dtv
> (which is correctly 1, meaning there is 1 elf module with tls).
> 
> so the issue is that v[0]==0 instead of 1 (dtv[1] would hold
> the tls address of the executable) i think v[0] should be set
> up by the static linker so it may be a binutils bug.
> (or maybe static linking is special and then dtv[0] should be
> set equal to dtv[1]?)

MIPS had the exact same bug; I think it's fixed upstream now since we
had the patch for binutils 2.27 in mcm but dropped it in 2.32:

https://github.com/richfelker/musl-cross-make/blob/master/patches/binutils-2.27/0004-mips-pie-tls.diff

It affects static PIE only, so -no-pie should be able to work around
it (conditional for setting the 1 was essentially !ET_DYN instead of
is_executable).

We could work around this in musl in one of two ways: rcrt1 could
process DTPMOD relocs and write a 1, or the static version of
__init_tls could just duplicate dtv[1] in dtv[0] (the latter is only
valid up until we add static-linked dlopen, if we ever do that).
However I was fairly strongly against doing that for MIPS, since it
was a binutils bug and only affected static linking (so it's not a
runtime ABI issue; the binary either works at link-time or it doesn't)
and I'd lean towards treating ARM the same.

It should be easy to make and apply the binutils patch to fix this,
and shouldn't be too hard to grep for the same bug in all archs (since
binutils nicely duplicates this logic for every single arch, for no
reason whatsoever).

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.