Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Sun, 17 Apr 2022 12:50:32 -0700
From: Fangrui Song <i@...kray.me>
To: musl@...ts.openwall.com
Cc: Rich Felker <dalias@...c.org>, Timo Teras <timo.teras@....fi>
Subject: Re: option to enable eh_frame

On Sat, Jul 17, 2021 at 12:33 AM Timo Teras <timo.teras@....fi> wrote:
>
> On Fri, 16 Jul 2021 11:57:35 -0400
> Rich Felker <dalias@...c.org> wrote:
>
> > On Fri, Jul 16, 2021 at 12:16:25PM +0300, Timo Teras wrote:
> >
> > The debugger already can do debugging/unwinding because it has access
> > to the debug information (if you've installed it) and there is a
> > clear, understood-by-users contract that this information is not an
> > inherent part of the program but something optional for external
> > debugging tools only.
>
> This apparently the fundamental difference in thinking. The debugging
> information is often huge, and people don't want to install it on
> production or embedded devices. It ships ton of other features which
> are not needed. However, the unwind/backtrace feature does provide
> significant value for error reports and profiling.
>
> > > > Arguments to have .eh_frame:
> > >  - It allows debugging things even if musl-dbg is not or cannot be
> > >    installed for some reason (e.g. gdb, valgrind), or is no longer
> > >    available
> > >  - libunwind/libexecinfo will start to work and give meaningful
> > >    backtraces
> >
> > This is explicitly a reason not to. backtrace() considered harmful.
>
> Please elaborate. I agree it's harmful from SEGV handler when program
> might be targeted. But used in other situations where it is useful. Do
> you have additional reasons why backtrace() is harmful?
>
> Do note that libunwind --enable-debug-frame is enabled default by
> upstream package on ARM and aarch64, and this would work if the debug
> package is installed.
>
> This gives inconsistent behaviour based on arch and whether -dbg is
> installed. Developer might think the unwind stuff works, only to find
> it breaks on different install, and make the application depend on -dbg.
>
> > >  - Continuous kernel based profiling (e.g. perf record -g dwarf)
> > > will work
> >
> > This already works if you have debug info.
>
> As said, it was desire to make it work without debug info.
>
> > > Given that the main arguments against are either making UB crash, or
> > > not the best fix, and keeping eh_frame enables useful features to
> > > work, I think it would make sense to allow enabling it.
> > >
> > > Please consider the the attached patch to make it a configure
> > > option to enable keeping eh_frame (defaulting still to not keeping
> > > it).
> >
> > You can solve this problem just as well for the things you want to
> > have work by including the (part of) the debug info you want in the
> > main libc.so binary: .debug_frame. Of course I can't stop Alpine from
> > doing it in a different way locally, but I would strongly recommend
> > you do that rather than making a contract that diverges from musl.
>
> The .debug_frame in main is a possibility, but no one else does it.
> Well, apparently some ship the full debug build always to achieve some
> of the same things. I think this also achieves the same divergence from
> the normal expectation.
>
> As small side note, the -fno-*unwind-tables flags are also in "tryflags"
> so if compiler would not support it, you don't get it. So depending on
> compiler you might still get .eh_frame ;)
>
> The .eh_frame is a ELF abi feature normally turned on, and usually
> needs good reasons to turn it off. Could you Rich please explain in
> detail all the reasons why you think it's so evil? Mostly I heard
> "feature X is evil and people should not use it" and I find that
> your personal opinion, not a valid technical reason. But listening
> more and earlier things, I've heard:
>
> 1. Having it makes it seem an application can jump through C library
>    from e.g. qsort callback. Not having .eh_frame makes C++ exception
>    fail, but does not help with e.g. setjmp/longjmp. So this is not
>    complete solution.
>
> 2. Explicitly disabling backtrace() universally. I assume this is to
>    disable to widespread usage of it from SEGV handler. But there
>    seem to be also implementations reading .debug_frame, so this does
>    not really fix that either fully. And I think there are valid use
>    cases for backtrace() also, e.g. when having debug build and having
>    internal debug printing trigger it.
>
> 3. Adds little bit if size to the binary. And also the runtime memory
>    usage as it goes to LOAD segment. But 80kB is not a concern on
>    modern environment even on embedded platforms.
>
> 4. You want to still enforce this so people don't do 1 or 2.
>    For this, time has taught me, people need education, not enforcement.
>    If you don't teach them, they'll still go and do the same stupid
>    things even more stupid ways.
>
> Please add in any reasons I may have missed. I would like to have your
> complete list of reasons why to remove .eh_frame. So far it has been
> coming out in pieces. I'd like constructive discussion if some of these
> items could be implemented stronger in other means than removing
> .eh_frame.
>
> Please also explain why you think the .eh_frame is sufficient even
> though it does not cover various situations as explained above.
>
> Now it may have been little bit early for me to go ahead and merge the
> change in Alpine before going through this discussion. The intention
> was not upset anyone or be unilateral. This is why I made the PR and
> waited for a good while for comments. Unfortunately, I did not find the
> comments convincing, but granted I should have asked the above questions
> before merging.
>
> Currently, I still think reasons to have .eh_frame than not are more
> weighty - it enables valid use cases in expected way. But I'm also
> willing to revert if there's stronger reasons to do it. Either
> technical, or just keeping peace. So far the technical reasons do not
> convince me, but keeping peace and good relationship is something I
> definitely want.
>
> I personally did not see this as an ABI or a "contract" change, and this
> is perhaps why I made the merge earlier than later.
>
> While the "contract" may *seem* somewhat different. It was still
> mentioned in the commit message that the change was not made to make
> C++ exception through C library work. The "contract promise" has not
> changed; though, granted, the user perceived "de facto contract" has
> changed in this case.
>
> As said, for me the primary reason is observability: being able to gdb
> and get back traces, and continuous profiling work without debug info.
> Those are precious things to have on system where debug info is simply
> not available. As mentioned backtrace() working is a side effect, and I
> think it has also valid use cases even if it's also often misused.
>
> But fundamentally I think if we ship .debug_frame, all people wanting
> do backtrace() and the silly stuff, will just enable .debug_frame
> support in their code and still do the silly stuff in worse way, than
> if .eh_frame was enabled. Since technically they are the same, even if
> the intended use case is different. As seen libunwind does have
> --enable-debug-frame.
>
> For me it's quite unusual to have .debug_frame but not other debug
> sections. I assume it works. But the tooling certain does not endorse
> this setup. I suppose it can be achieved with some objcopy magic, but
> requires care.
>
> Timo

I know this is a very old thread but I'd like to share some findings
on unwinding through musl functions.

https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler#musl-x86-64

My main finding is that nongnu libunwind does not recognize
__restore_rt as a signal trampoline on x86-64.
If -funwind-tables is used, __restore_rt may need some CFI instructions.
linux perf seems to use libunwind. I am not familiar with the tool and
so would like to hear how unwind tables improve linux perf usage on
musl.

An alternative to -funwind-tables is -fno-omit-frame-pointer.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.