Date: Mon, 2 Feb 2015 22:54:07 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: GNU Emacs LD_PRELOAD build hack

Background: GNU Emacs' build process depends on the ability of the
build-stage binary (temacs) to "dump" itself to a new executable file
containing preloaded lisp objects/state in its .data segment. This
process is highly non-portable even in principle; in practice, the big
issue is where malloc allocations end up. They need to all be
contiguous just above the .data/.bss in the original binary so that
they can become part of the .data mapping. With musl's malloc, this
can fail in two major ways:

1. musl uses mmap for large allocations (roughly, > 128-256k) and has
   no mechanism for obtaining such large objects from the main
   brk-based heap, or even for requesting that (whereas glibc has
   mallopt and/or an environment variable to control the mmap
   threshold, and emacs cheats by using that to control glibc).

2. musl reclaims the gaps around the edges of writable mappings in the
   main program and shared libraries and uses them for malloc. If
   these are in shared libraries, they won't be dumped at all, and if
   they're in the main program, they actually overlap with .text on
   disk (the same page is mapped twice; this is the cause of the gaps)
   and thus the .text, not the heap data, gets written out to disk by
   the dumper.
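
To make issue 2 concrete, here is a minimal sketch of the page
arithmetic behind those "gaps": the unused bytes between a page
boundary and the start of a writable segment, and between the end of
the segment and the next page boundary. This is an illustration only,
not musl's code; `struct gaps` and `segment_gaps` are hypothetical
names, and the addresses in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two unused byte ranges that share
 * pages with a writable segment occupying [start, end). */
struct gaps {
    uintptr_t before_start, before_len; /* page start .. segment start */
    uintptr_t after_start, after_len;   /* segment end .. next page start */
};

/* Compute the gaps around a segment. page_size must be a power of two. */
static struct gaps segment_gaps(uintptr_t start, uintptr_t end,
                                uintptr_t page_size)
{
    struct gaps g;
    uintptr_t mask = page_size - 1;

    assert(page_size != 0 && (page_size & mask) == 0);
    assert(start <= end);

    g.before_start = start & ~mask;          /* round start down to a page */
    g.before_len   = start - g.before_start;
    g.after_start  = end;
    g.after_len    = ((end + mask) & ~mask) - end; /* round end up to a page */
    return g;
}
```

For example, with 4k pages, a segment occupying 0x601e10..0x602148
leaves 0xe10 bytes free below it and 0xeb8 bytes free above it. Memory
handed out from either range sits outside the contiguous heap the
dumper expects.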

Emacs provides its own malloc replacement and tries to use it by
default, but this has to be disabled with musl, since replacing malloc
in dynamic programs doesn't work (and static binaries don't work right
at all with emacs' dumper because libc state would get included in the
dump -- state which is "intentionally lost" when it resides in a
shared library whose state isn't dumped).

The right solution: As I discussed on the emacs-devel list nearly a
year ago, the right solution is to get rid of the non-portable code in
emacs, dumping the lisp heap and its data (rather than the whole
program) to a file and either mmapping it at runtime (and possibly
relocating pointers in it, if the new location it's loaded at differs)
or converting it to a C source file that's compiled and linked and for
which the (static or dynamic) linker can perform relocations at
link/load time. This solution also solves a number of other serious
issues related to the dumper, including its incompatibility with PIE
binaries.
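
The "relocating pointers" step mentioned above could look roughly like
the sketch below. It assumes a dump format that does not exist in
emacs today: a flat array of words plus a table listing which words
hold pointers into the image. `relocate_image` and all of its
parameters are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebase the pointer-valued words of a dumped image.
 *   image, nwords     : the image as loaded, viewed as machine words
 *   ptr_slots, nslots : indices of the words that hold pointers
 *   old_base          : address the image had when it was dumped
 *   new_base          : address the image is loaded at now */
static void relocate_image(uintptr_t *image, size_t nwords,
                           const size_t *ptr_slots, size_t nslots,
                           uintptr_t old_base, uintptr_t new_base)
{
    /* Unsigned arithmetic wraps, so this also works when
     * new_base < old_base. */
    uintptr_t delta = new_base - old_base;

    for (size_t i = 0; i < nslots; i++) {
        assert(ptr_slots[i] < nwords);
        image[ptr_slots[i]] += delta;
    }
}
```

If the image happens to load at the address it was dumped from, delta
is zero and the loop changes nothing, which is the "if the new
location differs" case in the text. Words not listed in the pointer
table (plain integers, string bytes) are left untouched.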

Unfortunately, the right solution requires a significant overhaul by
someone with expertise in emacs internals, and it's not practical in
the short term. Meanwhile, we have users wanting emacs on musl-based
distros (myself included).

So, here's an alternate solution.

The hack: The basic trick is that we need to satisfy emacs' assumptions
about malloc, but only at build (dumping) time, not permanently. My
first thought was to build emacs in the presence of a modified musl
libc.so whose malloc never uses mmap (issue 1) and never reclaims gaps
at the edge of writable mappings (issue 2), but then I realized we
could achieve the same thing without having to build a custom libc.so
at package-build time by exploiting LD_PRELOAD.

The attached file is my current draft of the LD_PRELOAD module to be
loaded when running temacs to dump. In short, what it does is:

- Throws away (wastes/leaks) and retries whenever it gets a result
  from malloc that's not between the initial value of the brk and the
  current (after malloc) value of the brk, i.e. anything not on the
  "main heap" that's contiguous with .bss/.data.

- For large allocations that would be serviced by mmap, and which
  musl's malloc won't/can't allocate from the "main heap", allocates
  64k at a time, many times, from the heap, and exploits knowledge of
  the malloc chunk header/footer structures to paste them together
  into one large chunk. (If the wrapper can't get contiguous chunks
  for this, then malloc will simply report failure.)
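
The first bullet can be sketched as below. This is not the attached
preload.c, just a minimal illustration of the leak-and-retry idea. The
underlying allocator and the "current brk" query are passed in as
function pointers so the logic can be exercised in isolation; in a
real preload module they would presumably be the real malloc and
something like sbrk(0). All names here (`heap_only_alloc`,
`on_main_heap`, the `demo_*` stand-ins) and the `max_tries` bound are
assumptions of the sketch. The chunk-pasting in the second bullet
depends on malloc internals and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void *(*alloc_fn)(size_t); /* underlying allocator */
typedef void *(*brk_fn)(void);     /* returns the current program break */

/* True if p lies in [heap_start, heap_end). */
static int on_main_heap(const void *p, const void *heap_start,
                        const void *heap_end)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)heap_start && a < (uintptr_t)heap_end;
}

/* Allocate n bytes, accepting only results between the initial brk and
 * the brk as it stands after the allocation. Rejected results are
 * deliberately leaked so the allocator cannot hand them back. */
static void *heap_only_alloc(size_t n, alloc_fn underlying, brk_fn cur_brk,
                             const void *initial_brk, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        void *p = underlying(n);
        if (!p)
            return NULL;                 /* underlying allocator failed */
        if (on_main_heap(p, initial_brk, cur_brk()))
            return p;
        /* off-heap result: leak it and try again */
    }
    return NULL;                         /* gave up */
}

/* --- Demo stand-ins, for illustration only --- */
static char demo_heap[256];  /* pretends to be the brk heap */
static char demo_mmap[256];  /* pretends to be an mmap or gap region */
static size_t demo_calls;

/* Returns an off-heap pointer twice, then an on-heap pointer. */
static void *demo_alloc(size_t n)
{
    (void)n;
    return demo_calls++ < 2 ? (void *)demo_mmap : (void *)demo_heap;
}

static void *demo_brk(void)
{
    return demo_heap + sizeof demo_heap;
}
```

With the demo stand-ins, heap_only_alloc(16, demo_alloc, demo_brk,
demo_heap, 8) discards the first two results and returns the third.
The upper bound is re-queried after every attempt because a successful
allocation may itself have moved the brk.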

The first part of the hack is simple and clean. The second part is
hideously ugly, but the key point to realize is that it's only making
an assumption about the library implementation used at build time, not
when the emacs binary is later run. The dumped emacs does not include
any code from the LD_PRELOAD hack and it does not depend on the
assumptions made in the hack still being valid for the libc.so that's
used at runtime. If these assumptions do become invalidated (unlikely,
but possible), then all that's needed to get emacs building again is
updating the hack (or just building with an outdated libc.so). With
any luck, the non-portable dumping in emacs will be fixed long before
this is needed, anyway.

Over the next few days I hope to be working with people doing Alpine
Linux (and/or other dists) packaging to get this turned into a clean,
reproducible build procedure for GNU-Emacs-on-musl. In the meantime,
the source for the hack is attached in case anyone wants to start
hacking on it.

Rich

View attachment "preload.c" of type "text/plain" (1799 bytes)