Date: Fri, 7 Jan 2022 11:41:49 -0800
From: Colin Cross <ccross@...gle.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] Add mallinfo2 and mallinfo

On Thu, Jan 6, 2022 at 7:32 PM Rich Felker <dalias@...c.org> wrote:
>
> On Thu, Jan 06, 2022 at 03:42:52PM -0800, Colin Cross wrote:
> > On Thu, Jan 6, 2022 at 2:00 PM Rich Felker <dalias@...c.org> wrote:
> > >
> > > On Thu, Jan 06, 2022 at 12:37:09PM -0800, Colin Cross wrote:
> > > > glibc introduced mallinfo2 [1], which solves some of the arguments [2]
> > > > against including mallinfo in musl by expanding the width of the
> > > > returned counters from int to size_t.
> > > >
> > > > This patch implements mallinfo2 without requiring any additional
> > > > metadata.  It iterates through the meta_areas and metas in order
> > > > to count mmap, large and small allocations, and produces ordblks,
> > > > hblks, hblkhd, uordblks and fordblks values.
> > > >
> > > > Once mallinfo2 exists, it is trivial to implement mallinfo that caps
> > > > the mallinfo2 outputs such that they fit in the int fields returned
> > > > by mallinfo.
> > > >
> > > > [1] https://sourceware.org/git/?p=glibc.git;a=commit;h=e3960d1c57e57f33e0e846d615788f4ede73b945
> > > > [2] https://www.openwall.com/lists/musl/2018/01/17/2
> > >
> > > Historically, mallinfo was omitted intentionally in musl partly
> > > because of the wrong-types issue (fixed by mallinfo2), but also partly
> > > because the set of data items returned is built around certain
> > > assumptions about the malloc implementation that aren't necessarily
> > > valid, especially for our allocators. This could be revisited, but I'm
> > > not sure we'll find good justification to add it.
> >
> > Many of the mallinfo fields are meaningless and left zero.  I left
> > arenas and keepcost zero because mallocng never puts allocations on
> > the heap, only metadata.  Similarly, all the fastbin-related fields
> > are zero because fastbin seems very specific to glibc's
> > implementation.  The remainder seem to map to mallocng fairly well:
> > hblks and hblkhd track the number and total size of mmap allocations,
> > uordblks tracks the total size of all allocations, ordblks and
> > fordblks track the number and total size of the free slots.
>
> Here I'd tend to disagree and I think this highlights how the model
> doesn't map well. I would not call the metadata "the heap" just
> because it's (when possible) obtained with brk. I would call "the
> heap" the storage that's managed by the allocator's bookkeeping for
> reuse without going through a per-allocation syscall to allocate and
> free it, and distinguish that only from individually-mmapped units
> that are guaranteed to be unmapped when freed.

I think you're right; I'll fix arena to count the bytes backing the grouped
allocations.
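
Probably something along these lines next to accumulate_meta() -- a sketch
only; the handling of the !maplen cases below is my reading of the group
layout, not something I've verified:

    /* Sketch: how arena might be accumulated per meta group.  Field names
       (sizeclass, maplen, freeable, last_idx) and the size_classes[]/UNIT
       sizing follow their usage in the patch; sc >= 48 groups are the
       individually mmapped ones already counted in hblkhd. */
    static void accumulate_arena(struct mallinfo2 *mi, struct meta *g)
    {
        int sc = g->sizeclass;
        if (sc >= 48) return;
        if (g->maplen) {
            /* the group has its own mapping */
            mi->arena += (size_t)g->maplen*4096;
        } else if (!g->freeable) {
            /* brk- or donation-backed group storage */
            mi->arena += (size_t)size_classes[sc]*UNIT*(g->last_idx+1);
        }
        /* freeable && !maplen: nested inside another group's slot, so its
           storage is already counted via the outer group */
    }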

> > > > The motivation for this patch is an attempt to use musl instead of glibc
> > > > to build host tools used when building the Android platform and the
> > > > tools that are distributed to app developers as part of the Android SDK.
> > > > mallinfo is used in a variety of third-party code built as part of
> > > > building Android, and tests and benchmarks in the Android tree.
> > > >
> > > > The implementation has been lightly tested with bionic's malloc.mallinfo
> > > > and malloc.mallinfo2 tests, which verify that a variety of different
> > > > allocation sizes result in an increase of the uordblks value by at
> > > > least the usable size of the returned allocation.
> > > >
> > > > I can keep this as a local patch in Android if it is still not acceptable
> > > > for musl.
> > >
> > > Is there a reason not to just #ifdef HAVE_MALLINFO it out, or do a
> > > dummy implementation, or one that makes up semi-reasonable numbers
> > > purely based on /proc/self/maps without poking at malloc internals?
> >
> > It's often used in tests and benchmarks to verify that calling the
> > method repeatedly doesn't leak memory.  Given how easy it was to
> > implement I'd probably keep this implementation in Android rather than
> > try to deduce it from /proc/self/maps, which could contain many
> > non-malloc anonymous pages and can't tell the difference between
> > allocated and freed memory.
>
> For testing lack of memory leak, I'd think "whole process vm size"
> would be more meaningful than something from mallinfo. For example
> mallinfo wouldn't catch accidentally re-mmapping a file each time it's
> used without unmapping it, or manual mmap of MAP_ANON.
>
> > Somewhat relatedly, some patches I wrote for the kernel in Android to
> > allow naming anonymous memory regions have finally gone into the
> > linux-next branch [1].  If that becomes widely enabled then musl's
> > allocator could tag mmap regions, which would provide more information
> > for parsing stats out of /proc/self/maps.  It still couldn't tell the
> > difference between allocated and free though.
>
> I don't think this is something we'd do; it would significantly
> increase runtime cost. (Keep in mind mallocng's groups are not
> intended to necessarily be long-lived.)
>
> > > > diff --git a/src/malloc/mallocng/mallinfo.c b/src/malloc/mallocng/mallinfo.c
> > > > new file mode 100644
> > > > index 00000000..c60840b1
> > > > --- /dev/null
> > > > +++ b/src/malloc/mallocng/mallinfo.c
> > > > @@ -0,0 +1,73 @@
> > > > +#include <limits.h>
> > > > +#include <malloc.h>
> > > > +#include <stddef.h>
> > > > +
> > > > +#include "glue.h"
> > > > +#include "meta.h"
> > > > +
> > > > +static void accumulate_meta(struct mallinfo2 *mi, struct meta *g) {
> > > > +  int sc = g->sizeclass;
> > > > +  if (sc >= 48) {
> > > > +    // Large mmap allocation
> > > > +    mi->hblks++;
> > > > +    mi->uordblks += g->maplen*4096;
> > > > +    mi->hblkhd += g->maplen*4096;
> > > > +  } else {
> > > > +    if (g->freeable && !g->maplen) {
> > > > +      // Slots of this group are embedded in a larger slot; avoid double counting
> > > > +      // by subtracting the size of the larger slot from the total used memory.
> > > > +      struct meta* outer_g = get_meta((void*)g->mem);
> > > > +      int outer_sc  = outer_g->sizeclass;
> > > > +      int outer_sz = size_classes[outer_sc]*UNIT;
> > > > +      mi->uordblks -= outer_sz;
> > > > +    }
> > > > +    int sz = size_classes[sc]*UNIT;
> > > > +    int mask = g->avail_mask | g->freed_mask;
> > > > +    int nr_unused = __builtin_popcount(mask);
> > > > +    mi->ordblks += nr_unused;
> > > > +    mi->uordblks += sz*(g->last_idx+1-nr_unused);
> > > > +    mi->fordblks += sz*nr_unused;
> > > > +  }
> > > > +}
> > >
> > > For upstreaming, __builtin_popcount wouldn't be usable. But aside from
> > > that, the approach here looks roughly correct. I don't see any
> > > correction for the case where g->last_idx==1 and sc<48, in which
> > > case it's possible that maplen is less than the length for the size
> > > class. These should probably be treated like "individually mmapped"
> > > allocations. This is one place where trying to fit the mallinfo data
> > > model with an allocator that doesn't match its assumptions is
> > > something of a hack.
> >
> > I'll take a look at this, I assume you mean when g->last_idx==0?
>
> Yes. If g->last_idx==0, you need to use g->maplen to determine the
> size. Whether that's better counted as "individually mmapped" or
> "heap" I'm not sure.

Fixed in V2.  I counted it as "heap", since the mallinfo man page
explicitly mentions a size threshold for using mmap, and these are
below mallocng's mmap threshold.  I don't think it matters much which
way it is counted though.
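
For reference, the idea in V2 is roughly the following (a paraphrased
sketch, not the exact hunk): when a group holds a single slot, derive the
slot size from maplen rather than the nominal size class length.

    /* Sketch of the V2 sizing rule: a single-slot group (last_idx == 0)
       mapped on its own may be shorter than size_classes[sc]*UNIT, so take
       its size from maplen instead. */
    static size_t group_slot_size(const struct meta *g)
    {
        int sc = g->sizeclass;
        if (g->last_idx == 0 && g->maplen)
            return (size_t)g->maplen*4096;
        return (size_t)size_classes[sc]*UNIT;
    }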

> > > > +static void accumulate_meta_area(struct mallinfo2 *mi, struct meta_area *ma) {
> > > > +  for (int i=0; i<ma->nslots; i++) {
> > > > +    if (ma->slots[i].mem) {
> > > > +      accumulate_meta(mi, &ma->slots[i]);
> > > > +    }
> > > > +  }
> > > > +}
> > > > +
> > > > +struct mallinfo2 mallinfo2() {
> > > > +  struct mallinfo2 mi = {0};
> > > > +
> > > > +  rdlock();
> > > > +  struct meta_area *ma = ctx.meta_area_head;
> > > > +  while (ma) {
> > > > +    accumulate_meta_area(&mi, ma);
> > > > +    ma = ma->next;
> > > > +  }
> > > > +  unlock();
> > > > +
> > > > +  return mi;
> > > > +}
> > > > +
> > > > +#define cap(x) (((x) > INT_MAX) ? INT_MAX : (x))
> > > > +
> > > > +struct mallinfo mallinfo() {
> > > > +  struct mallinfo mi = {0};
> > > > +  struct mallinfo2 mi2 = mallinfo2();
> > > > +
> > > > +  mi.arena = cap(mi2.arena);
> > > > +  mi.ordblks = cap(mi2.ordblks);
> > > > +  mi.smblks = cap(mi2.smblks);
> > > > +  mi.hblks = cap(mi2.hblks);
> > > > +  mi.hblkhd = cap(mi2.hblkhd);
> > > > +  mi.usmblks = cap(mi2.usmblks);
> > > > +  mi.fsmblks = cap(mi2.fsmblks);
> > > > +  mi.uordblks = cap(mi2.uordblks);
> > > > +  mi.fordblks = cap(mi2.fordblks);
> > > > +  mi.keepcost = cap(mi2.keepcost);
> > > > +
> > > > +  return mi;
> > > > +}
> > > > --
> > > > 2.34.1.448.ga2b2bfdf31-goog
> > >
> > > If the API is added upstream, it really should be provided by both
> > > mallocng and oldmalloc, with the legacy mallinfo (int) wrapper, if
> > > any, in src/malloc rather than src/malloc/mallocng. Available
> > > functions should not differ based on --with-malloc choice.
> >
> > I can add the other implementations if this is likely to be accepted;
> > if I'm keeping this in Android, I won't bother.
>
> Indeed, this is only relevant if it's adopted as public API.
>
> Rich
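
One more note on the __builtin_popcount comment earlier in the thread: for
an upstreamable version I'd swap in a small portable popcount, something
like this (untested sketch, needs <stdint.h> for uint32_t):

    /* Portable 32-bit population count (SWAR); enough for the 32-bit
       avail/freed masks and avoids the compiler builtin. */
    static int popcount32(uint32_t x)
    {
        x = x - ((x >> 1) & 0x55555555);
        x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
        x = (x + (x >> 4)) & 0x0f0f0f0f;
        return (x * 0x01010101) >> 24;
    }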
