Date: Sat, 26 Jan 2019 00:11:37 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: r yang <decatf@...il.com>, musl@...ts.openwall.com
Subject: Re: Infinite loop in malloc

* Szabolcs Nagy <nsz@...t70.net> [2019-01-25 23:28:32 +0100]:
> * r yang <decatf@...il.com> [2019-01-25 10:13:50 -0500]:
> > pmbootstrap is a development environment to build/install postmarketOS
> > (based on Alpine Linux) for Android devices. One of the things it does
> > is use qemu static to emulate an ARM based Alpine Linux chroot
> > environment.
> > 
> > There is a bug while compiling certain packages in the qemu ARM chroot.
> > The qemu process can get stuck in an infinite loop when calling malloc.
> > 
> > pmbootstrap uses Alpine Linux edge repositories. It's using the current
> > musl package version 1.1.20.
> > 
> > Here is a gdb backtrace.
> > #0  malloc (n=<optimized out>, n@...ry=9) at src/malloc/malloc.c:320
> > #1  0x0000000060184ad3 in g_malloc (n_bytes=n_bytes@...ry=9) at gmem.c:99
> > #2  0x000000006018bcab in g_strdup (str=<optimized out>, str@...ry=0x60200abf "call_rcu") at gstrfuncs.c:363
> > #3  0x000000006016e31d in qemu_thread_create (thread=thread@...ry=0x7ffe89fb1a10, name=name@...ry=0x60200abf "call_rcu",
> >     start_routine=start_routine@...ry=0x60174c00 <call_rcu_thread>, arg=arg@...ry=0x0, mode=mode@...ry=1) at /home/pmos/build/src/qemu-3.1.0/util/qemu-thread-posix.c:526
> > #4  0x0000000060174b99 in rcu_init_complete () at /home/pmos/build/src/qemu-3.1.0/util/rcu.c:327
> > #5  0x00000000601c4fac in __fork_handler (who=1) at src/thread/pthread_atfork.c:26
> > #6  0x00000000601be8db in fork () at src/process/fork.c:33

it seems the issue is simply that qemu-arm-static is a multi-threaded
process and here it forks and calls malloc in the fork handler of
the child process.

it's easy to imagine that if fork runs concurrently with a free in
another thread, the child inherits the malloc state mid-update and it
stays inconsistent there, hence the malloc in the child never completes.

i'm not sure if musl can detect or fix this up easily.
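
to illustrate the general hazard (not musl's or qemu's actual code): the sketch below uses an ordinary mutex as a stand-in for allocator-internal state. a helper thread holds it while the main thread forks, and an atfork child handler then probes it with pthread_mutex_trylock. all names here (state_lock, helper, child_handler, child_inherits_held_lock) are made up for the example. note that POSIX only promises async-signal-safe functions in the child of a multi-threaded process, so the probe itself is outside what the standard guarantees; it is there only to make the inherited state visible.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock guarding allocator-internal state. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int lock_held;     /* set once the helper owns state_lock */
static atomic_int release_lock;  /* set by the parent to let the helper finish */
static int child_saw_busy;       /* written only in the child */

/* Helper thread: takes the lock and keeps it across the fork. */
static void *helper(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&state_lock);
    atomic_store(&lock_held, 1);
    while (!atomic_load(&release_lock))
        sched_yield();
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* atfork child handler: runs in the child, where the helper thread does
 * not exist, so nobody is left to release the copied lock. */
static void child_handler(void)
{
    child_saw_busy = (pthread_mutex_trylock(&state_lock) == EBUSY);
}

/* Returns 1 if the child found the lock still held, 0 if not,
 * -1 if fork failed. */
int child_inherits_held_lock(void)
{
    pthread_t t;
    int status = 0;

    atomic_store(&lock_held, 0);
    atomic_store(&release_lock, 0);
    pthread_atfork(NULL, NULL, child_handler);
    assert(pthread_create(&t, NULL, helper, NULL) == 0);

    /* Fork only after the helper is known to hold the lock. */
    while (!atomic_load(&lock_held))
        sched_yield();

    pid_t pid = fork();
    if (pid == 0)
        _exit(child_saw_busy ? 0 : 1);

    /* Parent: let the helper finish, then collect the child's verdict. */
    atomic_store(&release_lock, 1);
    pthread_join(t, NULL);
    if (pid < 0)
        return -1;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

built with -pthread and called from main, child_inherits_held_lock() is expected to return 1: the child gets a copy of the locked mutex but not the thread that would unlock it. a blocking lock in the handler would hang, which is the same shape of failure as a malloc in the child that cannot make progress.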


> > #7  0x000000006009d191 in do_fork (env=0x62ef0ed0, flags=flags@...ry=17, newsp=newsp@...ry=0, parent_tidptr=parent_tidptr@...ry=0, newtls=newtls@...ry=0,
> >     child_tidptr=child_tidptr@...ry=0) at /home/pmos/build/src/qemu-3.1.0/linux-user/syscall.c:5528
> > #8  0x00000000600af894 in do_syscall1 (cpu_env=cpu_env@...ry=0x62ef0ed0, num=num@...ry=2, arg1=arg1@...ry=0, arg2=arg2@...ry=-8700192, arg3=<optimized out>, arg4=8,
> >     arg5=1015744, arg6=-75664, arg7=0, arg8=0) at /home/pmos/build/src/qemu-3.1.0/linux-user/syscall.c:7042
> > #9  0x00000000600a835c in do_syscall (cpu_env=cpu_env@...ry=0x62ef0ed0, num=2, arg1=0, arg2=-8700192, arg3=<optimized out>, arg4=<optimized out>, arg5=1015744, arg6=-75664,
> >     arg7=0, arg8=0) at /home/pmos/build/src/qemu-3.1.0/linux-user/syscall.c:11533
> > #10 0x00000000600c265f in cpu_loop (env=env@...ry=0x62ef0ed0) at /home/pmos/build/src/qemu-3.1.0/linux-user/arm/cpu_loop.c:360
> > #11 0x00000000600417a2 in main (argc=<optimized out>, argv=0x7ffe89fb5958, envp=<optimized out>) at /home/pmos/build/src/qemu-3.1.0/linux-user/main.c:819
> > 
> > 
> > It is taking the malloc code path where n <= MMAP_THRESHOLD. None of
> > the conditions which break from the for loop are met.
> > 
> > In the first condition the mask value is never zero:
> >     mask = mal.binmap & -(1ULL<<i);
> >     if (!mask) { ... }
> > 
> > Examining the value in gdb:
> > (gdb) printf "%X\n", mask
> > 204701
> > 
> > The bin head points to the bin itself so this condition is never met:
> >     c = mal.bins[j].head;
> >     if (c != BIN_TO_CHUNK(j)) { ... }
> > 
> > Examining the values in gdb:
> > (gdb) printf "%X\n", mal.bins[j].head
> > 62337FC0
> > (gdb) printf "%X\n", (struct chunk *)((char *)(&mal.bins[j].head) - (2*sizeof(size_t)))
> > 62337FC0
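
the two observations above can be combined into a simplified model of the search loop (a sketch, not the musl source): a bit is set in binmap but the corresponding bin is empty, so every pass picks the same bin and neither exit condition fires. struct model, is_empty, search_bins and the max_iters cap are assumptions made for the example; the real loop has no cap and also takes and releases a per-bin lock, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

#define NBINS 64

/* Simplified model of the allocator state. is_empty[j] != 0 stands for
 * "mal.bins[j].head == BIN_TO_CHUNK(j)", i.e. bin j holds no chunks. */
struct model {
    uint64_t binmap;
    unsigned char is_empty[NBINS];
};

/* Index of the lowest set bit; x must be nonzero. */
static int first_set(uint64_t x)
{
    int j = 0;
    while (!(x & 1)) {
        x >>= 1;
        j++;
    }
    return j;
}

/* Returns the bin a chunk would be taken from, -1 if the heap would be
 * expanded instead, or -2 if max_iters passes went by without either
 * exit condition being met (the cap exists only in this model). */
int search_bins(const struct model *m, int i, int max_iters)
{
    assert(i >= 0 && i < NBINS);
    for (int iter = 0; iter < max_iters; iter++) {
        /* Keep only the bits for bins >= i. */
        uint64_t mask = m->binmap & -(1ULL << i);
        if (!mask)
            return -1;      /* first exit: no candidate bin at all */
        int j = first_set(mask);
        if (!m->is_empty[j])
            return j;       /* second exit: bin j has a chunk */
        /* Bit j is set but bin j is empty: nothing in this loop clears
         * the bit, so the next pass selects the same j again. */
    }
    return -2;
}
```

with binmap = 0x204701 (the value printed above) and i = 0, the lowest set bit is bit 0, so the model returns 0 when bin 0 is non-empty and -2 when is_empty[0] is set. the second case corresponds to the state seen in gdb, where the head of the selected bin points back at the bin itself.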
> > 
> > 
> > Reproducing this issue:
> > It is not always 100% reproducible. On occasion it will not get stuck
> > in an infinite loop. In my testing on 2 computers, it happens on
> > most attempts to compile.
> 
> thanks i managed to reproduce this on my laptop with the commands below.
> i'll try to look into it.
> 
> > 
> > $ git clone https://gitlab.com/postmarketOS/pmbootstrap.git
> > $ cd pmbootstrap
> > 
> > Configure pmbootstrap
> > $ ./pmbootstrap.py init
> > 
> > Enter an Android device when prompted.
> > Use device: samsung-i9100
> > Leave other settings as the default.
> > 
> > Check out the pmaports repository that will reproduce this issue.
> > $ cd /path/to/pmbootstrap/aports
> > $ git remote add ryang2678 https://gitlab.com/ryang2678/pmaports.git
> > $ git fetch ryang2678 debug-musl-malloc
> > $ git checkout debug-musl-malloc
> > 
> > Compile qemu static with debug symbols.
> > Alpine Linux doesn't provide a qemu package with debug symbols.
> > The debug-musl-malloc branch contains a qemu APKBUILD with debugging
> > enabled.
> > $ cd /path/to/pmbootstrap
> > $ ./pmbootstrap.py build qemu
> > 
> > Try to compile networkmanager and wait for build to get stuck.
> > $ ./pmbootstrap.py build networkmanager --arch=armhf --force
> > 
> > 
> > To observe the stuck qemu process:
> > 
> > Enter chroot shell:
> > $ ./pmbootstrap.py chroot
> > 
> > Install musl debug symbols.
> > $ apk add musl-dbg
> > 
> > Get musl source code
> > $ cd /home/pmos
> > $ git clone git://git.musl-libc.org/musl
> > $ cd /home/pmos/musl
> > $ git checkout v1.1.20
> > 
> > Attach gdb to stuck process
> > $ gdb -tui /usr/bin/qemu-arm
> > (gdb) directory /home/pmos/musl
> > (gdb) attach <pid>
