Date: Sat, 28 Mar 2015 15:32:41 -0700
From: Konstantin Serebryany <konstantin.s.serebryany@...il.com>
To: Konstantin Serebryany <konstantin.s.serebryany@...il.com>, Rich Felker <dalias@...c.org>, 
	musl@...ts.openwall.com
Subject: Re: buffer overflow in regcomp and a way to find more of those

On Sat, Mar 28, 2015 at 3:00 PM, Szabolcs Nagy <nsz@...t70.net> wrote:
> * Szabolcs Nagy <nsz@...t70.net> [2015-03-23 13:35:40 +0100]:
>> * Konstantin Serebryany <konstantin.s.serebryany@...il.com> [2015-03-22 21:55:26 -0700]:
>> > On Sat, Mar 21, 2015 at 6:28 AM, Szabolcs Nagy <nsz@...t70.net> wrote:
>> > > i assume for that we still need to change the libc startup code, malloc
>> > > functions and may be some things around thread stacks
>> >
>> > Try to compile a simple file with asan:
>> >
>> > int main(int argc, char **argv) {
>> >   int a[10];
>> >   a[argc * 10] = 0;
>> >   return 0;
>> > }
>> >
>> >
>> > % clang -fsanitize=address  a.c -c
>> >
>> > % nm a.o | grep U
>> >                  U __asan_init_v5
>> >                  U __asan_option_detect_stack_use_after_return
>> >                  U __asan_report_store4
>> >                  U __asan_stack_malloc_1
>> >
>> > __asan_report_store4 should print an error message saying that
>> > "bad write of 4 bytes" happened in <current stack trace> on address <param>.
>> > Also make the other __asan_report_{store,load}{1,2,4,8,16} functions.
>> >
>> > __asan_init_v5 will be called by the module initializer.
>> > When called for the first time, it should mmap the shadow memory.
>> > https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
>> >
>
> it seems asan instrumented code with memory access cannot run
> before __asan_init_v5 does the shadow mapping (otherwise the
> compiler generated shadow access would crash)
>
Correct.
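
For reference, here is a rough C rendering of the check the compiler emits before each access, following the AddressSanitizerAlgorithm page linked above. The function names are made up for illustration, and the constants assume x86_64 Linux. The load from the shadow address in checked_store4 is the access that faults when the shadow is not mapped yet.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE  3               /* one shadow byte per 8 application bytes */
#define SHADOW_OFFSET 0x7fff8000ULL   /* x86_64 Linux shadow offset */

/* Application address -> address of its shadow byte. */
uintptr_t mem_to_shadow(uintptr_t addr) {
  return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* Shadow byte encoding: 0 = all 8 bytes addressable, 1..7 = only the first
 * k bytes addressable, negative = whole granule poisoned.
 * Valid for accesses of size <= 8 that do not cross an 8-byte granule. */
int access_is_bad(int8_t shadow_value, uintptr_t addr, size_t size) {
  if (shadow_value == 0)
    return 0;
  return (int8_t)((addr & 7) + size - 1) >= shadow_value;
}

/* Conceptual shape of an instrumented 4-byte store. The `report` callback
 * stands in for __asan_report_store4. Reading the shadow byte crashes if
 * the shadow memory has not been mapped. */
void checked_store4(int *p, int v, void (*report)(uintptr_t)) {
  uintptr_t addr = (uintptr_t)p;
  int8_t k = *(volatile int8_t *)mem_to_shadow(addr);
  if (access_is_bad(k, addr, sizeof *p))
    report(addr);
  *p = v;
}
```

With these constants, application address 0 maps to shadow 0x7fff8000 and the top user address 0x7fffffffffff maps to 0x10007fff7fff, which is where the shadow range quoted further down comes from.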

> this is problematic for dynamic linking because the loader
> calls various libc functions so those cannot be instrumented
> unless shadow memory is already in place

Yes, I have the same trouble with glibc and have to disable
instrumentation for some of the glibc functions
(by not passing -fsanitize=address), which is not optimal (we may miss
bugs in other calls to these functions).

>
> i managed to make a minimal asan runtime work with static linking
> (and then stack corruption is indeed detected).
> (i called __asan_init_v5 in the beginning of musl's __libc_start_main)

Nice!
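
For anyone following along, the non-init part of such a minimal runtime could look roughly like the sketch below. This is not the upstream runtime: asan_format_report, asan_report and the DEFINE_REPORT macro are hypothetical helpers, the two-argument signature of __asan_stack_malloc_1 is an assumption, and no stack trace is printed. __asan_init_v5, which has to map the shadow, is left out here.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read by instrumented code; 0 turns off use-after-return detection. */
int __asan_option_detect_stack_use_after_return = 0;

/* Not expected to be called while the flag above is 0. The signature is an
 * assumption; returning real_stack means "keep using the ordinary frame". */
uintptr_t __asan_stack_malloc_1(uintptr_t size, uintptr_t real_stack) {
  (void)size;
  return real_stack;
}

/* Hypothetical helper: formats the message, returns its length. */
int asan_format_report(char *buf, size_t n, const char *kind,
                       unsigned size, uintptr_t addr) {
  return snprintf(buf, n, "bad %s of %u bytes on address 0x%llx",
                  kind, size, (unsigned long long)addr);
}

/* Hypothetical helper: prints the message (no stack trace) and aborts. */
static void asan_report(const char *kind, unsigned size, uintptr_t addr) {
  char buf[96];
  asan_format_report(buf, sizeof buf, kind, size, addr);
  fprintf(stderr, "%s\n", buf);
  abort();
}

/* Generates __asan_report_{store,load}{1,2,4,8,16}. */
#define DEFINE_REPORT(kind, name, size)               \
  void __asan_report_##name##size(uintptr_t addr) {   \
    asan_report(kind, size, addr);                    \
  }

DEFINE_REPORT("write", store, 1)
DEFINE_REPORT("write", store, 2)
DEFINE_REPORT("write", store, 4)
DEFINE_REPORT("write", store, 8)
DEFINE_REPORT("write", store, 16)
DEFINE_REPORT("read", load, 1)
DEFINE_REPORT("read", load, 2)
DEFINE_REPORT("read", load, 4)
DEFINE_REPORT("read", load, 8)
DEFINE_REPORT("read", load, 16)
```

Inside libc itself one would probably avoid stdio in the report path (it may be instrumented too) and write the message with a raw write(2) instead.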


>
>> > __asan_option_detect_stack_use_after_return is a global, define it to 0.
>> > __asan_stack_malloc_1 -- just make it an empty function.
>> >
>> > Now, you can build code with asan and detect stack buffer overflows.
>> > (The reports won't be very detailed, but they will be correct).
>> > If you add poisoned redzones to malloc -- you get heap buffer overflows.
>> > If you delay the reuse of free-d memory -- you get use-after-free.
>> >
>> > If you then implement __asan_register_globals (it is called on module
>> > initialization and poisons redzones for globals)
>> > you get global buffer overflows.
>> >
>
> i haven't tried to do the heap/global poisoning
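
For the heap part, a toy model of redzone poisoning may help. The sketch below is an illustration only: it uses a hypothetical bump allocator (toy_malloc) over a static arena with its own small shadow array, instead of the real heap and the real shadow mapping. The redzone size is an arbitrary choice; 0xfa is the shadow value upstream asan prints for heap redzones, but any negative value works for the check.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REDZONE             32     /* arbitrary; must be a multiple of 8 */
#define HEAP_REDZONE_MAGIC  0xfa   /* negative when read as int8_t */

/* Toy arena plus one shadow byte per 8 arena bytes. */
static _Alignas(8) unsigned char arena[4096];
static int8_t arena_shadow[sizeof arena / 8];
static size_t arena_used;

/* Shadow byte for an address inside the toy arena. */
int8_t *shadow_of(const void *p) {
  return &arena_shadow[((const unsigned char *)p - arena) >> 3];
}

/* Bump allocator that surrounds each block with poisoned redzones. */
void *toy_malloc(size_t n) {
  if (n > sizeof arena)
    return NULL;
  size_t rounded = (n + 7) & ~(size_t)7;
  size_t total = REDZONE + rounded + REDZONE;
  if (total > sizeof arena - arena_used)
    return NULL;
  unsigned char *base = arena + arena_used;
  unsigned char *user = base + REDZONE;
  arena_used += total;

  memset(shadow_of(base), HEAP_REDZONE_MAGIC, REDZONE / 8);           /* left redzone */
  memset(shadow_of(user), 0, rounded / 8);                            /* user memory */
  if (n & 7)                                                          /* partial last granule */
    *shadow_of(user + (n & ~(size_t)7)) = (int8_t)(n & 7);
  memset(shadow_of(user + rounded), HEAP_REDZONE_MAGIC, REDZONE / 8); /* right redzone */
  return user;
}

/* Same check the compiler would emit, against the toy shadow.
 * Valid for size <= 8 and accesses that stay within one granule. */
int toy_access_is_bad(const void *p, size_t size) {
  int8_t k = *shadow_of(p);
  return k != 0 && (int8_t)(((uintptr_t)p & 7) + size - 1) >= k;
}
```

For a 10-byte allocation p, accesses to p[0..9] pass, while p[-1], p[10] and p[16] are flagged. Use-after-free detection would additionally poison the block on free and keep it out of circulation for a while.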
>
> it's not clear to me what's the best way to manage the shadow
> memory: mmap with PROT_NONE the entire 0x7fff8000 .. 0x10007fff8000
> range and then mmap with rw the subranges that shadow mmaped memory
> in the application?

You probably can do it because you control all mmap calls from libc
(from malloc and thread stack creation),
but the first time the user calls the mmap syscall directly, bypassing
libc, it will break.
We use MAP_NORESERVE to map the entire range at startup.
This has the drawback that the application uses 16 TB of virtual address
space, so tools like "ulimit -v" do not work.
But otherwise this works great.
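
A minimal sketch of that mapping, assuming x86_64 Linux and the shadow range quoted above. The names map_shadow and map_whole_shadow are hypothetical, and this is a simplification, not the upstream code.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define SHADOW_BEG 0x7fff8000ULL       /* shadow of application address 0 */
#define SHADOW_END 0x10007fff8000ULL   /* one past the shadow of 0x7fffffffffff */

/* Map [beg, end) read-write at a fixed address without reserving swap.
 * Pages are only backed by memory once they are touched.
 * Returns 0 on success, -1 on failure. */
int map_shadow(uintptr_t beg, uintptr_t end) {
  void *p = mmap((void *)beg, end - beg, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
                 -1, 0);
  return p == MAP_FAILED ? -1 : 0;
}

/* What an __asan_init-style function would call once at startup. */
int map_whole_shadow(void) {
  return map_shadow(SHADOW_BEG, SHADOW_END);
}
```

Note that MAP_FIXED silently replaces anything already mapped in the range, so this has to run before anything else can land there. The call can also fail under a "ulimit -v" limit or with strict overcommit settings.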

>
> then a modified mmap is needed to manage the shadow maps
>
> so i think for an asan+cov instrumented libc:
>
> - [S]crt1.s should do the initial shadow mmap before any c code gets run
> - mmap should be replaced to do shadow management

Only if you do not use the MAP_NORESERVE trick.

> - malloc etc should be replaced to handle shadow poisoning
> - the minimal asan and cov runtimes should be added to libc
> (so their symbols are available early in the loader)
>
> and then we can use such a libc for testing and fuzzing
> to catch heap/stack corruptions
>
> i guess it is possible to have a /lib/ld-muslasan-x86_64.so.1
> and Scrt1asan.o on a system and the compiler/linker could
> use those when compiling some code with asan+cov instrumentation

sounds great.

> (but this can get ugly if there will be more instrumentations
> that need runtime support in the future)
Yea. The core of the asan run-time is relatively easy to replicate, as
you've seen.
Probably, one can replicate msan and ubsan (MemorySanitizer,
UndefinedBehaviorSanitizer)
with comparable effort since most of the logic for those tools is in
the compiler.
The use-after-return detection in asan relies on a very non-trivial
part of the run-time.
tsan (ThreadSanitizer) has a much more complex run-time, which is hard
to replicate.

Maybe someday we'll make them work with static linking, but not any
time soon. :(

--kcc
