oss-security - Re: Out-of-bounds read & write in the glibc's qsort()

Date: Mon, 5 Feb 2024 15:37:24 -0300
From: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
To: Solar Designer <solar@...nwall.com>,
 Qualys Security Advisory <qsa@...lys.com>
Cc: oss-security@...ts.openwall.com
Subject: Re: Out-of-bounds read & write in the glibc's qsort()



On 05/02/24 14:23, Solar Designer wrote:
> On Mon, Feb 05, 2024 at 03:56:41PM +0000, Qualys Security Advisory wrote:
>> On Sun, Feb 04, 2024 at 05:35:20PM +0100, Solar Designer wrote:
>>> It's so invasive I cannot easily tell whether qsort() remained robust
>>> after it or not.  There's no longer a "tmp_ptr != base_ptr &&" check.
>>> So, lacking known-working tests in glibc tree, we don't know about glibc
>>> 2.39's status with respect to this issue.
>>
>> The "tmp_ptr != base_ptr" bounds check was originally added to the
>> _quicksort() function, but is not needed anymore in glibc 2.39 because
>> the old fallback to quick sort (the _quicksort() function) has been
>> completely removed and replaced by a fallback to heap sort.
>>
>> Note, just in case: we have not reviewed the implementation of this new
>> fallback to heap sort.
> 
> Oh, I should have spent a bit more time looking at the latest glibc
> before posting.  I just did.  So it indeed did not reintroduce this same
> issue.  That's great.
> 
> Regarding the tests, I now see that one of them explicitly calls
> heapsort_r(), so it tests that fallback code in this way, however the
> rest simply call qsort() or qsort_r(), so they only test non-fallback
> code.  It'd improve code coverage of these tests if they first do what
> they do now, and then repeat the same after setting RLIMIT_AS to 0.

Thanks for the heads up. I will see if I can take some time to improve
the current qsort coverage so that all tests exercise both mergesort
and the heapsort fallback.

The way we usually test internal interfaces (such as the heapsort
fallback) is to create 'internal' tests, which are essentially
statically linked tests that call the internal glibc interfaces.

I think they would be a better alternative, especially because some tests
do require malloc to create temporary buffers, and the RLIMIT_AS trick
would add extra complexity in such cases.
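
For illustration only, here is a minimal sketch of the RLIMIT_AS approach
discussed above. It is not taken from the glibc test suite or from the test
program attached earlier in the thread. The names sort_with_as_limit_zero()
and cmp_int() are hypothetical. The sketch assumes that lowering the soft
RLIMIT_AS to 0 makes qsort()'s temporary-buffer allocation fail for a
sufficiently large array, so that the fallback path runs; it does not
verify which path was actually taken, and that depends on the glibc version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Well-behaved (transitive) three-way comparison of two ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sort 'a' while the soft RLIMIT_AS is 0, then
   restore the previous limit and check the result.
   Returns 0 if the array ends up sorted, -1 on any failure. */
int sort_with_as_limit_zero(int *a, size_t n)
{
    struct rlimit old, lim;

    assert(a != NULL || n == 0);

    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;

    /* Lower only the soft limit so it can be raised again afterwards. */
    lim = old;
    lim.rlim_cur = 0;
    if (setrlimit(RLIMIT_AS, &lim) != 0)
        return -1;

    /* Assumption: new mappings now fail, so a large malloc() inside
       qsort() returns NULL and the fallback code is used. */
    qsort(a, n, sizeof a[0], cmp_int);

    /* Restore the limit before doing anything that may allocate. */
    if (setrlimit(RLIMIT_AS, &old) != 0)
        return -1;

    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return -1;
    return 0;
}
```

The array has to be allocated and filled before the call, since no new
memory can be mapped while the limit is in effect. That is the extra
complexity mentioned above for tests that need malloc themselves.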

> 
> On Mon, Feb 05, 2024 at 05:02:52PM +0800, Alexander E. Patrakov wrote:
>> On Mon, Feb 5, 2024 at 4:45 PM Alexander E. Patrakov <patrakov@...il.com> wrote:
>>> On Mon, Feb 5, 2024 at 4:40 PM Alexander E. Patrakov <patrakov@...il.com> wrote:
>>>> On Mon, Feb 5, 2024 at 12:36 AM Solar Designer <solar@...nwall.com> wrote:
>>>>> I don't have a glibc 2.39 build handy.  Perhaps someone on a distro that
>>>>> has already updated can run the attached test program and let us know?
>>>>
>>>> Here you go: no output on Arch Linux.
>>>>
>>>> [aep@...-haswell tmp]$ gcc ./glibc-qualys-rocky-qsort-test.c
>>>> [aep@...-haswell tmp]$ ./a.out
>>>> [aep@...-haswell tmp]$ /lib64/libc.so.6
>>>> GNU C Library (GNU libc) stable release version 2.39.
> 
>>> Sorry, I should have followed the instructions.
>>>
>>> [aep@...-haswell tmp]$ while true; do n=$((RANDOM*64+RANDOM+1));
>>> prlimit --as=$((n*4/2*3)) ./a.out $n; done
>>>
>>> This results in a mix of these outputs:
>>>
>>> PASSED
>>> ./a.out: error while loading shared libraries: libc.so.6: failed to
>>> map segment from shared object
>>> Segmentation fault
> 
>> Upon investigation, I have to add: the segmentation faults come from
>> code that runs before main(), so they do not indicate a problem in
>> qsort().
> 
> Sorry, I should have included usage instructions.  It's like this:
> 
> gcc glibc-qualys-rocky-qsort-test.c -o glibc-qualys-rocky-qsort-test -O2
> while true; do n=$((RANDOM*64+RANDOM+1)); echo $n; ./glibc-qualys-rocky-qsort-test $n; done
> 
> In other words, almost the same as Qualys', but with prlimit omitted because
> the program itself now takes care of it.  With our current patched glibc
> in Rocky Linux SIG/Security, the output is like this:
> 
> 396121
> PASSED
> 77207
> PASSED
> 683895
> PASSED
> 1402983
> PASSED
> 
> and so on.  No crashes anymore.  Before the one-line patch, it would hit
> the test program's abort() within seconds, like Qualys had observed:
> 
> 153916
> PASSED
> 990497
> PASSED
> 1501673
> PASSED
> 1344354
> PASSED
> 176197
> PASSED
> 326004
> Aborted (core dumped)
> 1892398
> Aborted (core dumped)
> 834837
> PASSED
> 2066676
> PASSED
> 589237
> Aborted (core dumped)
> 
> As to the occasional segfaults when you do use prlimit, I also saw them
> on Rocky Linux 9.  They appeared to come from the kernel right after
> execve() fails and kind of returns control back to prlimit.  I think
> they're a symptom of execve() concluding it ran out of memory too late
> for it to allow the original program to continue running.  As I recall
> from patching this code in the kernel many years ago, such conditions
> did and probably still do exist.  That's kind of fine.
> 
> Alexander