musl - Re: Wrong info in libc comparison

Message-ID: <20170916140110.p4xiuzvsuarfcfk4@voyager>
Date: Sat, 16 Sep 2017 16:01:10 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: Wrong info in libc comparison

On Sat, Sep 16, 2017 at 11:37:53AM +0200, Szabolcs Nagy wrote:
> * Markus Wichmann <nullplan@....net> [2017-09-15 21:18:46 +0200]:
> > On Wed, Sep 13, 2017 at 03:53:06PM -0400, Rich Felker wrote:
> > > If you're considering big-O, where n->infinity (or at least to the
> > > largest value that can fit in memory), malloc most certainly has
> > > failed (because the array to be sorted already filled memory) and
> > > you're looking at the "fallback" case.
> > > 
> > 
> > I think we're getting sidetracked here. Every libc worth its salt uses a
> > loglinear sorting algorithm. Thus they are all equal in that regard.
> 
> that is not true at all.
> embedded libcs are often optimized for size, not worst case behaviour.
> note that worst-case behaviour is not just big-O..
> (e.g. glibc uses mergesort which uses malloc which means it's not as-safe,
> may introduce arbitrary latency since malloc can be interposed, concurrent
> mallocs can delay forward progress, large allocation may cause swapping,
> cancellation or longjmp out of the cmp callback can leak memory etc.)
> 

Did you even read what I wrote? Rich talked about big-O, i.e. complexity
theory, to which I remarked that most algorithms in use are loglinear
and thus equal _in_that_regard_.

And I wrote a bit later that the only exception to this that I know of
is uclibc, which uses Shell sort with Pratt's sequence. uclibc claimed
to be optimized for smaller systems and is thus exactly an example of
your second sentence here. And your third point is what I wrote just a
few lines further below, albeit with a different example.
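
For readers unfamiliar with it, here is a minimal sketch of Shell sort with
Pratt-style gaps, assuming "Pratt's sequence" means the 3-smooth numbers
(2^p * 3^q). This is not uclibc's code, which may well differ; the names
pratt_gaps and shell_sort are made up for illustration. With these gaps the
worst case is on the order of n log^2 n compares, so it is close to, but not
quite, loglinear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: collect every 2^p * 3^q below n, in ascending order.
 * Returns the number of gaps stored. Assumes n is far below SIZE_MAX / 3,
 * so the multiplications cannot overflow. */
static size_t pratt_gaps(size_t n, size_t *gaps, size_t cap)
{
    size_t cnt = 0;

    for (size_t p2 = 1; p2 < n; p2 *= 2)
        for (size_t g = p2; g < n && cnt < cap; g *= 3)
            gaps[cnt++] = g;

    /* The list is tiny, so a plain insertion sort is enough to order it. */
    for (size_t i = 1; i < cnt; i++) {
        size_t g = gaps[i], j = i;
        while (j > 0 && gaps[j - 1] > g) {
            gaps[j] = gaps[j - 1];
            j--;
        }
        gaps[j] = g;
    }
    return cnt;
}

/* Illustrative Shell sort on ints: one gapped insertion sort per gap,
 * from the largest gap down to 1. */
static void shell_sort(int *v, size_t n)
{
    size_t gaps[512];
    size_t cnt = pratt_gaps(n, gaps, 512);

    /* The final pass with gap 1 is what guarantees a sorted result. */
    assert(n < 2 || (cnt > 0 && gaps[0] == 1));

    while (cnt-- > 0) {
        size_t gap = gaps[cnt];
        for (size_t i = gap; i < n; i++) {
            int x = v[i];
            size_t j = i;
            while (j >= gap && v[j - gap] > x) {
                v[j] = v[j - gap];
                j -= gap;
            }
            v[j] = x;
        }
    }
}
```

Note that this needs no malloc and no recursion, which is presumably part of
the appeal on small systems.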

BTW, in addition to the libcs presented on the libc comparison page, I
had a look at newlib and avr-libc, and they both feature quicksort. At
least for avr-libc I can't figure out why they did that. Maybe habit.

> > > Maybe the comparison of sort algorithm used is interesting for reasons
> > > other than just big-O though, in which case mentioning the "merge
> > > (when it fits in memory)" would probably be helpful.
> > > 
> > > Rich
> > 
> > Algorithms can be compared on a number of metrics, and just the name
> > doesn't tell us much (e.g. quicksort with naive "first element" pivot
> > selection has a pathological case on sorted input, while quicksort with
> > med3 pivot selection handles that very well). If you really want to know
> > something specific, you'll have to look it up in source, anyway.
> 
> "mergesort+quicksort" sounds good to me,
> it tells enough about what's going on, if there is some
> known implementation mistake that can be added to the
> description (like "naive" quicksort for dietlibc implying
> O(n^2) worst case compares and potentially large stack use)

Agreed. As I said, if you want to know specifics, looking up keywords is
not the way to go, anyway, and since all of these libcs are open source,
someone wanting to know more will have no excuse for not looking up what they
want to know in source. It is in the end the only way to be sure.
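
To make the pivot-selection point from the quoted text concrete, here is a
minimal sketch that counts compares for a quicksort on already sorted input,
once with the naive "first element" pivot and once with med3. It is not taken
from any of the libcs discussed; all names (qs, med3, cmps_on_sorted) are
hypothetical, and it sorts plain ints rather than going through a qsort-style
callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static unsigned long ncmp;               /* number of compares so far */

static int less(int a, int b) { ncmp++; return a < b; }

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Index of the median of v[a], v[b], v[c]. */
static size_t med3(const int *v, size_t a, size_t b, size_t c)
{
    if (less(v[a], v[b])) {
        if (less(v[b], v[c])) return b;
        return less(v[a], v[c]) ? c : a;
    }
    if (less(v[a], v[c])) return a;
    return less(v[b], v[c]) ? c : b;
}

/* Quicksort with Lomuto partitioning. Recurses only into the smaller
 * part and loops on the larger one, so stack depth stays logarithmic
 * even when the pivot choice is bad. */
static void qs(int *v, size_t n, int use_med3)
{
    while (n > 1) {
        size_t p = use_med3 ? med3(v, 0, n / 2, n - 1) : 0;
        size_t store = 0;

        swap(&v[p], &v[n - 1]);          /* park the pivot at the end */
        for (size_t i = 0; i + 1 < n; i++)
            if (less(v[i], v[n - 1]))
                swap(&v[i], &v[store++]);
        swap(&v[store], &v[n - 1]);      /* pivot to its final place */

        size_t left = store, right = n - store - 1;
        if (left < right) {
            qs(v, left, use_med3);
            v += store + 1;
            n = right;
        } else {
            qs(v + store + 1, right, use_med3);
            n = left;
        }
    }
}

/* Sort 0, 1, ..., n-1 (already in order) and return the compare count. */
static unsigned long cmps_on_sorted(size_t n, int use_med3)
{
    int *v = malloc(n * sizeof *v);
    assert(v != NULL);
    for (size_t i = 0; i < n; i++) v[i] = (int)i;
    ncmp = 0;
    qs(v, n, use_med3);
    free(v);
    return ncmp;
}
```

With this sketch, cmps_on_sorted(1000, 0) comes out at n(n-1)/2 = 499500
compares, because every partition step peels off a single element, while
cmps_on_sorted(1000, 1) stays on the order of n log n. Same name on the
comparison page, very different behaviour.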

Ciao,
Markus