musl - Re: Use of size_t and ssize

Message-ID: <20130704063740.GL29800@brightrain.aerifal.cx>
Date: Thu, 4 Jul 2013 02:37:40 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Use of size_t and ssize_t in mseek

On Thu, Jul 04, 2013 at 08:11:58AM +0200, Jens Gustedt wrote:
> Hello Rich,
> 
> On Wednesday, 03.07.2013, at 21:28 -0400, Rich Felker wrote:
> > The requirements for printf_s, scanf_s, and related functions look
> > quite invasive and would affect programs not using these interfaces.
> 
> unless one would finally implement them separately, of course

Yes, but that's a huge maintenance burden (duplicate functionality),
and while it's less bloat for individual static apps that don't use
Annex K, it's much more bloat for libc.a and libc.so.

> > Otherwise, the Annex K interfaces look like a considerable amount of
> > bloat with highly questionable usefulness, but mostly non-invasive. My
> > feeling is that we should hold off on a decision about them to see if
> > any applications actually start using them.
> 
> If just some if conditionals are bloat for you, yes.
> These conditionals could easily be tagged as likely/unlikely to
> privilege the fastpath.

No, I mean just the sheer volume of interfaces to add.

> > > Then some interfaces are clearly different such that they can't simply
> > > be copied over, notably bsearch and qsort functions, since they
> > > receive additional arguments to provide context to the object
> > > comparison.
> > 
> > These are much easier; the extra argument can be passed via TLS. It's
> > printf_s and scanf_s that are hard.
> 
> Hm, I don't see how this can be done "easily", and in particular such
> that there is no performance loss for qsort. I think for these
> functions performance is important in any type of platform.

qsort_s can store the comparison function and context in TLS, and then
pass to qsort a comparison function that grabs these from TLS and
calls the original comparison function with the context pointer. This
is valid assuming qsort does not run the comparisons in new threads.
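
A minimal sketch of that idea follows. It is not musl code: the names
sort_with_context, tls_cmp, tls_ctx, tls_trampoline and cmp_int_dir are
made up for illustration, it assumes a C11 compiler with _Thread_local,
and it omits the runtime-constraint checks a real qsort_s would need.
The save/restore around the qsort call is an extra assumption, added so
that a comparison function which itself sorts something does not
clobber the outer call's state.

```c
#include <stdlib.h>

/* Comparison function type that takes an extra context pointer
 * (hypothetical name, mirroring the shape qsort_s expects). */
typedef int (*cmp_ctx_fn)(const void *, const void *, void *);

/* Per-thread slots holding the caller's comparator and context. */
static _Thread_local cmp_ctx_fn tls_cmp;
static _Thread_local void *tls_ctx;

/* Plain qsort-style comparator: fetches the real comparator and its
 * context from TLS and forwards the call. */
static int tls_trampoline(const void *a, const void *b)
{
    return tls_cmp(a, b, tls_ctx);
}

/* Context-passing sort built on top of plain qsort. Only valid if
 * qsort runs every comparison on the calling thread. */
void sort_with_context(void *base, size_t n, size_t width,
                       cmp_ctx_fn cmp, void *ctx)
{
    /* Save the previous values so nested use keeps working. */
    cmp_ctx_fn saved_cmp = tls_cmp;
    void *saved_ctx = tls_ctx;

    tls_cmp = cmp;
    tls_ctx = ctx;
    qsort(base, n, width, tls_trampoline);

    tls_cmp = saved_cmp;
    tls_ctx = saved_ctx;
}

/* Example comparator: ctx points to an int that is +1 for ascending
 * order or -1 for descending order. */
static int cmp_int_dir(const void *a, const void *b, void *ctx)
{
    int x = *(const int *)a, y = *(const int *)b;
    int dir = *(const int *)ctx;
    return dir * ((x > y) - (x < y));
}
```

Calling sort_with_context(v, n, sizeof v[0], cmp_int_dir, &dir) with
dir set to -1 would sort an int array in descending order. The extra
cost per comparison is one TLS load pair and one indirect call.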

> > > IIRC, what I couldn't handle within P99 was checking of printf
> > > arguments, but from within musl this should be relatively
> > > straightforward.
> > 
> > Not really. There would need to be a way to convey to the printf core
> > that it's supposed to do this extra checking, and a way to make it
> > call the constraint handlers.
> 
> This you could e.g. easily do with TLS :) I'd think that for printf and
> friends this would be much less critical than for the sort
> functions. To my understanding printf functions are IO bound (or
> memory bound for sprintf), so just some switching on entry on some TLS
> wouldn't be much of an overhead, I think.

TLS is not guaranteed to exist when these functions are called;
programs not using any multi-threaded functionality are supposed to
"basically work" on Linux 2.4. I don't mind having the Annex K
functions depend on TLS, since only new programs will use them anyway,
but I don't want to break existing programs.

For fprintf_s and fscanf_s, it would be possible to instead pass
the special mode info in the FILE structure. However, this requires
re-implementing snprintf_s and sscanf_s on top of fprintf_s and
fscanf_s (i.e. duplicating the fake FILE setup), rather than just
implementing them on top of snprintf and sscanf. (v's omitted for
clarity, but obviously we're really talking about the v versions.)

> > P.S. One other reason I hate Annex K is that the constraint handler
> > design is non-thread-safe and non-library-safe.
> 
> that is certainly a good point
> 
> > There's only one
> > global constraint handler, shared by all threads and by all
> > libraries/modules that might be using Annex K functions. That means
> > there's really no valid way to write code that depends on a particular
> > constraint handler being installed.
> 
> This is just meant to be like this. These interfaces are meant to give
> a means to abort more or less gracefully if constraint violations as
> described in that Annex occur. They are not meant to support
> complicated games that let you "repair" faulty environments and
> continue execution.

What I was saying is that, in library code, you can't rely on this.
The application may have installed a handler that causes the functions
to just return an error, or the default implementation-defined handler
might do so.

> > And the default handler is
> > implementation-defined, so it wouldn't even be reasonable to say
> > "leave the default handler there". The only thing reasonable code
> > using these interfaces can expect when a constraint is violated is
> > implementation-defined behavior, which is only a tiny step up from
> > undefined behavior...
> 
> You are too much a library implementor :) I think it is easy for an
> application to install a different constraint handler (a standard one
> or of its own) during startup in its main, before creating any other
> thread. I see that as the principal use pattern for this, just straight
> and simple.
> 
> In particular no library should expect any particular constraint
> handler to be in place. It is up to the application to determine what
> is to be done if a constraint occurs.

Yes, I agree with your analysis here.

> > My feeling is that we should hold off on a decision about them to
> > see if any applications actually start using them.
> 
> I think we have a chicken-and-egg problem here. Nobody will use them if
> nobody provides an implementation.

You presume we would want people to use them. :) I don't. I think
they're very poorly designed interfaces that were crammed into the
standards process by their sponsor's clout rather than any technical
merit of existing practice. _FORTIFY_SOURCE solves pretty much the
same problems these functions were intended to solve, but does a much
better job since it doesn't rely on the application developer to
provide truthful information about object sizes, and instead gets the
compiler to do it.

Rich