musl - Re: Weekly reports

Message-ID: <20110613025655.GA21653@openwall.com>
Date: Mon, 13 Jun 2011 06:56:55 +0400
From: Solar Designer <solar@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: Weekly reports - B

On Sun, Jun 12, 2011 at 10:22:21PM -0400, Rich Felker wrote:
> On Mon, Jun 13, 2011 at 06:11:30AM +0400, Solar Designer wrote:
> > Sorry to remind you, but we need Luka's code placed under an Open Source
> > license - and not only when cluts is "finished".  Each week's update
> > must be properly licensed.  Can one or both of you please propose a
> > license you're comfortable with?
> 
> Let's make it (new) BSD. Is that okay?

Sounds fine.  Thanks.

(I had mentioned my very slightly different preference before.  I won't
repeat that.)

> > Some assorted comments on the code, in arbitrary order:
> > 
> > For jumping out of a signal handler, you need to use sigjmp_buf,
> > sigsetjmp(), and siglongjmp().
> 
> This only matters if you want the signal mask to be restored, which we
> DO want,

Right, that's what I meant.

> but another way to achieve the same thing would be to install
> the signal handler with SA_NOMASK so the SIGSEGV never gets masked to
> begin with (another SIGSEGV should not happen inside the signal
> handler, and if it did while it was blocked, we'd be screwed anyway).
> 
> BTW another way to restore the signal mask, especially if you want it
> to be restored to the mask at the time the signal was generated rather
> than at the time the jump buffer was created, is to use the SA_SIGINFO
> signal handler form and read the saved sigset_t from the ucontext_t
> argument and restore it yourself with sigprocmask. :-)

Thank you for mentioning these.

For cluts, I think it'd be best not to go for them, though - they might
make cluts unnecessarily less portable.

> > Even so, some failed libc functions
> > might leave stdio (or something else) in an inconsistent state.  This is
> > probably irrelevant to simple string functions testing, but it will be
> > relevant to some other tests.  Thus, since we don't expect SIGSEGVs to
> > be frequent, maybe it'd be better to switch to forking child processes
> > (which must print something specific to fd 1 to indicate success)?
> > Or we can use both approaches - in different cases, as appropriate.
> 
> In the case of testing string functions, the test framework sets up a
> very narrow class of "likely causes" for the SIGSEGV, and unless the
> functions are hopelessly broken, we can assume any SIGSEGV was caused
> by the condition that was being tested for. Therefore, in this case I
> don't think we have to worry about corrupt state and such.

Right.  I primarily wanted to bring the issue up early on, because Luka
will need to arrive at an approach to use for cluts tests in general.
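
The child-process approach could look roughly like the sketch below.  It is
only an illustration under assumptions: the "OK\n" marker and the names
run_in_child, passing_test and crashing_test are hypothetical, and the
parent captures the child's fd 1 through a pipe.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define OK_TOKEN "OK\n"   /* hypothetical success marker */

/* Runs test_fn in a child process.  Returns 1 if the child exited with
 * status 0 and wrote exactly OK_TOKEN to fd 1, 0 if it did not, and -1
 * if the harness itself failed. */
int run_in_child(int (*test_fn)(void))
{
    int fds[2], status;
    char buf[sizeof OK_TOKEN] = "";
    size_t got = 0, want = strlen(OK_TOKEN);
    ssize_t n;
    pid_t pid;

    assert(test_fn != NULL);
    if (pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: make fd 1 the write end of the pipe. */
        close(fds[0]);
        if (fds[1] != 1) {
            if (dup2(fds[1], 1) < 0)
                _exit(2);
            close(fds[1]);
        }
        if (test_fn() == 0 && write(1, OK_TOKEN, want) == (ssize_t)want)
            _exit(0);
        _exit(1);
    }

    /* Parent: collect what the child printed, then reap it. */
    close(fds[1]);
    while (got < want && (n = read(fds[0], buf + got, want - got)) > 0)
        got += (size_t)n;
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
           got == want && memcmp(buf, OK_TOKEN, want) == 0;
}

/* Stand-ins for real tests. */
int passing_test(void)  { return 0; }
int crashing_test(void) { raise(SIGSEGV); return 1; }
```

A crash in the child cannot leave the parent's stdio or other libc state
inconsistent, at the cost of one fork() per test.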

> Note that
> POSIX does not require string functions to be async-signal-safe, for
> some odd reason, but as far as I know all real-world implementations
> including glibc guarantee that they are (I found a discussion of glibc
> strstr optimization where use of malloc was rejected because it would
> violate their requirement that they want it to be async-signal-safe).
> Thus they should not have any internal state that could get corrupted.

That's curious.  Thanks for sharing.

> > What do you mean by "#define _XOPEN_SOURCE 9001"?  I think the highest
> > value currently defined is 700, and going too high may actually prevent
> > this from working (e.g., on Solaris).
> 
> I noticed this too. Also you're defining it after including headers,
> which has no effect other than invoking UB. To use feature test macros,
> they must be defined before any system headers are included.

Sure.  I overlooked that detail in Luka's source.
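
For reference, a minimal sketch of the intended ordering, assuming the code
targets the 700 level (POSIX.1-2008 with XSI).  The function bounded_len is
hypothetical and is there only because strnlen() is one of the interfaces
that level makes visible.

```c
/* The feature test macro comes first, before any system header. */
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of s, capped at max. */
size_t bounded_len(const char *s, size_t max)
{
    assert(s != NULL);
    return strnlen(s, max);
}
```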

> > Please avoid assignments to errno.  Use your own variable instead.
> 
> Is this just a stylistic preference, or do you have a reason it could
> be problematic?

Mostly a stylistic preference.  errno is defined to have a specific kind
of values assigned to it, by libc.  Reusing a variable for something
different than its original purpose makes the source code more difficult
to understand (even if very slightly).

As to actual potential issues:

IIRC, some ancient versions of glibc didn't allow programs to assign to
errno at all, because it was declared as a macro (not a variable).  That
was broken because there are in fact some cases when you need to zero
out errno before making certain libc calls to be able to distinguish
their different errors.  For example, with strtol().  Also, there are
cases when you need to save/restore errno (such as in a signal handler).
(Hmm, maybe I recall only part of the story, and there was something
enabling those things to work...)
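
As an illustration of the strtol() case, here is a small sketch that zeroes
errno right before the call, reports the outcome through its own return
value, and restores the caller's errno afterwards.  The name parse_long and
its return convention are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper.  Parses a base-10 long from s into *out.
 * Returns 0 on success and -1 on failure; errno is left as it was. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;
    int saved = errno;          /* save the caller's errno */
    int failed;

    assert(s != NULL && out != NULL);
    errno = 0;                  /* so a range error can be told apart */
    v = strtol(s, &end, 10);
    /* Fail on a reported error, on no digits, or on trailing junk. */
    failed = (errno != 0 || end == s || *end != '\0');
    errno = saved;              /* restore the caller's errno */

    if (failed)
        return -1;
    *out = v;
    return 0;
}
```

Without the errno = 0 step, a return value of LONG_MAX would be ambiguous:
it could be a valid parse or an overflow.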

Is it guaranteed that errno is preserved across libc calls that complete
without error?  Maybe not.  I don't really know, and I'd prefer not to
depend on that.

Generally, I think it is appropriate to limit uses of errno to checking
it immediately after a failed libc call, to zeroing it right before
certain libc calls, and to saving/restoring its value.  Oh, there's one
more: in portability functions, it is OK to set errno to specific EXXX
values as appropriate to implement whatever function is being replaced.

Just an opinion.

Alexander