Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Sun, 05 Sep 2021 15:36:12 +0200
From: Sören Tempel <>
Subject: tzset() cannot handle arbitrary inputs


This is a follow-up to a discussion on the IRC channel regarding musl's
tzset() implementation. I did some tests of the tzset() code. The code
basically provides parsers for two kinds of input formats:

	1. TZ env values (e.g. in the TZ POSIX format [1])
	2. Zoneinfo files as provided by the IANA [2]

As part of my performed tests I found various spatial memory safety
violations in these two parsers. Both of them parse input through a
pointer which is continuously advanced using pointer arithmetic and then
dereferenced to access individual fields of the input formats.
Unfortunately, the parsing code is largely lacking boundary checks to
ensure that the pointer is still in bounds when dereferenced.

As an example, consider the attached zoneinfo file. This file will cause
musl to perform an out-of-bounds memory read which likely results in a
segmentation fault on most systems:

	$ TZ=./zonefile-musl-crash.bin busybox date
	Segmentation fault (core dumped)

This particular zoneinfo file causes a segmentation fault due to the
calculation of the types pointer:

	types = index + zi_read32(trans-12);
	(gdb) p zi_read32(trans-12)
	$1 = 4286962800

The resulting types pointer value is outside the bounds of the mapped
zoneinfo file (which is only 2048 bytes large). The value is also
dereferenced later as part of the following code:

	for (p=types; p<abbrevs; p+=6) {
		if (!p[4] && !__tzname[0]) {
			// …
		// …

Accessing p[4] will likely cause a segmentation fault. The problem here
is simply that the entire parsing code for the zonefile does not perform
any boundary checks at all. The TZ parsing code has similar issues. For
example, consider a TZ value such as `TZ=FOO-,M`. The musl
implementation interprets this as a TZ value in POSIX format. As such,
it expects 3 integers after the 'M' character (Mm.n.d). However, the
getrule code does not check whether any characters are actually present
after the 'M' character:

	if (r=='M') {
		++*p; rule[1] = getint(p);
		++*p; rule[2] = getint(p);
		++*p; rule[3] = getint(p);

As such, this code reads beyond the bounds of the TZ environment
variable. This will likely not cause a crash but still illustrates the
lack of bounds checks in the parser. These are just two examples, there
are actually more spatial memory safety violations in that code. I have
not checked whether any of these spatial memory safety violations could
be exploited.

From the aforementioned discussion on IRC, the underlying assumption
seems to be that both the TZ value as well as any parsed zonefiles are
always well-formated. Keep in mind that paths to arbitrary zonefiles can
be passed using the `TZ=:${file}` syntax. For this reason, I don't think
that this a good assumption to make. For example, calendar application
such as calcurse [3] set $TZ to values extracted from .ical files which
may be attacker-controlled [4]. Of course it would be desirable for
applications to validate this data but they don't always do so. Even
if they do, the validation code may be buggy and not be 100% aligned with
the inherent assumptions musl's tzset() implementation makes about the
format of these values. As such, I believe that musl should never
perform, potentially exploitable, spatial memory safety violations on
arbitrary TZ and zoneinfo file input values.



Download attachment "zonefile-musl-crash.bin" of type "application/octet-stream" (2048 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.