Date: Mon, 10 Aug 2020 12:53:19 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: strftime %Z behavior with manually populated struct tm

On Mon, Aug 10, 2020 at 03:31:23PM +0200, Nikita Popov wrote:
> Hi,
> 
> Currently, strftime() %Z will print an empty string if the provided tm_zone
> does not originate from musl. It appears that this behavior is explicitly
> implemented in __tm_to_tzname(). Would it be possible to instead print any
> non-NULL tm_zone as provided?

I don't think so. The problem is that %Z is specified, albeit
underspecified, as something portable applications can pass to
strftime, but tm_zone is not standard and may be uninitialized in a
structure the caller passes. Attempting to honor it would mean
crashing on inputs where it's not present, which is nonconforming and
unacceptable.

I haven't checked, but I believe most implementations just print the
zone name from the current timezone, using tm_isdst to decide whether
to print the standard or daylight version of the name. This is
insufficient with zoneinfo for zones where the name changed over time,
where it would print the wrong name for historical times. So instead
we support printing any one of the zone names from the current zone,
if the tm_zone member points to one of them, and blank otherwise.
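
As a rough sketch of that rule (hypothetical code, not the actual musl
source; the function name and the "known" array are made up for
illustration), the check compares pointer values only and never
dereferences the caller's pointer:

```c
#include <stddef.h>

/* Hypothetical sketch, not musl's implementation: accept a zone-name
 * pointer only if it is identical to one of the name strings the
 * library itself owns for the current zone; otherwise print blank. */
static const char *zone_name_or_blank(const char *candidate,
                                      const char *const *known,
                                      size_t nknown)
{
    for (size_t i = 0; i < nknown; i++) {
        /* Pointer identity only: candidate is never dereferenced, so a
         * junk pointer cannot cause a crash here. */
        if (candidate == known[i])
            return candidate;
    }
    return "";
}
```

A pointer to a caller-owned copy of the same text fails the identity
test, which is why a manually populated tm_zone comes out blank.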

> For consumers manually populating struct tm, it is non-trivial to
> work around the current behavior. Python introduced a workaround in
> https://github.com/python/cpython/commit/163eca34c48f1b25e1504e37f4656773fd0fdc78,
> but it is not easy to generalize.

I think you have the wrong commit link; that one does not look
related. If you have the real one I'd be interested in seeing what
they did.

> From what I gathered, the original concern here was that a consumer
> manually initializing struct tm may not initialize the tm_zone field. As
> struct tm is only specified to contain "at least" certain members, manual
> initialization of this structure is already on shaky ground anyway and I

This is not the case. There is nothing shaky about using the interface
in the standard-specified way. The specification already labels which
struct tm fields each format uses, and it's perfectly valid to call
strftime with junk in the rest; this is even fairly common usage (e.g.
not filling in the derived fields that you'd need mktime to get).

On the other hand, it's already dubious doing anything more than
ignoring tm_zone and just using tm_isdst (which is the field the spec
labels %Z as depending on). I only did this because it's impossible to
give culturally-correct results in all zones with just tm_isdst and
because the degree of underspecification made it seem defensible.

> think it is more useful to assume the initialization is correct than to
> try to deal with garbage data. Because other libc implementations do not
> try to validate the pointer either (beyond being non-NULL), incorrect
> initialization is unlikely in practice.

I think if applications want to use zones other than the actual
configured zone with strftime, they need to just do something like
expand the %Z themselves with the string they want before calling
strftime (note: this requires quoting any % in the name). I looked
hard for a better solution that wouldn't crash valid applications, and
couldn't find one at the time.
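
A minimal sketch of that approach, assuming the caller already has the
zone name it wants as a string. The names expand_zone and strftime_zone
are hypothetical helpers, not part of any libc, and the sketch does not
try to handle conversion modifiers or flags:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy fmt into out, replacing each %Z with zone.
 * Any '%' inside zone is doubled so strftime prints it literally.
 * Returns 0 on success, -1 if out is too small. */
static int expand_zone(char *out, size_t outsz, const char *fmt,
                       const char *zone)
{
    size_t n = 0;
    assert(out && fmt && zone);
    for (const char *p = fmt; *p; p++) {
        if (p[0] == '%' && p[1] == 'Z') {
            for (const char *z = zone; *z; z++) {
                if (*z == '%') {            /* quote '%' in the name */
                    if (n + 1 >= outsz) return -1;
                    out[n++] = '%';
                }
                if (n + 1 >= outsz) return -1;
                out[n++] = *z;
            }
            p++;                            /* skip the 'Z' */
        } else if (p[0] == '%' && p[1] != '\0') {
            /* Copy other two-character sequences as a unit, so that
             * "%%Z" remains a literal "%Z". */
            if (n + 2 >= outsz) return -1;
            out[n++] = *p++;
            out[n++] = *p;
        } else {
            if (n + 1 >= outsz) return -1;
            out[n++] = *p;
        }
    }
    if (n >= outsz) return -1;
    out[n] = '\0';
    return 0;
}

/* Hypothetical wrapper: format tm with the caller's zone name in place
 * of %Z. Returns 0 if the expanded format does not fit. */
static size_t strftime_zone(char *buf, size_t bufsz, const char *fmt,
                            const struct tm *tm, const char *zone)
{
    char expanded[256];
    if (expand_zone(expanded, sizeof expanded, fmt, zone) != 0)
        return 0;
    return strftime(buf, bufsz, expanded, tm);
}
```

With this, strftime never sees %Z at all, so the result does not depend
on how the libc treats tm_zone.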

There is a proposal for a new C time API that allows zone objects
(like locale_t objects for locales, but hopefully better-designed) and
functions that take the zone as an argument. This would be a really
good solution once it's established, but I don't know its current
status.

Rich
