tz.cpp should parse the compiled form of the TZDB, not the source #1

Closed
thiagomacieira opened this issue Jul 23, 2015 · 45 comments

@thiagomacieira

I know you're writing only as an example, but given your notoriety, your source code might actually get used elsewhere.

Please parse the compiled form of the database, after zic is done with it. In doing that, you should also set the default DB path to "/usr/share/zoneinfo", which should work just about anywhere that has a /usr directory.

This should also eliminate the need to set a minimum year to be kept in memory, as the database is already pre-compiled.

@HowardHinnant
Owner

What advantage would this approach have?

One disadvantage I see is that it makes the install process more complicated. Now the end user has to also download, build and run zic.

@thiagomacieira
Author

There are two basic advantages:

  • Start-up is faster, since there is much less parsing.
  • Everywhere except Windows, the database is already present, so the user doesn't have to download anything or run zic. Plus, they get an updated database for free, since the system already updates it.

@thiagomacieira
Author

For Windows, don't parse the IANA TZDB. Windows has its own database in the registry. A serious application would open "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones". See http://msdn.microsoft.com/en-gb/library/windows/desktop/ms724253%28v=vs.85%29.aspx

@HowardHinnant
Owner

It might be interesting to have a way to do both a direct parse, and a zic parse. That way a client doesn't have to be dependent on his OS updating the database in a timely fashion. On startup times, I'm sure you're correct. The current optimized (-O3) startup time is about 300ms on my hardware. So it isn't disastrous. But it is significant.

I am interested in porting this code to Windows. Perhaps there too, dual parsers might be interesting.
But I'll be dependent on others for porting, as I only have OS X set up as a development environment.

I'm open to this being a collaborative project (which is one of the reasons I set it up on github). Would you be interested in developing the zic parser? At this time I'm not sure how much of the existing infrastructure could be reused. Definitely everything in date.h. But I'm much less sure about tz.h and tz_private.h.

@thiagomacieira
Author

Thanks for the offer, but I have a lot on my plate right now. Adding more responsibilities is not something I can do...

@pmachata

+1. I maintained zoneinfo for Red Hat for several years, and it's incredible how many places keep separate copies of this data. E.g. Java has one (in either of two versions), PHP I think has one, then there was pytz for Python, which shipped one, and the corresponding thing for Ruby... We patched them wherever we could to just use system data. tzdata is very volatile and changes about monthly (mostly around spring and autumn, when governments feel the need to meddle with daylight saving, but zone boundaries also change more often than you would expect).

I don't know about the BSDs etc., but on Linux this is a core part of the OS; even glibc depends on tzdata. So I would rather bet on the timeliness of OS updates than on the timeliness of updates to any bundled data. Few projects have the cycles to issue 10-ish updates a year for something this obscure, let alone timely ones. It's fairly common that a government suddenly wakes up and realizes, boy oh boy, we have a DST transition in a week, didn't we want to postpone it? Good luck keeping up with this. As I wrote elsewhere, please please pretty please, just use the compiled system data whenever possible.

The binary format is actually easy to parse. I think it's all documented in man tzfile(5). I wrote a parser some time ago for the tzdiff project, see here: https://git.fedorahosted.org/cgit/tzdiff.git/tree/olson.cc .
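To give a feel for how little parsing is involved, here is a minimal sketch of reading the fixed 44-byte header described in tzfile(5). It is not taken from tzdiff or from tz.cpp; `TzifHeader`, `read_be32` and `parse_tzif_header` are hypothetical names, and the caller is assumed to have already read the file contents into a `std::string` (for example with a binary `std::ifstream`).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Counts from the 44-byte TZif header, named after the tzh_* fields in
// tzfile(5). The struct and function names are hypothetical.
struct TzifHeader {
    char version;              // '\0' for version 1, otherwise '2', '3', ...
    std::uint32_t ttisutcnt;   // number of UT/local indicators
    std::uint32_t ttisstdcnt;  // number of standard/wall indicators
    std::uint32_t leapcnt;     // number of leap-second records
    std::uint32_t timecnt;     // number of transition times
    std::uint32_t typecnt;     // number of local time types
    std::uint32_t charcnt;     // bytes of abbreviation strings
};

// Decode a 4-byte big-endian unsigned integer starting at data[pos].
inline std::uint32_t read_be32(const std::string& data, std::size_t pos) {
    return (std::uint32_t(static_cast<unsigned char>(data[pos])) << 24) |
           (std::uint32_t(static_cast<unsigned char>(data[pos + 1])) << 16) |
           (std::uint32_t(static_cast<unsigned char>(data[pos + 2])) << 8) |
           std::uint32_t(static_cast<unsigned char>(data[pos + 3]));
}

// Layout: "TZif" magic (4 bytes), version (1), reserved (15), six counts (24).
inline TzifHeader parse_tzif_header(const std::string& data) {
    if (data.size() < 44 || data.compare(0, 4, "TZif") != 0)
        throw std::runtime_error("not a TZif file");
    TzifHeader h;
    h.version    = data[4];
    h.ttisutcnt  = read_be32(data, 20);
    h.ttisstdcnt = read_be32(data, 24);
    h.leapcnt    = read_be32(data, 28);
    h.timecnt    = read_be32(data, 32);
    h.typecnt    = read_be32(data, 36);
    h.charcnt    = read_be32(data, 40);
    return h;
}
```

The counts then tell a reader how many bytes each of the following tables occupies; a `leapcnt` of zero corresponds to a file compiled without leap-second records.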

Having said all this, I understand your position. I don't have spare cycles to write this either unfortunately.

EDIT: A typo.

@kkofler

kkofler commented Sep 27, 2015

I can only heavily second @thiagomacieira and @pmachata. The source code of the tzdata is NOT installed by default on GNU/Linux, whereas (as pointed out by @thiagomacieira) the binary is, and @pmachata already explained why bundling the data with your library is a horrible idea. You really have to parse the system tzdata database, which is in the compiled format, there's no way around it.

@chmike

chmike commented Sep 28, 2015

I'm currently trying to decode the TZFiles provided on my Ubuntu computer.
The first thing I noticed is that it doesn't contain the leap seconds. The second thing I noticed is that it precomputes the dates when daylight saving time starts or ends. While this is convenient to use, it stops at 2017. So an update of the tzdata files is required to keep the compiled tzfiles up to date.

I'm thus not convinced that using the compiled tzfiles is a good idea. On an OS without updates these files will rapidly become outdated (i.e. Windows).

@HowardHinnant
Owner

If this feature gets coded up, it will be an alternative to, not a replacement for, the way things are done today. There will be a config flag to choose where you get your tzdb from.

@thiagomacieira
Author

@chmike: leap seconds are not part of Unix time. As for the date rules, the last rules are supposed to repeat ad infinitum. You're probably misinterpreting the files.

If the rules exist in binary and source form, I don't see why one would waste CPU cycles in parsing text form if the binary form exists. If the OS doesn't have the rules in any form at all (e.g., Windows), you should read from the timezone DB it has, instead of shipping your non-updated rules. If the OS has the binary form and doesn't update them, the OS isn't worth using.

And I don't know of any OS that ships source form only.

Just don't bundle the rules.

@HowardHinnant: your presentation in CppCon has done some harm. Now people are thinking your code is suitable for deployment in production, replacing all other solutions. It isn't. It needs to parse binary and it needs to parse the Windows timezone DB too (see http://msdn.microsoft.com/en-gb/library/windows/desktop/ms725481%28v=vs.85%29.aspx and http://msdn.microsoft.com/en-gb/library/windows/desktop/ms724253%28v=vs.85%29.aspx)

@HowardHinnant
Owner

@thiagomacieira: Well you better get to work and fix this then! Please don't dally. ;-)

  1. This library does not bundle the IANA database with it. The documentation at http://howardhinnant.github.io/tz.html clearly says so. The lack of the database in the github repository should also be a telling clue.
  2. The documentation also clearly states that there are two libraries here: date.h and tz.h. My presentation at Cppcon also reinforced that design, and concentrated on date.h. I mentioned tz.h just enough so that people knew it existed. I simply did not have time to do anything more than that.

And I leave it to each individual to make the decision as to whether these libraries meet their needs or not. If you are not comfortable using either of these libraries, then by all means, please don't.

@thiagomacieira
Author

As I said before, the problem is that you're too well known. People assume that just because it came from you, it will suffice.

This came around again because a colleague saw your presentation at CppCon and posted to the Qt development mailing list saying that we should use it. I replied that there was no need, as our deployed solution in QTimeZone is superior: it reads the Windows DB and the compiled IANA DB, and can also read from ICU's DB via its API.

Since my needs are already met, I don't plan on contributing here.

But I'm glad you're keeping this open for contribution by someone else who may have similar needs and cannot use Qt.

@chmike

chmike commented Sep 28, 2015

The compiled tzfile contains precomputed time values relative to 1970-01-01T00:00:00 UTC, each with its associated time offset and an isdst flag set to 1 if the offset includes daylight saving time. It is the info obtained with tzset(). There is no general rule like in the source files.

Knowledge of leap seconds is needed to correctly interpret a UTC Gregorian date. You get a time of the form 23:59:60 at a leap second. Maybe the leap seconds are stored in TZif2. I haven't checked yet. In my file, the table of leap seconds in TZif is empty.

The advantage of tzfile is that it is straightforward to use once unpacked. It is stored in big endian, etc.
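As an illustration of "straightforward to use once unpacked", here is a minimal sketch of looking up the offset in effect at a given Unix time with a binary search over the transition table. The `Transition` struct and `offset_at` are hypothetical; the real file stores an index into a separate table of local time types, which this sketch assumes has already been flattened into each entry.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One unpacked transition: from Unix time `at` onward, `utoff` seconds
// east of UT applies. Hypothetical, flattened representation.
struct Transition {
    std::int64_t at;
    std::int32_t utoff;
    bool isdst;
};

// Returns the UT offset in effect at Unix time t. `transitions` must be
// sorted by `at`; `initial_utoff` applies before the first transition.
inline std::int32_t offset_at(const std::vector<Transition>& transitions,
                              std::int32_t initial_utoff, std::int64_t t) {
    // First transition that starts strictly after t.
    auto it = std::upper_bound(
        transitions.begin(), transitions.end(), t,
        [](std::int64_t value, const Transition& tr) { return value < tr.at; });
    if (it == transitions.begin()) return initial_utoff;
    return (it - 1)->utoff;
}
```

Times past the last entry simply keep the last offset in this sketch; it does not evaluate any rule that continues beyond the table.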

Best regards,
Ch.Meessen


@thiagomacieira
Author

> The compiled tzfile contains precomputed time values relative to 1970-01-01T00:00:00 UTC

Please read "UTC" with a grain of salt here. It's not the UTC you're thinking of that got adjusted by leap seconds. It's the current UTC extended backwards in time as if leap seconds hadn't occurred. See https://en.wikipedia.org/wiki/Unix_time#Leap_seconds.
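A small sketch may make the point concrete: Unix time treats every day as exactly 86400 seconds, so a leap second has no value of its own. The helper names `days_from_civil` and `unix_time` are chosen for this illustration, and the calendar arithmetic is a commonly used proleptic Gregorian day-count formula rather than anything from this thread.

```cpp
#include <cstdint>

// Days between 1970-01-01 and the given proleptic Gregorian date.
inline std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d) {
    y -= m <= 2;  // treat January and February as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);             // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Unix time for a broken-down UTC time: every day counts as 86400 seconds,
// so leap seconds are never accumulated.
inline std::int64_t unix_time(std::int64_t y, unsigned m, unsigned d,
                              unsigned hh, unsigned mm, unsigned ss) {
    return days_from_civil(y, m, d) * 86400 + hh * 3600 + mm * 60 + ss;
}
```

With this formula, `unix_time(2016, 12, 31, 23, 59, 60)` (a real leap second) and `unix_time(2017, 1, 1, 0, 0, 0)` produce the same number, which is exactly the sense in which leap seconds are not part of Unix time.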

@pmachata

@chmike Compiled TZ files may or may not contain leap seconds, depending on how the system is set up. The standard upstream way is to compile the leap-second-aware zoneinfo files to the right/ subtree of the distribution and the leap-second-unaware ones to the posix/ subtree, and have the ./ subtree contain hardlinks to either of these subtrees. Red Hat had ./* hardlink the posix/ (sans leap seconds) subtree, and I suspect others did that as well, because on Linux, NTP typically handles correct time including leap seconds.

Also @chmike, if rules end in 2017, that can mean one of a few things. Either the rules really do end in 2017 and there are no more transitions defined for the given zone. Some volatile zones may have rules encoded only for the present year, even if it is known a transition will take place next year as well, because there is no indication of when it will be. Another possibility is that the rule that follows is regular and can be encoded using a POSIX TZ string. You need to read and parse that as well, which is annoying, but at least the format is well-defined. Yet another possibility is that you really mean 2037, which is far enough in the future that it doesn't seem important. It's an arbitrary cut-off and the source distribution has flags, IIRC, to set it. Due to the Year 2038 problem, the 32-bit portion of a zoneinfo file can't encode timestamps in the more distant future, but the 64-bit one (which has been shipped for close to a decade now) has no problem.
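For reference, pulling out that trailing POSIX TZ string is a small job. The sketch below assumes a version 2 or later file whose contents are already in memory, and relies on tzfile(5) describing the footer as a newline-enclosed string at the very end of the file; `tzif_footer` is a hypothetical name, and parsing the string itself is left out.

```cpp
#include <string>

// Returns the POSIX-TZ-style footer of a version 2+ TZif file, or an empty
// string if the data does not end in a newline-enclosed line.
inline std::string tzif_footer(const std::string& data) {
    if (data.size() < 2 || data.back() != '\n') return std::string();
    // Find the newline that opens the footer, skipping the closing one.
    const std::string::size_type start = data.rfind('\n', data.size() - 2);
    if (start == std::string::npos) return std::string();
    return data.substr(start + 1, data.size() - start - 2);
}
```

For a zone with regular rules the result is a string such as `CET-1CEST,M3.5.0,M10.5.0/3`; an empty footer means no rule is given for times after the last stored transition.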

And finally @chmike, relying on the compiled form has the advantage that your system will be kept up to date by the OS vendor.

@HowardHinnant I appreciate that you don't ship your own zoneinfo data, no irony. But the source form is not distributed by OS vendors; the binary form is. So whoever ends up using your library will necessarily have to ship it themselves, otherwise the library is no use. I have no problem if the library is used only locally, but as soon as you end up distributing dependent code to end users, the timely support becomes a pain. At the same time, this issue is often not recognized, because people underestimate how volatile this data is, and the timely support ends up being simply absent. If you could at least point out these issues (or perhaps point at this thread) in the documentation, that would be very helpful--we've really seen way, way more data duplication in this area than there should be. And these bugs are somewhat rare, so it's hard to realize you even have this problem. Essentially you'd get a bunch of bug reports every now and then, when one of your customers happens to somehow interface with a country that meddles with this stuff.

@HowardHinnant
Owner

http://howardhinnant.github.io/tz.html#Installation

@chmike

chmike commented Sep 29, 2015

1970-01-01T00:00:00 UTC is the POSIX epoch by definition, according to https://en.wikipedia.org/wiki/Unix_time. UTC was indeed only defined in 1972.

It is still unclear to me what to do before 1 Jan 1972 (Unix time 63072000) to compute the TAI time from the Unix time. Should I add 9 s or 10 s?


Best regards,

Ch.Meessen

@chmike

chmike commented Sep 29, 2015

@pmachata I checked again and you are fully right. My bad! The time values run until 2037, not 2017. The TZ text rule is indeed stored in the file. I didn't decode it yet, but it is easy to spot with hexdump. I also confirm that the leap seconds are present in the files under the right/ directory and that the file /etc/localtime on my Ubuntu 15.04 (Debian based) is not the file from the right/ directory.

I also fully agree that relying on automatically updated files is better when possible. When there is no such update, relying on the compiled files for the local time offsets is still OK because the data is valid until 2037 if the rule doesn't change. The compiled files are also straightforward to use. The only problem, in the absence of automatic updates, is the leap seconds.

@pmachata

@HowardHinnant Cool, thanks.

@chmike So the point is that the rules actually do change very often. Even if you have rules all the way to 2037, chances are they will become obsolete much more quickly. Depending on the region your customers live in or deal with, it might take some time (e.g. European or US rules seem to be stable), or it might be a surprising spur-of-the-moment thing (e.g. South American countries seem to be more prone to changing DST rules at the last minute, in my experience). Even the US change in, was it 2005?, which was widely announced long before it took effect, ended up surprising all sorts of devices that were clever enough to know about DST, but not clever enough to know it occasionally changes. That included public transportation buses in San Francisco, if my memory serves right, which ended up displaying the wrong time for a week. So if you can't rely on vendor-provided updates, I suspect you really want to make sure you have an update vector yourself, because these things do change, and people underestimate how often they do.

HowardHinnant pushed a commit that referenced this issue Nov 28, 2015
Merged mainstream repo commits
@HowardHinnant
Owner

Update: This library still only reads the source format, not the binary format. But now, if linked against curl (https://curl.haxx.se/libcurl/) which is available for Windows and comes pre-installed on Linux and OS X, the library can be configured to update to the latest database automatically. Fully documented here:

http://howardhinnant.github.io/date/tz.html#Installation

@HowardHinnant
Owner

I just did a comparative survey of the compiled tzdata files on macOS using Google's cctz library. I find that 63% of the OS-supplied timezones have errors in either offset, abbreviation, or both when queried with timestamps outside the range of years [1900, 2037]. On a positive note, 37% of the OS-supplied timezones were free of errors.

The survey correlated to those timezones which have offset transitions outside the range [1900, 2037]. I have a report that suggests these same errors exist on iOS.

By reading the text IANA files, this library is immune to these errors.

@thiagomacieira
Author

Any data from before 1970 is "best effort" and can contain errors.

Are you saying that the data compiler is making mistakes? Was that zic?

@HowardHinnant
Owner

HowardHinnant commented Aug 12, 2016

I realize that data from before 1970 is "best effort" and can contain errors. But that's not a good rationale for introducing further errors. This library accurately reports all of the data in tzdata, making no judgement on the quality or suitability of that data, nor excuses for not accurately presenting it.

At this time, I do not know for sure if the errors reported to me via cctz's use of my OS's zic-compiled data files are the result of errors in cctz, or errors in zic. I suspect the latter. I also suspect that said errors are fixed in zic, but require either an update or configuration change that Apple has not done. The range of accuracy of the compiled data is suspiciously close to the range of a signed 32 bit count of seconds from 1970. Emphasis: These are all guesses on my part except for the fact that I've detected the errors.

The example that brought this issue to my attention was:

--- Europe/Berlin ---
1873-01-01 00:00:00 UTC
offset   abbreviation
00:53:28 LMT    // In the text tzdata file europe
01:00:00 CET    // Apparently reported by zoneinfo/Europe/Berlin

@thiagomacieira
Author

I'd be really surprised if the mistake is in zic: it's created by the same people who maintain the database itself. More than likely, the error is in the decoder tool. I had a similar bug report sent to me on QTimeZone that we failed to parse the name of a timezone (Asia/Barnaul) after tzdata2016d. It's also probably restricted to 32-bit integers, so it simply can't represent dates before 1902.

Other tools do work:

$ zdump -v Europe/Berlin | head -5
Europe/Berlin  -9223372036854775808 = NULL
Europe/Berlin  -9223372036854689408 = NULL
Europe/Berlin  Fri Mar 31 23:06:31 1893 UT = Fri Mar 31 23:59:59 1893 LMT isdst=0 gmtoff=3208
Europe/Berlin  Fri Mar 31 23:06:32 1893 UT = Sat Apr  1 00:06:32 1893 CET isdst=0 gmtoff=3600
Europe/Berlin  Sun Apr 30 21:59:59 1916 UT = Sun Apr 30 22:59:59 1916 CET isdst=0 gmtoff=3600
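To relate this output to the earlier example: `gmtoff=3208` is the same LMT offset that the text database lists as 00:53:28. A tiny sketch of the conversion follows; `format_utc_offset` is a hypothetical helper, not part of zdump or of this library.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Formats an offset in seconds east of UT as +hh:mm:ss or -hh:mm:ss.
inline std::string format_utc_offset(long seconds) {
    const char sign = seconds < 0 ? '-' : '+';
    const long a = std::labs(seconds);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%c%02ld:%02ld:%02ld",
                  sign, a / 3600, (a / 60) % 60, a % 60);
    return std::string(buf);
}
```

Here `format_utc_offset(3208)` gives `+00:53:28` and `format_utc_offset(3600)` gives `+01:00:00`, matching the LMT and CET lines above.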

@HowardHinnant
Owner

HowardHinnant commented Aug 12, 2016

Thanks for the zdump tutorial.

zdump -v Europe/Berlin | head -5
Europe/Berlin  Fri Dec 13 20:45:52 1901 UTC = Fri Dec 13 21:45:52 1901 CET isdst=0
Europe/Berlin  Sat Dec 14 20:45:52 1901 UTC = Sat Dec 14 21:45:52 1901 CET isdst=0
Europe/Berlin  Sun Apr 30 21:59:59 1916 UTC = Sun Apr 30 22:59:59 1916 CET isdst=0
Europe/Berlin  Sun Apr 30 22:00:00 1916 UTC = Mon May  1 00:00:00 1916 CEST isdst=1
Europe/Berlin  Sat Sep 30 22:59:59 1916 UTC = Sun Oct  1 00:59:59 1916 CEST isdst=1

OS X El Capitan 10.11.6

Yep, Fri Dec 13 20:45:52 1901 UTC is exactly the signed 32 bit second lower range.
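That boundary can be checked by hand. The following is a minimal sketch that converts a Unix time to a UTC calendar date and time without relying on the platform's `time_t`; `Civil` and `civil_from_unix` are names invented for this illustration, and the date arithmetic is a commonly used proleptic Gregorian formula.

```cpp
#include <cstdint>

// A broken-down UTC date and time (hypothetical helper type).
struct Civil {
    std::int64_t year;
    unsigned month, day, hour, minute, second;
};

// Converts a Unix time (seconds since 1970-01-01T00:00:00, no leap seconds)
// to a proleptic Gregorian UTC date and time.
inline Civil civil_from_unix(std::int64_t t) {
    // Floor division, so negative times land on the correct day.
    std::int64_t days = t / 86400;
    std::int64_t secs = t % 86400;
    if (secs < 0) { secs += 86400; --days; }

    const std::int64_t z = days + 719468;  // shift epoch to 0000-03-01
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);                // [0, 146096]
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // [0, 399]
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);                // [0, 365]
    const unsigned mp = (5 * doy + 2) / 153;                                     // [0, 11]

    Civil c;
    c.day = doy - (153 * mp + 2) / 5 + 1;
    c.month = mp < 10 ? mp + 3 : mp - 9;
    c.year = static_cast<std::int64_t>(yoe) + era * 400 + (c.month <= 2);
    c.hour = static_cast<unsigned>(secs / 3600);
    c.minute = static_cast<unsigned>((secs / 60) % 60);
    c.second = static_cast<unsigned>(secs % 60);
    return c;
}
```

Feeding it -2147483648 yields 1901-12-13 20:45:52, and 2147483647 yields 2038-01-19 03:14:07, the two ends of a signed 32-bit count of seconds.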

@thiagomacieira
Author

Looks like zdump on a Mac shows the LMT date at the (time_t)INT_MIN point. Why? I don't know...

I've confirmed that the zdump binary on my Mac has both 32- and 64-bit architectures inside the fat binary. The previous output I had pasted came from a 64-bit build on Linux.

Looks like they're different sources too:

mac$ zdump --version
zdump: @(#)zdump.c      7.31
linux$ zdump --version
zdump (tzcode) 2016f

@HowardHinnant
Owner

@thiagomacieira
Author

I understand Apple replacing GNU tools that are under the GPLv3, but why this one? http://www.iana.org/time-zones has a more up-to-date version licensed under the BSD licence...

@vlovich

vlovich commented Oct 6, 2016

Hmm.... time to file a Radar? http://bugreport.apple.com

@HowardHinnant
Owner

@vlovich: Be my guest. :-) My user-experience with filing bug reports with Apple has not been very good.

@bullestock

> For Windows, don't parse the IANA TZDB. Windows has its own database in the registry. A serious application would open "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones"

I briefly explored that option, but there is at least one issue: the Windows registry does not use the Olson time zone names, but rather names such as "Central Europe Standard Time".

@thiagomacieira
Author

See CLDR file supplemental/windowsZones.xml
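As a rough sketch of what using that file amounts to, the lookup is a plain name-to-name table. The three entries below are illustrative defaults that, to the best of my knowledge, match the CLDR data; a real application would load the complete table from windowsZones.xml rather than hard-code it, and `windows_to_iana` is a hypothetical name.

```cpp
#include <map>
#include <string>

// Maps a Windows time zone name to an IANA zone name, or returns an empty
// string if the name is unknown. Entries are illustrative only; the full
// mapping should come from CLDR's supplemental/windowsZones.xml.
inline std::string windows_to_iana(const std::string& windows_name) {
    static const std::map<std::string, std::string> table = {
        {"Central Europe Standard Time", "Europe/Budapest"},
        {"W. Europe Standard Time", "Europe/Berlin"},
        {"Pacific Standard Time", "America/Los_Angeles"},
    };
    const std::map<std::string, std::string>::const_iterator it =
        table.find(windows_name);
    return it == table.end() ? std::string() : it->second;
}
```

Note that the mapping is many-to-one in the other direction: several IANA zones share one Windows name, so converting back loses information.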

@ta1meng

ta1meng commented Nov 7, 2016

RE: "If this feature gets coded up, it will be an alternative, not a replacement of the way things are done today. There will be a config flag to choose where you get your tzdb from."

We are fairly new to date time programming, and Howard's libraries are meant to be used by a wide audience. Learning curve should be a consideration.

We are very grateful that the tzdata files are human readable. With low effort, we could understand and reason about them. It was helpful to see that some historical timezone rules were put in inferentially due to incomplete records. Inline comment threads seem to detail unimplemented exceptions and dubious rules, which would come in handy if a customer asks us "why is the date time math wrong over here and nowhere else?". Having the ability to add or modify a rule on short notice is a nice-to-have.

We looked at other timezone rule specs as part of our learning process:

  • Boost: date_time_zonespec.csv was human readable. We can immediately see that it has no support for historical rules.
  • TCL's tzdata was in text format but not particularly human readable.
  • Python's zoneinfo was in binary format and not human readable.
  • PostgreSQL's timezone info was in binary format. It was a bit different from Python's binaries, and we couldn't easily tell what the differences were.

We learned more from the human readable tzdata files than all the other sources combined. So, a +1 for keeping the loading of text versions of tzdata as the default behavior. I can see the viewpoint that many would benefit from the binary file support and so would also be very happy with a toggle to read from binary tzdata.

RE: "For Windows, don't parse the IANA TZDB. Windows has its own database in the registry. A serious application would open "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones". See http://msdn.microsoft.com/en-gb/library/windows/desktop/ms724253%28v=vs.85%29.aspx"

Thiago's point on Windows timezone support is an interesting one and we'll be thinking about it. My belief (no hard facts yet) is that most of our customers store their date time values in databases and most databases that support timezone regions seem to adhere to the IANA database. So even on Windows, I think our users would predominantly need tzdata support. I'll be contacting customer support to see if we have significant user date time data specified with Windows timezones.

@thiagomacieira
Author

thiagomacieira commented Nov 7, 2016

> We learned more from the human readable tzdata files than all the other sources combined. So, a +1 for keeping the loading of text versions of tzdata as the default behavior. I can see the viewpoint that many would benefit from the binary file support and so would also be very happy with a toggle to read from binary tzdata.

The binary data is the same as the source text, only in a more-easily parseable format. Sure, the comments would be missing, but if you need to investigate why something is not what you expected, you can look at the source. For quick information, you can use zdump.

Let me also point out that you don't want to modify the tzdata files in any way. Those are updated by IANA several times a year, so if you made modifications, you'd very quickly diverge from the authoritative source. And besides, what reason would you have for having local changes to your understanding of something that is global?

> Thiago's point on Windows timezone support is an interesting one and we'll be thinking about it. My belief (no hard facts yet) is that most of our customers store their date time values in databases and most databases that support timezone regions seem to adhere to the IANA database. So even on Windows, I think our users would predominantly need tzdata support. I'll be contacting customer support to see if we have significant user date time data specified with Windows timezones.

As I said, the CLDR has a table mapping Windows names to IANA names. But I do agree that the quality of the Windows TZDB is much lower than that of the IANA DB. My point was that a serious application could opt not to ship the 3 MB of binary tzdata, as Windows already has most of the information. I believe size considerations would be very prominent for anyone who wants tz.cpp, as opposed to using a full framework like ICU.

@HowardHinnant
Owner

> The binary data is the same as the source text, only in a more-easily parseable format.

This isn't strictly true on macOS because Apple is using 8-year-old source files to compile an up-to-date text database into the binary database.

I also don't see the leap second information in the macOS binary database. They've recently started shipping the version information which is a positive step. Also it is more difficult, but not impossible, to retrieve the timezone name from the binary files and provide the ability to iterate over the set of timezones. And there is no distinction between links and time zones in the binary format (which is probably not important for most customers).

On Windows one doesn't have the IANA database at all, so it must be installed (maybe that will change in the future). Either that, or one is forced to admit that time zone names and definitions are not portable across platforms.

The only thing this library uses the CLDR for on Windows is to implement current_zone() which must translate the current Windows time zone into an IANA time zone.

Fwiw, the text form of the IANA database is about 945 KB and covers an infinite temporal range.

Should this library be standardized, almost certainly this issue will become moot as the std::lib implementor will supply the database, and whether it is binary or text will become an implementation detail. One of the biggest issues for standardization is whether or not to support reload_tzdb(). I'm guessing they will choose not to, but anything could happen.

@thiagomacieira
Author

> One of the biggest issues for standardization is whether or not to support reload_tzdb(). I'm guessing they will choose not to, but anything could happen.

Given that an application can run for more than ~1.7 months without being restarted, there needs to be a way to refresh the database. Either the implementation does it behind the scenes and thread-safely, or it offers a thread-safe API to do so at the app's discretion.
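One way the second option could look, as a minimal sketch only: readers take a reference-counted snapshot, and a reload swaps in a new database under a mutex, so a lookup that is already in progress keeps using the old data safely. `TzDatabase` and `TzDatabaseHolder` are hypothetical stand-ins, not types from this library or from any standard proposal.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a fully parsed, immutable time zone database.
struct TzDatabase {
    std::string version;
};

// Hands out shared snapshots and lets the application swap in a fresh
// database at a moment of its choosing.
class TzDatabaseHolder {
public:
    explicit TzDatabaseHolder(std::shared_ptr<const TzDatabase> initial)
        : current_(std::move(initial)) {}

    // Readers keep the returned snapshot alive for as long as they use it.
    std::shared_ptr<const TzDatabase> get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

    // Replaces the database; existing snapshots stay valid until released.
    void reload(std::shared_ptr<const TzDatabase> fresh) {
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(fresh);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const TzDatabase> current_;
};
```

The trade-off in this design is that two threads may briefly see different database versions, which is usually preferable to invalidating pointers that a running conversion still holds.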

@galik

galik commented Nov 8, 2016

I was wondering if it wouldn't be better to provide a lazy API whereby the user initializes the database supplying which time-zones it is interested in (as defaults) and then on an as-required basis. Why parse the whole database when many apps will only use one or two actual timezones?

@HowardHinnant
Owner

> Given that an application can run for more than ~1.7 months without being restarted...

I've seen that position expressed on the committee (by a single person), but I don't know at this point whether it has widespread support. The status quo (the C API) doesn't support this feature in practice. tzdatabase updates typically require a reboot.

@thiagomacieira
Author

> > Given that an application can run for more than ~1.7 months without being restarted...
>
> I've seen that position expressed on the committee (by a single person). But I don't know at this point if that position has widespread support. The status-quo (the C API) doesn't support this feature in practice. tzdatabase updates typically require a reboot.

The C API does support it, though it's thread-unsafe: the tzset() function. Granted, it's POSIX, not ISO C, but so is localtime_r, which is required in a multithreaded application anyway.
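A minimal POSIX-only sketch of that mechanism: change `TZ`, call `tzset()` to make the C library re-read it, then convert with `localtime_r`. The helper `local_hour_in` is hypothetical, and the example uses a POSIX-style TZ string so that it does not depend on any installed zone file. It mutates process-wide state, which is exactly why it is not thread-safe.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <time.h>    // tzset, localtime_r (POSIX)

// Returns the local hour of Unix time t under the given TZ setting.
// Not thread-safe: setenv() and tzset() change process-wide state.
inline int local_hour_in(const char* tz, time_t t) {
    setenv("TZ", tz, 1);  // overwrite any existing TZ value
    tzset();              // make the C library pick up the new setting
    struct tm out = {};
    localtime_r(&t, &out);
    return out.tm_hour;
}
```

For example, `local_hour_in("EST5", 0)` returns 19, since five hours west of UTC the epoch falls at 19:00 on the previous day.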

@HowardHinnant
Owner

> I was wondering if it wouldn't be better to provide a lazy API whereby the user initializes the database supplying which time-zones it is interested in (as defaults) and then on an as-required basis.

This API allows lazy initialization. This implementation takes a hybrid approach which is optimized for the text form of the database. It parses the entire database, but omits a post-parse analysis of the data for each time zone until it is actually used. It was experimentally determined that this post-parse analysis was the most expensive step in the parse. This characteristic is actually controllable at the moment with an undocumented flag LAZY_INIT which is defaulted to 1. I will probably eventually remove this flag and hard code its effects as always on, as there doesn't appear to be a demand for turning it off.
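The general shape of that hybrid can be sketched as follows. This is not the library's implementation and does not show what LAZY_INIT actually does internally; `LazyZone` is a hypothetical class, sorting stands in for the expensive post-parse analysis, and the sketch is single-threaded for brevity.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Holds the cheaply parsed raw data for one zone and defers the expensive
// analysis until the zone is first used. Hypothetical; not thread-safe.
class LazyZone {
public:
    explicit LazyZone(std::vector<std::string> raw_rules)
        : raw_(std::move(raw_rules)), analyzed_(false), analyses_(0) {}

    // First call runs the analysis; later calls reuse the cached result.
    const std::vector<std::string>& rules() {
        if (!analyzed_) {
            analyze();
            analyzed_ = true;
        }
        return cooked_;
    }

    // How many times the analysis has run (for demonstration purposes).
    int analysis_count() const { return analyses_; }

private:
    void analyze() {
        cooked_ = raw_;
        std::sort(cooked_.begin(), cooked_.end());  // placeholder for real work
        ++analyses_;
    }

    std::vector<std::string> raw_;
    std::vector<std::string> cooked_;
    bool analyzed_;
    int analyses_;
};
```

An application that touches only one or two zones then pays the analysis cost for those zones alone, even though every zone was parsed up front.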

@crusader-mike

@HowardHinnant -- don't listen to these "I have a better tool" guys. They had 50 years to get it right -- and it is still pathetic. This library is quite simply awesome: you've properly solved the problem we couldn't solve since the very beginning -- that is, how to convert from one arbitrary TZ to another. I remember dealing with this problem multiple times.

I expect it will lead to the propagation of those IANA database copies until the OS guys give up and implement it properly (and uniformly) across all platforms that still live.

@vinniefalco

> ...the problem is that you're too well known. People assume that just because it came from you, it will suffice.

Having worked with Howard for a few years now, I can say from experience that this assumption is correct. Everything that Howard produces is well thought out and high quality.

@HowardHinnant
Owner

This has been open for over a year, and there's no fix on the horizon. Closing as "not going to fix." Feel free to re-open this issue if you see value in doing so.

@HowardHinnant
Owner

This is now an option with this commit: a610f08

See https://howardhinnant.github.io/date/tz.html for USE_OS_TZDB to turn this option on.

@HowardHinnant
Owner

Kudos to Aaron Bishop for driving this effort.
