Support for "zic -b slim" tzfiles (default format since tz 2020b) #48

Open
diabonas opened this issue Oct 9, 2020 · 9 comments

@diabonas

diabonas commented Oct 9, 2020

tz switched the default format from zic -b fat to zic -b slim in the 2020b release. The tzfile.py parser in this project currently does not support this file format and silently chokes on tzfiles generated by the latest tz version, leading to incorrect tzinfo Python objects that are just using the default UTC time. To future-proof against the Y2038 problem, it would be great if pytz could support the new 64-bit only file format that zic -b slim produces.
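For anyone who wants to check which kind of file they have, here is a minimal sketch of reading the TZif header as laid out in RFC 8536. The names `TZifHeader`, `read_tzif_header`, and `v1_block_has_no_transitions` are hypothetical and not part of pytz. The check assumes that `zic -b slim` leaves the version 1 block without transitions, so a parser that stops after that block would find no rules. Treat it as a heuristic: fixed-offset zones have no transitions in either format.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class TZifHeader(NamedTuple):
    """The first 44 bytes of a TZif file (hypothetical helper type)."""
    version: int
    isutcnt: int
    isstdcnt: int
    leapcnt: int
    timecnt: int
    typecnt: int
    charcnt: int


def read_tzif_header(stream: BinaryIO) -> TZifHeader:
    raw = stream.read(44)
    if len(raw) != 44 or raw[:4] != b"TZif":
        raise ValueError("not a TZif file")
    # The version byte is NUL for version 1, otherwise an ASCII digit.
    version = 1 if raw[4:5] == b"\x00" else int(raw[4:5])
    # Bytes 5..19 are reserved; six big-endian 32-bit counts follow.
    return TZifHeader(version, *struct.unpack(">6l", raw[20:44]))


def v1_block_has_no_transitions(header: TZifHeader) -> bool:
    # Assumption: a version 2+ file whose leading (version 1) block holds
    # no transitions is what a v1-only parser would read as "no rules".
    return header.version >= 2 and header.timecnt == 0
```

Calling `read_tzif_header(open(path, "rb"))` on a file from the system zoneinfo directory and inspecting `timecnt` should show whether the version 1 block carries any data at all.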

Downstream Arch Linux bug report: FS#68150. Note that Arch patches pytz to use the system timezone database, so this problem doesn't currently appear in a default installation of pytz, which comes bundled with tzfiles generated by an older version of tz. Regardless, it would be good to add support for the new file format for the reason outlined above.

@pganssle

@diabonas I am not a maintainer of this, but I think @stub42 has said in the past that he doesn't plan to support version 2+ files.

In Python 3.9, we added the zoneinfo module to the standard library, which does support modern TZif files. There is a backport available to Python 3.6. It is best to try to migrate people to use zoneinfo.
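As a rough illustration of what "supports modern TZif files" means, the sketch below hand-builds a tiny slim-style file in memory and loads it with `ZoneInfo.from_file`. The helper names `_header` and `build_slim_style_tzif` and the key `Example/Slim` are made up for this example, and the file contents are a simplified assumption about the layout (an empty version 1 block, a single 64-bit transition, and a POSIX TZ string footer), not real `zic` output. Offsets after the last transition come from the footer.

```python
import io
import struct
from datetime import datetime, timedelta

try:
    from zoneinfo import ZoneInfo  # Python 3.9+
except ImportError:  # assumes the backports.zoneinfo package is installed
    from backports.zoneinfo import ZoneInfo


def _header(version: bytes, timecnt: int, typecnt: int, charcnt: int) -> bytes:
    # Magic, version, 15 reserved bytes, then six counts:
    # isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt.
    counts = struct.pack(">6l", 0, 0, 0, timecnt, typecnt, charcnt)
    return b"TZif" + version + b"\x00" * 15 + counts


def build_slim_style_tzif() -> bytes:
    # Version 1 block: no transitions, one placeholder type, empty name.
    v1 = _header(b"2", 0, 1, 1) + struct.pack(">lbb", 0, 0, 0) + b"\x00"
    # Version 2 block: one 64-bit transition at the epoch into type 0,
    # which is UTC+1 with the abbreviation "CET".
    v2 = (
        _header(b"2", 1, 1, 4)
        + struct.pack(">q", 0)             # transition time
        + struct.pack(">B", 0)             # index of the type it switches to
        + struct.pack(">lbb", 3600, 0, 0)  # utoff, isdst, abbreviation index
        + b"CET\x00"
    )
    # Footer: POSIX TZ string describing all later transitions.
    footer = b"\nCET-1CEST,M3.5.0,M10.5.0/3\n"
    return v1 + v2 + footer


zone = ZoneInfo.from_file(io.BytesIO(build_slim_style_tzif()), key="Example/Slim")
summer_2040 = datetime(2040, 7, 1, 12, tzinfo=zone)
winter_2040 = datetime(2040, 1, 1, 12, tzinfo=zone)
print(summer_2040.utcoffset(), winter_2040.utcoffset())
```

Both dates fall after January 2038, so the offsets here are derived from the footer rule rather than from stored transitions.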

Though I know @stub42 has mentioned before the possibility of pytz becoming a wrapper around zoneinfo. If that happens, then 64-bit support would come for free. When creating pytz_deprecation_shim (which is pretty close to what Stu described), I found that it was not possible to make a thin wrapper with zoneinfo semantics that is fully backwards-compatible, but a version of pytz with zoneinfo at its core could probably be made to work on Python 3+.

@eli-schwartz

We're aware of zoneinfo, and this new Python 3.9 feature is a source of happiness for distro packagers. I did check the reproducer to see if zoneinfo was buggy the same way and was happy to discover it was not.

It is best to try to migrate people to use zoneinfo.

I agree. stdlib builtins are always good to rely on.

This bug report merely hopes to propose that for the dozens of packages that are not yet ported, and which force the distro to continue to provide pytz, some solution is available to allow tzdata to use slim files.

The alternative is of course to vendor fat ones into pytz, resulting in old timezone info inconsistent with the system... this is non-ideal.

@pganssle

Probably the most prudent option for Arch Linux is to deploy -b fat tzdata for some time longer. The difference between the two is that the -b slim version has smaller file sizes at the cost of effectively taking this aspect of the 2038 problem and triggering it right now.
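For reference, the 2038 limit mentioned here is the largest value a signed 32-bit transition time can hold. A short worked example, using only the standard library:

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit integer, as used for version 1 timestamps.
INT32_MAX = 2**31 - 1

last_v1_instant = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_v1_instant.isoformat())  # 2038-01-19T03:14:07+00:00
```

A parser limited to 32-bit data cannot represent any transition after that instant.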

Yes, the upstream default has changed to use -b slim, but I don't think there is any short term plan to remove the fat option.

The alternative is of course to vendor fat ones into pytz, resulting in old timezone info inconsistent with the system... this is non-ideal.

Why do you think it would be old and inconsistent? pytz does vendor its own time zone data, but you are presumably manually de-vendoring it. If you're de-vendoring and re-vendoring the data anyway (and you don't want to deploy the fat binaries by default), you could create a tzdata-fat package that deploys the time zone data to /etc/share/zoneinfo-fat/ and change your patch to look there for the TZif files. There's no reason why it would need to be old or inconsistent.

In any case, this is another manifestation of #31, which @stub42 said is basically a wontfix: #31 (comment)

@eli-schwartz

Rather than devendoring and revendoring, we would probably just remove the devendoring routine...

If it is WONTFIX, then I guess it is what it is. I guess we'll see how projects adapt and migrate, and stick with fat tzfiles for now.

@stub42
Owner

stub42 commented Nov 2, 2020

The plan is for pytz to start using Python 3.9 zoneinfo and the backports, which should provide support for Python 3.6+. Python 2.4 -> 3.5 will be stuck requiring 'fat' files (which is fine, because that is what the old deployments have). Hopefully I or someone can prove @pganssle wrong and it can be fully backwards-compatible. If we can't get at least a mostly backwards-compatible version using Py3.9 zoneinfo, then we need a new pytz deprecation plan or embed a modified version of the Py3.9 code.

@pganssle

pganssle commented Nov 2, 2020

Hopefully I or someone can prove @pganssle wrong and it can be fully backwards-compatible.

To be clear, you can write a version of pytz that is fully backwards-compatible, it just can't also be forwards-compatible. You can write a wrapper where you delegate all the logic to zoneinfo fairly easily, and you can even write a wrapper where you pull all the data from pytz's embedded tzdata instead of the system zoneinfo / tzdata. What you can't do is to allow attaching pytz zones with tzinfo= and still have arithmetic and comparison and such work the same way.

Making a version of pytz that still requires .localize and .normalize and whatnot but calls out to zoneinfo under the hood is not a major issue, you would just store a zoneinfo.ZoneInfo as part of the time zone, and then use it in .localize and .normalize to decide which time zone to attach.
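A minimal sketch of that idea, under loud assumptions: `PytzStyleZone` is a hypothetical name, it attaches plain `datetime.timezone` fixed offsets instead of pytz's own tzinfo classes, and it does not reproduce pytz's `is_dst=None` error handling. It is meant only to show the shape of delegating to a backing zone, not how pytz does or would implement it.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class PytzStyleZone:
    """Hypothetical pytz-like wrapper; not the actual pytz implementation."""

    def __init__(self, zone: tzinfo) -> None:
        # In practice this would be a zoneinfo.ZoneInfo; any tzinfo works.
        self._zone = zone

    def localize(self, dt: datetime, is_dst: bool = False) -> datetime:
        if dt.tzinfo is not None:
            raise ValueError("Not naive datetime (tzinfo is already set)")
        # Ask the backing zone about both folds, then prefer the one whose
        # DST flag matches is_dst (this only matters for ambiguous or
        # missing wall times).
        candidates = [dt.replace(tzinfo=self._zone, fold=f) for f in (0, 1)]
        chosen = next(
            (c for c in candidates if bool(c.dst()) == bool(is_dst)),
            candidates[0],
        )
        # Attach a fixed offset, so later arithmetic does not re-resolve it.
        return dt.replace(tzinfo=timezone(chosen.utcoffset()))

    def normalize(self, dt: datetime) -> datetime:
        if dt.tzinfo is None:
            raise ValueError("Naive time - no tzinfo set")
        # Convert to the backing zone, then freeze the offset found there.
        local = dt.astimezone(self._zone)
        return local.replace(tzinfo=timezone(local.utcoffset()))
```

With a real zone, usage would look like `PytzStyleZone(ZoneInfo("Europe/Berlin")).localize(datetime(2040, 7, 1, 12))`, followed by `.normalize(...)` after any arithmetic, mirroring the existing pytz calling pattern.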

@eli-schwartz

I'm sure keeping pytz's API and output identical to its current behavior will be fine and no one is going to complain. :)

Getting support for slim tzinfo files by hooking into zoneinfo's parser and the related infrastructure to search for system or pypi tzdata, while making no other changes, would still be a plus in my book. Additional changes could be added as a follow-up, if deemed practical.

@pganssle

pganssle commented Nov 2, 2020

I'm sure keeping pytz's API and output identical to its current behavior will be fine and no one is going to complain. :)

Yes, being backwards and forwards compatible would be a benefit only if the intention was to continue supporting pytz indefinitely, rather than adding deprecation warnings once the backend switches to use zoneinfo. If you are forwards compatible, you can support PEP 495 and work the way the documentation says time zones work and people don't necessarily need to switch away from pytz.

Even if pytz itself is deprecated, without forwards-compatibility, it makes the transition more difficult. In pytz_deprecation_shim, the trade-off is that there are some situations where the API breaks, but in most cases it works if you treat the zone as either a pytz zone or a zoneinfo zone (and it always works if you treat it like a zoneinfo zone). That way you can adopt the new style in package A but not force package B (which depends on package A) to switch over at the same time.

It's definitely a net positive if pytz uses zoneinfo as its backend even without the forwards compatibility (as I think I mentioned in my original comment).

@rune-bk

rune-bk commented Feb 25, 2021

Is there an ETA for when this will be implemented? We are having some issues with dates after 2038 and would very much like to see these files supported. Is there a workaround we can apply in the meantime?
