Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding times in num2date exactly with timedelta arithmetic #171

Closed
spencerkclark opened this issue May 10, 2020 · 5 comments · Fixed by #176
Closed

Decoding times in num2date exactly with timedelta arithmetic #171

spencerkclark opened this issue May 10, 2020 · 5 comments · Fixed by #176

Comments

@spencerkclark
Copy link
Collaborator

Timedelta arithmetic in cftime uses integer arithmetic, which is precise down to the microsecond. If passed an array of integers, or an array of floats that can safely be cast to integers, would it be possible for num2date to default to decoding times using that instead of the current imprecise floating point method? This would be a helpful improvement to the user experience in downstream libraries like xarray, where microsecond noise introduced in the times by floating point arithmetic can interfere with automatic alignment. See also other issues in this repo related to decoding precision, e.g. #78, #134.

This gist is a simplified sketch of how this code path could be implemented (I'm happy to develop it further into a PR if it seems promising). The idea would be to try to use this method first, and fall back to using the existing floating point method if it wasn't possible. The only reservation I would have about this is that it is about 9x slower than decoding times using floats. In my use-cases, however, this slowdown would gladly be accepted if it meant times would be decoded exactly, but perhaps others decoding more than 10000 times at a time would find this frustrating.

Are there any other reasons I'm not thinking of for why this might not be a good idea?

@jswhit
Copy link
Collaborator

jswhit commented May 11, 2020

@spencerkclark - a reworking of the date calculations is long overdue, so a pull request would be welcome. Probably best if it's implemented as an non-default option at first (especially if it's 10x slower).

@spencerkclark
Copy link
Collaborator Author

Sounds good -- when I get a chance I'll try and put something more polished together.

@jswhit
Copy link
Collaborator

jswhit commented May 29, 2020

Is it possible to have an exact inverse of this (date2num_exact) so the round-trip works without loss of precision?

@jswhit
Copy link
Collaborator

jswhit commented May 29, 2020

Something like this should work as an inverse:

def date2num_exact(dates,units,calendar='standard'):
    calendar = calendar.lower()
    basedate = to_calendar_specific_datetime(_dateparse(units), calendar)
    unit, ignore = _datesplit(units)
    if unit not in UNIT_CONVERSION_FACTORS:
        raise ValueError("Unsupported time units provided, {!r}.".format(unit))
    if unit in ["months", "month"] and calendar != "360_day":
        raise ValueError("Units of months only valid for 360_day calendar.")
    factor = UNIT_CONVERSION_FACTORS[unit]
    deltas = dates-basedate
    times = (deltas/timedelta(microseconds=1)) / factor
    return times

with some logic to handle masked arrays?

@spencerkclark
Copy link
Collaborator Author

Is it possible to have an exact inverse of this (date2num_exact) so the round-trip works without loss of precision?

That would be great too. The tricky part there is that currently subtracting a date from another to produce a timedelta is not microsecond-exact:

In [1]: import cftime

In [2]: cftime.DatetimeNoLeap(2000, 1, 2, 0, 0, 0, 5) - cftime.DatetimeNoLeap(2000, 1, 2)
Out[2]: datetime.timedelta(microseconds=8)

This is another issue I'd be keen to address. I haven't thought about the best way to do this. We currently use a bit of a hack in xarray to work around this -- recall the function in this issue -- but maybe there's a more elegant / direct way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants