Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFTime support for polyval #6624

Merged
merged 19 commits into from
May 31, 2022
Merged

CFTime support for polyval #6624

merged 19 commits into from
May 31, 2022

Conversation

headtr1ck
Copy link
Collaborator

commit 398f1b6
Author: dcherian <deepak@cherian.net>
Date:   Fri May 20 08:47:56 2022 -0600

    Backward compatibility dask

commit bde40e4
Merge: 0783df3 4cae8d0
Author: dcherian <deepak@cherian.net>
Date:   Fri May 20 07:54:48 2022 -0600

    Merge branch 'main' into dask-datetime-to-numeric

    * main:
      concatenate docs style (pydata#6621)
      Typing for open_dataset/array/mfdataset and to_netcdf/zarr (pydata#6612)
      {full,zeros,ones}_like typing (pydata#6611)

commit 0783df3
Merge: 5cff4f1 8de7061
Author: dcherian <deepak@cherian.net>
Date:   Sun May 15 21:03:50 2022 -0600

    Merge branch 'main' into dask-datetime-to-numeric

    * main: (24 commits)
      Fix overflow issue in decode_cf_datetime for dtypes <= np.uint32 (pydata#6598)
      Enable flox in GroupBy and resample (pydata#5734)
      Add setuptools as dependency in ASV benchmark CI (pydata#6609)
      change polyval dim ordering (pydata#6601)
      re-add timedelta support for polyval (pydata#6599)
      Minor Dataset.map docstr clarification (pydata#6595)
      New inline_array kwarg for open_dataset (pydata#6566)
      Fix polyval overloads (pydata#6593)
      Restore old MultiIndex dropping behaviour (pydata#6592)
      [docs] add Dataset.assign_coords example (pydata#6336) (pydata#6558)
      Fix zarr append dtype checks (pydata#6476)
      Add missing space in exception message (pydata#6590)
      Doc Link to accessors list in extending-xarray.rst (pydata#6587)
      Fix Dataset/DataArray.isel with drop=True and scalar DataArray indexes (pydata#6579)
      Add some warnings about rechunking to the docs (pydata#6569)
      [pre-commit.ci] pre-commit autoupdate (pydata#6584)
      terminology.rst: fix link to Unidata's "netcdf_dataset_components" (pydata#6583)
      Allow string formatting of scalar DataArrays (pydata#5981)
      Fix mypy issues & reenable in tests (pydata#6581)
      polyval: Use Horner's algorithm + support chunked inputs (pydata#6548)
      ...

commit 5cff4f1
Merge: dfe200d 6144c61
Author: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
Date:   Sun May 1 15:16:33 2022 -0700

    Merge branch 'main' into dask-datetime-to-numeric

commit dfe200d
Author: dcherian <deepak@cherian.net>
Date:   Sun May 1 11:04:03 2022 -0600

    Minor cleanup

commit 35ed378
Author: dcherian <deepak@cherian.net>
Date:   Sun May 1 10:57:36 2022 -0600

    Support dask arrays in datetime_to_numeric
@dcherian
Copy link
Contributor

I squash merged #6556, but you'll still need to update the max_computes in raise_if_dask_computes for cftime tests. It should only have to compute once.

@headtr1ck
Copy link
Collaborator Author

headtr1ck commented May 21, 2022

Maybe someone with more insight into the inners of xarray cound fix that.
I get a ValueError: `dtype` inference failed in `map_blocks`. for cftime arrays with dask.

Also, the part with the try-except around the compute seems strange, maybe there is an easier solution?

@headtr1ck
Copy link
Collaborator Author

I don't understand why it is so much more complicated than before the new algo. Is it because before only indexes were supported as coord and they are never lazy?

@dcherian
Copy link
Contributor

Is it because before only indexes were supported as coord and they are never lazy?

Yes.

The test failure is only because we are trying to test cftime in an environment without cftime.

@dcherian
Copy link
Contributor

🤦🏾 the failure is because pytest is running the xr.date_range(..., use_cftime=True) always, ignoring the skipif mark.

@headtr1ck Can you add the cftime tests as a separate test instead? This will avoid the need for a max_computes arg too. For the separate cftime test, apply the requires_cftime decorator from tests/__init__.py to skip when cftime is not present.

@headtr1ck
Copy link
Collaborator Author

🤦🏾 the failure is because pytest is running the xr.date_range(..., use_cftime=True) always, ignoring the skipif mark.

@headtr1ck Can you add the cftime tests as a separate test instead? This will avoid the need for a max_computes arg too. For the separate cftime test, apply the requires_cftime decorator from tests/__init__.py to skip when cftime is not present.

Done. Seems to work :)

@dcherian dcherian added the plan to merge Final call for comments label May 28, 2022
if x.dtype.kind in "MO":
# datetimes (CFIndexes are object type)
offset = (
np.datetime64("1970-01-01") if x.dtype.kind == "M" else _cfoffset(x)
Copy link
Collaborator

@keewis keewis May 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get a test with a non-standard calendar / something very far from 1970 (like, 480-01-01)? I suspect that will cause the cftime support to fail because _cfoffset hard-codes the offset to 1970-01-01 (not sure, though, I might misunderstand the code).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added to copy the behavior of polyfit . But I agree that such a test can be useful. Additionally we should add an integration test for these two functions.

Copy link
Contributor

@dcherian dcherian May 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That offset choice is in v2022.03.0 too:

# Special case for non-standard calendar indexes
# Numerical datetime values are defined with respect to 1970-01-01T00:00:00 in units of nanoseconds
if isinstance(index, (CFTimeIndex, pd.DatetimeIndex)):
offset = type(index[0])(1970, 1, 1)
if isinstance(index, CFTimeIndex):
index = index.values

@dcherian dcherian enabled auto-merge (squash) May 31, 2022 17:01
@dcherian dcherian merged commit 4c92d52 into pydata:main May 31, 2022
@spencerkclark
Copy link
Member

Many thanks for restoring this functionality @headtr1ck!

@headtr1ck headtr1ck deleted the polyvalcf branch June 4, 2022 10:18
dcherian added a commit to dcherian/xarray that referenced this pull request Jun 12, 2022
* main: (95 commits)
  Use `zarr` to validate attrs when writing to zarr (pydata#6636)
  Add pre-commit hook to check CITATION.cff (pydata#6658)
  Fix kwargs used for extrapolation in docs (pydata#6639)
  Fix notebooks' HTML links (pydata#6655)
  Doc index update (pydata#6530)
  CFTime support for polyval (pydata#6624)
  Support dask arrays in datetime_to_numeric (pydata#6556)
  [pre-commit.ci] pre-commit autoupdate (pydata#6654)
  0-padded month. (pydata#6653)
  [test-upstream] import `cleanup` fixture from `distributed` (pydata#6650)
  Allow all interp methods in typing (pydata#6647)
  Typing support for custom backends (pydata#6651)
  Improved DataArray typing (pydata#6637)
  Adjust code comments & types from pydata#6638 (pydata#6642)
  Typing of `str` and `dt` accessors (pydata#6641)
  Feature/to dict encoding (pydata#6635)
  fix {full,zeros,ones}_like overloads (pydata#6630)
  Mypy badge (pydata#6626)
  concatenate docs style (pydata#6621)
  Typing for open_dataset/array/mfdataset and to_netcdf/zarr (pydata#6612)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cftime arrays not supported by polyval
4 participants