
Merge branch 'master' into yohai-ds_scatter
* master:
  remove xfail from test_cross_engine_read_write_netcdf4 (pydata#2741)
  Reenable cross engine read write netCDF test (pydata#2739)
  remove bottleneck dev build from travis, this test env was failing to build (pydata#2736)
  CFTimeIndex Resampling (pydata#2593)
  add tests for handling of empty pandas objects in constructors (pydata#2735)
  dropna() for a Series indexed by a CFTimeIndex (pydata#2734)
  deprecate compat & encoding (pydata#2703)
  Implement integrate (pydata#2653)
  ENH: resample methods with tolerance (pydata#2716)
  improve error message for invalid encoding (pydata#2730)
  silence a couple of warnings (pydata#2727)
dcherian committed Feb 4, 2019
2 parents 7392c81 + 27cf53f commit 4e41fc3
Showing 24 changed files with 899 additions and 131 deletions.
3 changes: 2 additions & 1 deletion .github/stale.yml
@@ -28,7 +28,8 @@ staleLabel: stale
# Comment to post when marking as stale. Set to `false` to disable
markComment: |
In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically
# Comment to post when removing the stale label.
# unmarkComment: >
2 changes: 0 additions & 2 deletions .travis.yml
@@ -19,7 +19,6 @@ matrix:
- EXTRA_FLAGS="--run-flaky --run-network-tests"
- env: CONDA_ENV=py36-dask-dev
- env: CONDA_ENV=py36-pandas-dev
- env: CONDA_ENV=py36-bottleneck-dev
- env: CONDA_ENV=py36-rasterio
- env: CONDA_ENV=py36-zarr-dev
- env: CONDA_ENV=docs
@@ -31,7 +30,6 @@ matrix:
- CONDA_ENV=py36
- EXTRA_FLAGS="--run-flaky --run-network-tests"
- env: CONDA_ENV=py36-pandas-dev
- env: CONDA_ENV=py36-bottleneck-dev
- env: CONDA_ENV=py36-zarr-dev

before_install:
24 changes: 0 additions & 24 deletions ci/requirements-py36-bottleneck-dev.yml

This file was deleted.

2 changes: 2 additions & 0 deletions doc/api.rst
@@ -152,6 +152,7 @@ Computation
Dataset.diff
Dataset.quantile
Dataset.differentiate
Dataset.integrate

**Aggregation**:
:py:attr:`~Dataset.all`
@@ -321,6 +322,7 @@ Computation
DataArray.dot
DataArray.quantile
DataArray.differentiate
DataArray.integrate

**Aggregation**:
:py:attr:`~DataArray.all`
14 changes: 12 additions & 2 deletions doc/computation.rst
@@ -240,6 +240,8 @@ function or method name to ``coord_func`` option,
da.coarsen(time=7, x=2, coord_func={'time': 'min'}).mean()
.. _compute.using_coordinates:

Computation using Coordinates
=============================

@@ -261,9 +263,17 @@ This method can be used also for multidimensional arrays,
coords={'x': [0.1, 0.11, 0.2, 0.3]})
a.differentiate('x')
:py:meth:`~xarray.DataArray.integrate` computes integration using the
trapezoidal rule along the given coordinate,

.. ipython:: python
a.integrate('x')
.. note::
This method is limited to simple cartesian geometry. Differentiation along
multidimensional coordinate is not supported.
These methods are limited to simple cartesian geometry. Differentiation
and integration along a multidimensional coordinate are not supported.
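The trapezoidal rule that ``integrate`` applies along a coordinate reduces to a short NumPy sum over the coordinate values; a minimal sketch with hypothetical data (the array contents are made up for illustration):

```python
import numpy as np

# Hypothetical data: y = 2 * x sampled on a non-uniform coordinate,
# mirroring what a.integrate('x') computes along the 'x' coordinate.
x = np.array([0.0, 0.25, 0.5, 1.0])
y = 2 * x

# Trapezoidal rule: each interval's width times the mean of its endpoints.
integral = np.sum(np.diff(x) * (y[:-1] + y[1:]) / 2.0)
```

The trapezoidal rule is exact for linear data, so here ``integral`` equals the analytic value of the integral of ``2*x`` from 0 to 1, i.e. 1.0.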


.. _compute.broadcasting:

35 changes: 23 additions & 12 deletions doc/time-series.rst
@@ -196,11 +196,20 @@ resampling group:
ds.resample(time='6H').reduce(np.mean)
For upsampling, xarray provides four methods: ``asfreq``, ``ffill``, ``bfill``,
and ``interpolate``. ``interpolate`` extends ``scipy.interpolate.interp1d`` and
supports all of its schemes. All of these resampling operations work on both
For upsampling, xarray provides six methods: ``asfreq``, ``ffill``, ``bfill``, ``pad``,
``nearest`` and ``interpolate``. ``interpolate`` extends ``scipy.interpolate.interp1d``
and supports all of its schemes. All of these resampling operations work on both
Dataset and DataArray objects with an arbitrary number of dimensions.

In order to limit the scope of the methods ``ffill``, ``bfill``, ``pad`` and
``nearest``, the ``tolerance`` argument can be set in coordinate units.
Data whose indices lie outside the given ``tolerance`` are set to ``NaN``.

.. ipython:: python
ds.resample(time='1H').nearest(tolerance='1H')
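Under the hood this behaves like a nearest-neighbour reindex with a distance cutoff. A standalone sketch of that logic on plain numeric "times" (the function name and data are hypothetical, not xarray API):

```python
import numpy as np

def nearest_with_tolerance(src_t, src_v, tgt_t, tol):
    """Take the nearest source value per target; NaN if farther than tol."""
    out = np.full(len(tgt_t), np.nan)
    for i, t in enumerate(tgt_t):
        j = int(np.argmin(np.abs(src_t - t)))   # index of nearest source
        if abs(src_t[j] - t) <= tol:            # within tolerance: fill
            out[i] = src_v[j]
    return out

# Sources at t=0 and t=10; upsample to every 5 units with tolerance 1:
# the middle target has no source within distance 1, so it stays NaN.
result = nearest_with_tolerance(np.array([0.0, 10.0]),
                                np.array([1.0, 2.0]),
                                np.array([0.0, 5.0, 10.0]), 1.0)
```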
For more examples of using grouped operations on a time dimension, see
:ref:`toy weather data`.

@@ -300,31 +309,34 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:
da.differentiate('time')
- And serialization:
- Serialization:

.. ipython:: python
da.to_netcdf('example-no-leap.nc')
xr.open_dataset('example-no-leap.nc')
- Resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. ipython:: python
da.resample(time='81T', closed='right', label='right', base=3).mean()
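The ``closed``/``label`` semantics used here already exist for a plain :py:class:`pandas.DatetimeIndex`; a small pandas sketch (with made-up half-hourly data) of the behaviour the CFTimeIndex path now mirrors:

```python
import numpy as np
import pandas as pd

# Hypothetical half-hourly series; this commit makes the equivalent
# .resample(...) call work when the index is a CFTimeIndex instead.
idx = pd.date_range('2000-01-01', periods=6, freq='30min')
s = pd.Series(np.arange(6.0), index=idx)

# closed='right' puts each bin's right edge inside the bin, and
# label='right' labels each bin by that right edge.
hourly = s.resample('60min', closed='right', label='right').mean()
```

With these options the six samples fall into four right-closed hourly bins with means 0.0, 1.5, 3.5 and 5.0.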
.. note::

While much of the time series functionality that is possible for standard
dates has been implemented for dates from non-standard calendars, there are
still some remaining important features that have yet to be implemented,
for example:

- Resampling along the time dimension for data indexed by a
:py:class:`~xarray.CFTimeIndex` (:issue:`2191`, :issue:`2458`)
- Built-in plotting of data with :py:class:`cftime.datetime` coordinate axes
(:issue:`2164`).

For some use-cases it may still be useful to convert from
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
despite the difference in calendar types (e.g. to allow the use of some
forms of resample with non-standard calendars). The recommended way of
doing this is to use the built-in
:py:meth:`~xarray.CFTimeIndex.to_datetimeindex` method:
despite the difference in calendar types. The recommended way of doing this
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
method:

.. ipython:: python
:okwarning:
@@ -334,8 +346,7 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:
da
datetimeindex = da.indexes['time'].to_datetimeindex()
da['time'] = datetimeindex
da.resample(time='Y').mean('time')
However in this case one should use caution to only perform operations which
do not depend on differences between dates (e.g. differentiation,
interpolation, or upsampling with resample), as these could introduce subtle
20 changes: 20 additions & 0 deletions doc/whats-new.rst
@@ -24,6 +24,10 @@ Breaking changes
- Remove support for Python 2. This is the first version of xarray that is
Python 3 only. (:issue:`1876`).
By `Joe Hamman <https://github.com/jhamman>`_.
- The `compat` argument to `Dataset` and the `encoding` argument to
`DataArray` are deprecated and will be removed in a future release.
(:issue:`1188`)
By `Maximilian Roos <https://github.com/max-sixty>`_.

Enhancements
~~~~~~~~~~~~
@@ -45,6 +49,21 @@ Enhancements
By `Benoit Bovy <https://github.com/benbovy>`_.
- Dataset plotting API! Currently only :py:meth:`Dataset.plot.scatter` is implemented.
By `Yohai Bar Sinai <https://github.com/yohai>`_ and `Deepak Cherian <https://github.com/dcherian>`_.
- Resampling of standard and non-standard calendars indexed by
:py:class:`~xarray.CFTimeIndex` is now possible. (:issue:`2191`).
By `Jwen Fai Low <https://github.com/jwenfai>`_ and
`Spencer Clark <https://github.com/spencerkclark>`_.
- Add ``tolerance`` option to ``resample()`` methods ``bfill``, ``pad``,
``nearest``. (:issue:`2695`)
By `Hauke Schulz <https://github.com/observingClouds>`_.
- :py:meth:`~xarray.DataArray.integrate` and
:py:meth:`~xarray.Dataset.integrate` are newly added.
See :ref:`compute.using_coordinates` for details.
(:issue:`1332`)
By `Keisuke Fujii <https://github.com/fujiisoup>`_.
- :py:meth:`pandas.Series.dropna` is now supported for a
:py:class:`pandas.Series` indexed by a :py:class:`~xarray.CFTimeIndex`
(:issue:`2688`). By `Spencer Clark <https://github.com/spencerkclark>`_.

Bug fixes
~~~~~~~~~
@@ -114,6 +133,7 @@ Breaking changes
(:issue:`2565`). The previous behavior was to decode them only if they
had specific time attributes, now these attributes are copied
automatically from the corresponding time coordinate. This might
break downstream code that was relying on these variables to be
brake downstream code that was relying on these variables to be
not decoded.
By `Fabien Maussion <https://github.com/fmaussion>`_.
5 changes: 3 additions & 2 deletions xarray/backends/netCDF4_.py
@@ -217,8 +217,9 @@ def _extract_nc4_variable_encoding(variable, raise_on_invalid=False,
if raise_on_invalid:
invalid = [k for k in encoding if k not in valid_encodings]
if invalid:
raise ValueError('unexpected encoding parameters for %r backend: '
' %r' % (backend, invalid))
raise ValueError(
'unexpected encoding parameters for %r backend: %r. Valid '
'encodings are: %r' % (backend, invalid, valid_encodings))
else:
for k in list(encoding):
if k not in valid_encodings:
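The improved message follows a common validation pattern: name both the rejected keys and the accepted set, so the user can self-correct. A standalone sketch of the pattern (the function name and defaults are hypothetical, not the real xarray helper):

```python
def check_encoding(encoding, valid_encodings, backend='netCDF4'):
    """Reject unknown encoding keys, listing the valid ones in the error."""
    invalid = [k for k in encoding if k not in valid_encodings]
    if invalid:
        # Include the valid set so the error is actionable.
        raise ValueError(
            'unexpected encoding parameters for %r backend: %r. Valid '
            'encodings are: %r' % (backend, invalid, sorted(valid_encodings)))
    return dict(encoding)
```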
25 changes: 21 additions & 4 deletions xarray/coding/cftime_offsets.py
@@ -358,29 +358,41 @@ def rollback(self, date):
class Day(BaseCFTimeOffset):
_freq = 'D'

def as_timedelta(self):
return timedelta(days=self.n)

def __apply__(self, other):
return other + timedelta(days=self.n)
return other + self.as_timedelta()


class Hour(BaseCFTimeOffset):
_freq = 'H'

def as_timedelta(self):
return timedelta(hours=self.n)

def __apply__(self, other):
return other + timedelta(hours=self.n)
return other + self.as_timedelta()


class Minute(BaseCFTimeOffset):
_freq = 'T'

def as_timedelta(self):
return timedelta(minutes=self.n)

def __apply__(self, other):
return other + timedelta(minutes=self.n)
return other + self.as_timedelta()


class Second(BaseCFTimeOffset):
_freq = 'S'

def as_timedelta(self):
return timedelta(seconds=self.n)

def __apply__(self, other):
return other + timedelta(seconds=self.n)
return other + self.as_timedelta()


_FREQUENCIES = {
@@ -427,6 +439,11 @@ def __apply__(self, other):
_FREQUENCY_CONDITION)


# pandas defines these offsets as "Tick" objects, which for instance have
# distinct behavior from monthly or longer frequencies in resample.
CFTIME_TICKS = (Day, Hour, Minute, Second)
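The refactor gives every fixed-width ("tick") offset a common ``as_timedelta`` hook, so callers such as the resampler can treat them uniformly. A self-contained sketch of the pattern (simplified stand-in classes, not the real xarray ones):

```python
from datetime import datetime, timedelta

class BaseOffset:
    def __init__(self, n=1):
        self.n = n

    def __apply__(self, other):
        # Shared implementation: every tick offset just adds its timedelta.
        return other + self.as_timedelta()

class Day(BaseOffset):
    def as_timedelta(self):
        return timedelta(days=self.n)

class Hour(BaseOffset):
    def as_timedelta(self):
        return timedelta(hours=self.n)

# Fixed-width offsets, analogous to pandas "Tick" objects.
TICKS = (Day, Hour)

later = Hour(3).__apply__(datetime(2000, 1, 1))
```

Keeping ``__apply__`` in the base class means adding a new tick offset only requires defining ``as_timedelta``.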


def to_offset(freq):
"""Convert a frequency string to the appropriate subclass of
BaseCFTimeOffset."""
8 changes: 5 additions & 3 deletions xarray/coding/cftimeindex.py
@@ -335,11 +335,13 @@ def _maybe_cast_slice_bound(self, label, side, kind):
# e.g. series[1:5].
def get_value(self, series, key):
"""Adapted from pandas.tseries.index.DatetimeIndex.get_value"""
if not isinstance(key, slice):
return series.iloc[self.get_loc(key)]
else:
if np.asarray(key).dtype == np.dtype(bool):
return series.iloc[key]
elif isinstance(key, slice):
return series.iloc[self.slice_indexer(
key.start, key.stop, key.step)]
else:
return series.iloc[self.get_loc(key)]
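The ordering of the new branches matters: a boolean mask must be recognised before the slice and label paths, otherwise it would be mishandled by the label-based ``get_loc`` fallback. A standalone sketch of the dispatch (a plain positional lookup stands in for the real label-based ``get_loc``):

```python
import numpy as np

def get_value(values, key):
    if np.asarray(key).dtype == np.dtype(bool):
        # Boolean masks select positionally, like series.iloc[mask];
        # this check must come first.
        return values[np.asarray(key)]
    elif isinstance(key, slice):
        # Slices go through positional slicing (slice_indexer in xarray).
        return values[key]
    else:
        # Everything else falls back to a scalar lookup (get_loc in the
        # real index; a plain position in this sketch).
        return values[key]

vals = np.array([10.0, 20.0, 30.0])
```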

def __contains__(self, key):
"""Adapted from
2 changes: 1 addition & 1 deletion xarray/core/alignment.py
@@ -495,7 +495,7 @@ def _broadcast_array(array):
coords = OrderedDict(array.coords)
coords.update(common_coords)
return DataArray(data, coords, data.dims, name=array.name,
attrs=array.attrs, encoding=array.encoding)
attrs=array.attrs)

def _broadcast_dataset(ds):
data_vars = OrderedDict(
30 changes: 15 additions & 15 deletions xarray/core/common.py
@@ -713,6 +713,13 @@ def resample(self, indexer=None, skipna=None, closed=None, label=None,
array([ 0. , 0.032258, 0.064516, ..., 10.935484, 10.967742, 11. ])
Coordinates:
* time (time) datetime64[ns] 1999-12-15 1999-12-16 1999-12-17 ...
Limit scope of upsampling method
>>> da.resample(time='1D').nearest(tolerance='1D')
<xarray.DataArray (time: 337)>
array([ 0., 0., nan, ..., nan, 11., 11.])
Coordinates:
* time (time) datetime64[ns] 1999-12-15 1999-12-16 ... 2000-11-15
References
----------
@@ -749,23 +756,16 @@ def resample(self, indexer=None, skipna=None, closed=None, label=None,
dim_coord = self[dim]

if isinstance(self.indexes[dim_name], CFTimeIndex):
raise NotImplementedError(
'Resample is currently not supported along a dimension '
'indexed by a CFTimeIndex. For certain kinds of downsampling '
'it may be possible to work around this by converting your '
'time index to a DatetimeIndex using '
'CFTimeIndex.to_datetimeindex. Use caution when doing this '
'however, because switching to a DatetimeIndex from a '
'CFTimeIndex with a non-standard calendar entails a change '
'in the calendar type, which could lead to subtle and silent '
'errors.'
)

from .resample_cftime import CFTimeGrouper
grouper = CFTimeGrouper(freq, closed, label, base, loffset)
else:
# TODO: to_offset() call required for pandas==0.19.2
grouper = pd.Grouper(freq=freq, closed=closed, label=label,
base=base,
loffset=pd.tseries.frequencies.to_offset(
loffset))
group = DataArray(dim_coord, coords=dim_coord.coords,
dims=dim_coord.dims, name=RESAMPLE_DIM)
# TODO: to_offset() call required for pandas==0.19.2
grouper = pd.Grouper(freq=freq, closed=closed, label=label, base=base,
loffset=pd.tseries.frequencies.to_offset(loffset))
resampler = self._resample_cls(self, group=group, dim=dim_name,
grouper=grouper,
resample_dim=RESAMPLE_DIM)
