idxmax/idxmin not working with dask arrays of more than 2 dims. #4123

Closed
aulemahal opened this issue Jun 5, 2020 · 0 comments · Fixed by #4135

Comments

@aulemahal
Contributor

Unlike argmin/argmax, idxmax/idxmin fails on DataArrays with more than 2 dimensions when the data is stored in dask arrays.

MCVE Code Sample

import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature').resample(time='D').mean()
dsc = ds.chunk({'time':-1, 'lat': 5, 'lon': 5})
dsc.air.argmax('time').values  # Works (.values forces the full computation)
dsc.air.idxmin('time')  # Fails with NotImplementedError

Expected Output

Something like:

<xarray.DataArray 'time' (lat: 25, lon: 53)>
dask.array<where, shape=(25, 53), dtype=datetime64[ns], chunksize=(5, 5), chunktype=numpy.ndarray>
Coordinates:
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
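
For reference, the lookup that idxmin is expected to perform can be sketched in plain NumPy: take the argmin along the reduced axis, then use the resulting integer array to index the coordinate. The array names, shapes, and dates below are made up for illustration and are not taken from the tutorial dataset.

```python
import numpy as np

# Hypothetical stand-in for the air temperature data: shape (time, lat, lon).
rng = np.random.default_rng(0)
air = rng.normal(size=(4, 3, 2))
time = np.array(
    ["2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04"],
    dtype="datetime64[D]",
)

# Step 1: integer position of the minimum along the time axis, shape (lat, lon).
indx = air.argmin(axis=0)

# Step 2: look up the coordinate label at each position. NumPy accepts a
# 2-D integer indexer on a 1-D array and returns an array shaped like the indexer.
idxmin = time[indx]

print(idxmin.shape)
```

With NumPy-backed data the second step is ordinary fancy indexing; it is this step that the dask-backed case fails to perform.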

Problem Description

The idxmin call raises a NotImplementedError:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-11-0b9bf50bc3ab> in <module>
      3 dsc = ds.chunk({'time':-1, 'lat': 5, 'lon': 5})
      4 dsc.air.argmax('time').values
----> 5 dsc.air.idxmin('time')

~/Python/myxarray/xarray/core/dataarray.py in idxmin(self, dim, skipna, fill_value, keep_attrs)
   3626           * y        (y) int64 -1 0 1
   3627         """
-> 3628         return computation._calc_idxminmax(
   3629             array=self,
   3630             func=lambda x, *args, **kwargs: x.argmin(*args, **kwargs),

~/Python/myxarray/xarray/core/computation.py in _calc_idxminmax(array, func, dim, skipna, fill_value, keep_attrs)
   1564         chunks = dict(zip(array.dims, array.chunks))
   1565         dask_coord = dask.array.from_array(array[dim].data, chunks=chunks[dim])
-> 1566         res = indx.copy(data=dask_coord[(indx.data,)])
   1567         # we need to attach back the dim name
   1568         res.name = dim

~/.conda/envs/xarray-xclim-dev/lib/python3.8/site-packages/dask/array/core.py in __getitem__(self, index)
   1539 
   1540         if any(isinstance(i, Array) and i.dtype.kind in "iu" for i in index2):
-> 1541             self, index2 = slice_with_int_dask_array(self, index2)
   1542         if any(isinstance(i, Array) and i.dtype == bool for i in index2):
   1543             self, index2 = slice_with_bool_dask_array(self, index2)

~/.conda/envs/xarray-xclim-dev/lib/python3.8/site-packages/dask/array/slicing.py in slice_with_int_dask_array(x, index)
    934                 out_index.append(slice(None))
    935             else:
--> 936                 raise NotImplementedError(
    937                     "Slicing with dask.array of ints only permitted when "
    938                     "the indexer has zero or one dimensions"

NotImplementedError: Slicing with dask.array of ints only permitted when the indexer has zero or one dimensions
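
The limitation in the traceback can be reproduced without xarray. The sketch below indexes a 1-D dask array with a 2-D dask integer array, then shows one possible workaround: flatten the indexer to 1-D (which the error message says is permitted), index, and reshape back. The arrays are made up for illustration, and this is only an assumption about how the problem could be sidestepped, not a statement of what #4135 implements.

```python
import dask.array as da
import numpy as np

# Hypothetical 1-D coordinate and 2-D integer indexer, both dask-backed.
coord = da.from_array(np.arange(10.0), chunks=5)
indx = da.from_array(np.array([[0, 3], [5, 9]]), chunks=(1, 2))

# Direct indexing with a 2-D dask indexer; on the dask versions in this
# report it raises NotImplementedError.
try:
    direct = coord[indx].compute()
except NotImplementedError as err:
    direct = None
    print("direct indexing failed:", err)

# Possible workaround: a 1-D dask indexer is allowed, so flatten the
# indexer, index the coordinate, and restore the original shape.
flat = coord[indx.ravel()]
result = flat.reshape(indx.shape)
values = result.compute()
print(values)
```

The workaround stays lazy until compute() is called, so it keeps the chunked indexer out of memory until the result is needed.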

I saw #3922 and understood that PR was meant to make this work, so I'm a bit confused.

(I also tested with dask 2.17.2 and it still fails.)

Versions

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.6.15-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.2.dev9+g6378a711.d20200505
pandas: 1.0.3
numpy: 1.18.4
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.16.0
distributed: 2.17.0
matplotlib: 3.2.1
cartopy: None
seaborn: None
numbagg: None
pint: 0.12
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: 3.0.2
