WIP: Optional indexes (no more default coordinates given by range(n)) #1017

shoyer · 2016-09-24T21:24:39Z

Fixes #283

Motivation

Currently, when a Dataset or DataArray is created without explicit coordinate labels for a dimension, we insert a coordinate with the values given by range(n).

This is problematic, for two main reasons:

There aren't always meaningful dimension labels. For example, an RGB image might be represented by a DataArray with three dimensions ('row', 'column', 'channel'). 'row' and 'column' each have fixed size, but only 'channel' has meaningful labels ['red', 'green', 'blue'].
Default labels lead to bad default alignment behavior. In the RGB image example, when I combine a 200x200 pixel image with a 300x300 pixel image, xarray would currently align rows and columns into a 200x200 image. This isn't desirable -- you'd rather get an error than use default labels for alignment.
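
To make the second point concrete, here is a minimal pure-Python sketch (not xarray code) of what an inner join on default range(n) labels does to the two image sizes above. The helper name `inner_join_labels` is hypothetical and only models the label intersection.

```python
def inner_join_labels(labels_a, labels_b):
    """Return the labels present in both inputs, in the order of labels_a."""
    common = set(labels_b)
    return [label for label in labels_a if label in common]


# Default labels for the row dimension of a 200-row and a 300-row image.
rows_small = range(200)
rows_large = range(300)

aligned_rows = inner_join_labels(rows_large, rows_small)
print(len(aligned_rows))  # 200: the larger image is silently cropped
```

Nothing in the result says that 100 rows were dropped, which is why an error is the friendlier outcome here.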

As is, xarray isn't a good fit for users who don't have meaningful coordinate labels over one or more dimensions. So making labels optional would also increase the audience for the project.

Design decisions

In general, I have followed the alignment rules I suggested for pandas in wesm/pandas2#17, but there are still some xarray specific design decisions to resolve:

~~How to handle stack(z=['x', 'y']) when one or more of the original dimensions do not have labels. If we don't insert dummy indexes for MultiIndex levels, then we can't unstack properly anymore.~~ Decision: insert dummy dimensions with stack() as necessary.
How to handle missing indexes in .sel, e.g., array.sel(x=0) when x is not in array.coords. In the current version of this PR, this errors, but my current inclination is to pass .sel indexers directly on to .isel, without remapping labels. This has the bonus of preserving the current behavior for indexing. Decision: if a dimension does not have an associated coordinate, indexing along that dimension with .sel or .loc is positional (like .isel).
Should we create dummy/virtual coordinates like range(n) on demand when indexing a dimension without labels? e.g., array.coords['x'] would return a DataArray with values range(n) (importantly, this would not change the original array). Decision: yes, this is the maximally backwards compatible thing to do.
~~What should the new behavior for reindex_like be, if the argument has dimensions of different sizes but no dimension labels? Should we raise an error, or simply ignore these dimensions?~~ Decision: users expect reindex_like to work like align. We will raise an error if dimension sizes do not match.
What does the transition to this behavior look like? Do we simply add it to v0.9.0 with a big warning about backwards compatibility, or do we need to go through some sort of deprecation cycle with the current behavior? Decision: no deprecation cycle; that would be too cumbersome.
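
The "virtual coordinate on demand" decision can be sketched with a toy container. This is a pure-Python illustration under my own assumptions, not the code in this PR: the class `LabelFreeArray` and its methods `coord` and `has_index` are hypothetical names.

```python
class LabelFreeArray:
    """Toy 1-D container: a dimension name, values, and optional labels."""

    def __init__(self, dim, values, labels=None):
        self.dim = dim
        self.values = list(values)
        # Only explicitly supplied labels are stored.
        self._coords = {} if labels is None else {dim: list(labels)}

    def coord(self, name):
        """Return stored labels, or a virtual range(n) for an unlabeled dim."""
        if name in self._coords:
            return self._coords[name]
        if name == self.dim:
            # Built on demand; deliberately not written back to self._coords.
            return list(range(len(self.values)))
        raise KeyError(name)

    def has_index(self, name):
        """True only when labels were explicitly supplied for this dimension."""
        return name in self._coords


arr = LabelFreeArray('x', [10, 20, 30])
print(arr.coord('x'))      # [0, 1, 2]
print(arr.has_index('x'))  # False
```

The point of the sketch is the last line: asking for the coordinate does not turn it into an index, so alignment semantics stay unchanged.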

Examples of new behavior

In [1]: import xarray as xr

In [2]: a = xr.DataArray([1, 2, 3], dims='x')

In [3]: b = xr.DataArray([[1, 2], [3, 4], [5, 6]], dims=['x', 'y'], coords={'y': ['a', 'b']})

In [4]: a
Out[4]:
<xarray.DataArray (x: 3)>
array([1, 2, 3])

In [5]: b
Out[5]:
<xarray.DataArray (x: 3, y: 2)>
array([[1, 2],
       [3, 4],
       [5, 6]])
Coordinates:
  * y        (y) <U1 'a' 'b'

In [6]: a + b
Out[6]:
<xarray.DataArray (x: 3, y: 2)>
array([[2, 3],
       [5, 6],
       [8, 9]])
Coordinates:
  * y        (y) <U1 'a' 'b'

In [7]: c = xr.DataArray([1, 2], dims='x')

In [8]: a + c
ValueError: dimension 'x' without indexes cannot be aligned because it has different sizes: {2, 3}

In [9]: d = xr.DataArray([1, 2, 3], coords={'x': [10, 20, 30]}, dims='x')

# indexes are copied from the argument with labels if they have the same size
In [10]: a + d
Out[10]:
<xarray.DataArray (x: 3)>
array([2, 4, 6])
Coordinates:
  * x        (x) int64 10 20 30
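
The alignment behavior in the session above can be summarized in a small pure-Python sketch. It only covers the cases shown (no index at all, or exactly one operand with an index); the function name `align_dimension`, the (size, labels) representation and the error wording are assumptions for illustration, not xarray internals.

```python
def align_dimension(dim, operands):
    """Decide the size and index of one dimension shared by several operands.

    Each operand is a (size, labels) pair; labels is None when the
    dimension has no index.
    """
    sizes = sorted({size for size, _ in operands})
    indexes = [labels for _, labels in operands if labels is not None]
    if len(indexes) > 1:
        raise NotImplementedError("joining several indexes is outside this sketch")
    if len(sizes) > 1:
        raise ValueError(
            f"dimension {dim!r} cannot be aligned because operands "
            f"have different sizes: {sizes}"
        )
    # Any index that exists is copied to the result; otherwise there is none.
    return sizes[0], (indexes[0] if indexes else None)


a = (3, None)          # like DataArray([1, 2, 3], dims='x')
c = (2, None)          # like DataArray([1, 2], dims='x')
d = (3, [10, 20, 30])  # like the labeled array d above

print(align_dimension('x', [a, d]))  # (3, [10, 20, 30])
try:
    align_dimension('x', [a, c])
except ValueError as err:
    print(err)
```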

New doc sections

fmaussion · 2016-09-25T12:31:05Z

I have no use case for this functionality, so I have no strong opinion about it. Two questions though:

why make it the new default instead of using a keyword (something like no_index=True)?
without coordinates the majority of xarray's functions are not working anymore. So what will xarray have that numpy doesn't have already? (the only thing I could think of is labeled dimensions, but there are probably more use cases?)

shoyer · 2016-09-25T17:51:45Z

why make it the new default instead of using a keyword (something like no_index=True)?

Basically, this is about the user experience and first impressions.

There are very few cases when somebody would prefer a default index to no index at all, so I see few cases for no_index=False in the long term. It seems cleaner to simply spell this as coords={'x': range(n)}.

From the experience of new users, it's really nice to be able to incrementally try out features from a new library. Seeing extra information appear in the data model that they didn't add makes people (rightfully) nervous, because they don't know how it will work yet.

without coordinates the majority of xarray's functions are not working anymore. So what will xarray have that numpy doesn't have already? (the only thing I could think of is labeled dimensions, but there are probably more use cases?)

Labeled dimensions without coordinate labels actually get you plenty. You get better broadcasting, aggregation (e.g., .sum('x')) and even indexing (e.g., .isel(x=0)).
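
As a rough illustration of aggregating by dimension name with no coordinate labels involved, here is a pure-Python sketch on nested lists. The helper `sum_over` is hypothetical and handles only the 2-D case.

```python
def sum_over(rows, dims, dim):
    """Sum a 2-D nested list over the named dimension.

    Returns the reduced values and the names of the remaining dimensions.
    """
    axis = dims.index(dim)  # raises ValueError for an unknown name
    remaining = tuple(d for d in dims if d != dim)
    if axis == 0:
        return [sum(column) for column in zip(*rows)], remaining
    return [sum(row) for row in rows], remaining


data = [[1, 2], [3, 4], [5, 6]]  # dims ('x', 'y'), no coordinate labels

print(sum_over(data, ('x', 'y'), 'x'))  # ([9, 12], ('y',))
print(sum_over(data, ('x', 'y'), 'y'))  # ([3, 7, 11], ('x',))
```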

But the big advantage is the ability to cleanly mix dimensions with and without indexes on the same objects, which covers several more use cases for labeled arrays. Examples off hand include images (see the example from my first post) and machine learning models (where columns usually have labels corresponding to features but rows often are simply unlabeled examples).

gdementen · 2016-09-26T07:31:25Z

FWIW, I solved this issue in a slightly different way in my own labelled array project (https://github.com/liam2/larray) (that I still hope to one day merge with xarray -- I will probably need to rewrite my project on top of xarray because the ship has sailed concerning the user-facing API): by default, you get "wildcard" axes, which only have a size and no labels (they do get range() labels on demand, so you can .sel on that dimension -- to speak in xarray vocabulary). Those wildcard labels are not as picky as normal labels: a wildcard axis compares equal/aligns to other axes as long as it has the same length. In practice, I guess it will be very similar to not having an index at all (and it is probably cleaner this way, but I didn't think of that at the time). All of this to say that yes, this PR is definitely a good idea and would make xarray useful in more situations, as I have hit a lot of cases where real range() labels like you have now made things a lot more painful than necessary.
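
A minimal sketch of the wildcard-axis idea as described in the paragraph above, in pure Python. This is not larray's actual API; the class `Axis` and its members are hypothetical names chosen for the illustration.

```python
class Axis:
    """Toy axis: a name, a length, and optional labels (None means wildcard)."""

    def __init__(self, name, length, labels=None):
        if labels is not None and len(labels) != length:
            raise ValueError("labels must match the axis length")
        self.name = name
        self.length = length
        self.labels = None if labels is None else list(labels)

    @property
    def is_wildcard(self):
        return self.labels is None

    def compatible_with(self, other):
        """Wildcards match on name and length; labeled axes also need equal labels."""
        if self.name != other.name or self.length != other.length:
            return False
        if self.is_wildcard or other.is_wildcard:
            return True
        return self.labels == other.labels

    def labels_on_demand(self):
        """Return real labels, or range(length) generated only when asked for."""
        return list(range(self.length)) if self.is_wildcard else self.labels


wild = Axis('row', 3)
named = Axis('row', 3, labels=['a', 'b', 'c'])
print(wild.compatible_with(named))            # True
print(wild.compatible_with(Axis('row', 4)))   # False
print(wild.labels_on_demand())                # [0, 1, 2]
```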

The only advantage I can think of now (except it was easier for me to implement it this way) of having a "wildcard axis" instead of no index/labels at all is that a subset could keep the information about which "tick" it comes from (again without blocking alignment). Not sure it's worth it though (I have actually not implemented it this way yet, but I was contemplating doing so).

shoyer · 2016-09-26T16:17:28Z

@gdementen Thanks for chiming in! Yes, in practice I think "no index" for xarray will work almost exactly the same as your "wildcard index".

The only advantage I can think of now (except it was easier for me to implement it this way) of having a "wildcard axis" instead of no index/labels at all is that a subset could keep the information about which "tick" it comes from (again without blocking alignment). Not sure it's worth it though (I have actually not implemented it this way yet, but I was contemplating doing so).

I'm not a fan of this one. It's a perpetual minor annoyance with pandas to subset a meaningless range(n) index only to get non-sensical tick labels. Also, keeping track of tick labels but not using them for alignment makes it less obvious that a dimension doesn't have an index.

gdementen · 2016-09-27T07:20:04Z

@shoyer Honestly, I have not thought that part (keep the tick labels for subsets) through (since I did not encounter an actual use case for that yet), I was more or less dumping my thought process in case you can make something useful out of it :). Nevertheless, those labels would not necessarily be non-sensical. In your image example above, it seems to me that knowing that the region of the image you are using comes (for example) from the center-right of the original image could be useful information. As for conveying that the dimension is special, I use a "*" next to the axis name to convey that it is a wildcard axis. It seems to go well with my current users.

rabernat · 2016-09-27T13:39:29Z

How does one select / slice a dataarray with no index? Does isel work but not sel?

shoyer · 2016-09-27T15:54:15Z

How does one select / slice a dataarray with no index? Does isel work but not sel?

isel will definitely work unchanged.

For .sel, there are two choices:

Raise a TypeError about how an index is required (this would be the strictest option)
Just pass on the key arguments for dimension without an index unchanged on to .isel (this is the approach I currently prefer, because it's a bit more convenient and also preserves backwards compatibility)
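
The second option could look roughly like this. It is a pure-Python sketch of the lookup rule only, with a hypothetical helper name `sel_to_position` and plain lists standing in for indexes.

```python
def sel_to_position(dim, key, coords):
    """Translate a .sel-style key for one dimension into an integer position.

    coords maps dimension names to label lists; a dimension missing from
    coords has no index.
    """
    if dim not in coords:
        # No index: hand the key to positional indexing unchanged.
        return key
    return coords[dim].index(key)  # raises ValueError if the label is absent


coords = {'y': ['a', 'b']}
print(sel_to_position('y', 'b', coords))  # 1
print(sel_to_position('x', 0, coords))    # 0
```

For an array that used to carry default range(n) labels, the second call returns the same position the old label lookup would have, which is where the backwards compatibility comes from.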

rabernat · 2016-09-27T19:51:32Z

Just pass on the key arguments for dimension without an index unchanged on to .isel

That sounds like the right way to go.

benbovy · 2016-10-14T00:58:46Z

Should we create dummy/virtual coordinates like range(n) on demand when indexing a dimension without labels?

I can think of some possible use cases where this behavior, if I understand it correctly, may not be desired. For example, if we want to compute partial derivatives by finite differences, xarray would not give the result we expect (the one numpy gives):

>>> z = np.random.rand(5, 5)
>>> dz_dx_numpy = z[1:, :] - z[:-1, :]    # forward difference
>>> dz_dx_numpy
array([[-0.16122906, -0.73841927,  0.11565084,  0.94529388,  0.04245993],
       [ 0.21066572,  0.11964058, -0.11203992, -0.52073269, -0.50827324],
       [-0.42100012,  0.39873985,  0.07957889, -0.02071456,  0.59944732],
       [-0.53819024, -0.29738881,  0.35238292,  0.01903809,  0.15671588]])

>>> da = xr.DataArray(z, dims=('x', 'y'), name='z')
>>> dz_dx_xarray = da[1:, :] - da[:-1, :]     # forward difference
>>> dz_dx_xarray
<xarray.DataArray 'z' (x: 3, y: 5)>
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])
Coordinates:
  * x        (x) int64 1 2 3

However, I guess that this specific kind of problem should rather be addressed using the upcoming logic for applying vectorized functions to xarray objects (#964).
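
Why the labeled version collapses to zeros can be reproduced without xarray. The sketch below models label alignment with plain dicts; `subtract_aligned` is a hypothetical helper, and the sample values are made up.

```python
def subtract_aligned(left, right):
    """Subtract two {label: value} mappings on their shared labels only."""
    return {label: left[label] - right[label] for label in left if label in right}


z = [0.5, 2.0, 3.5, 7.0, 8.0]
labelled = dict(enumerate(z))  # default labels 0..4

upper = {k: v for k, v in labelled.items() if k >= 1}  # like da[1:]
lower = {k: v for k, v in labelled.items() if k <= 3}  # like da[:-1]

# Shared labels are 1, 2, 3, and each label pairs an element with itself.
print(subtract_aligned(upper, lower))  # {1: 0.0, 2: 0.0, 3: 0.0}

# Without labels the operands pair up by position instead.
print([a - b for a, b in zip(z[1:], z[:-1])])  # [1.5, 1.5, 3.5, 1.0]
```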

e.g., array.coords['x'] would return a DataArray with values range(n) (importantly, this would not change the original array).

That sounds a bit weird to me (I'm not sure I understand, actually). What are the reasons/benefits of returning a DataArray instead of raising a KeyError?

shoyer · 2016-10-14T01:13:30Z

@benbovy This is actually a good use case for no dimension labels. E.g., from my working branch:

In [7]: da = xr.DataArray(z, dims=('x', 'y'), name='z')

In [8]: dz_dx_xarray = da[1:, :] - da[:-1, :]

In [9]: dz_dx
dz_dx_numpy   dz_dx_xarray

In [9]: dz_dx_xarray
Out[9]:
<xarray.DataArray 'z' (x: 4, y: 5)>
array([[ 0.15224392, -0.03428312, -0.10936435,  0.06149288, -0.69317859],
       [-0.61928605,  0.71636887, -0.05578677, -0.39489466,  0.63472963],
       [ 0.05180684, -0.72471438,  0.64259117,  0.24830877, -0.24006862],
       [ 0.44981358,  0.19054462, -0.69880118, -0.20120161,  0.08580928]])

In [10]: da.diff('x')
Out[10]:
<xarray.DataArray 'z' (x: 4, y: 5)>
array([[ 0.15224392, -0.03428312, -0.10936435,  0.06149288, -0.69317859],
       [-0.61928605,  0.71636887, -0.05578677, -0.39489466,  0.63472963],
       [ 0.05180684, -0.72471438,  0.64259117,  0.24830877, -0.24006862],
       [ 0.44981358,  0.19054462, -0.69880118, -0.20120161,  0.08580928]])

This does depend on the details of how .diff is implemented though. It does a loop over variables.items() instead of explicitly accessing variables corresponding to dimensions.

e.g., array.coords['x'] would return a DataArray with values range(n) (importantly, this would not change the original array).
That sounds a bit weird to me (I'm not sure I understand, actually). What are the reasons/benefits of returning a DataArray instead of raising a KeyError?

The (theoretical) benefit would be an easier transition, because previously ds[dim] always worked.

As I'm working through porting xarray's test suite, I'm realizing that this may not be the best approach. If a user is relying on ds[dim] or array.coords[dim] always working, they are probably not going to be happy with inadvertently adding new coordinates like range(n) (which implies new alignment semantics). It might be a better idea to force a hard transition.

Either way, the functionality in your reset_index() PR is going to be really essential, because these range(n) indexes are going to show up inadvertently sometimes, such as converting pandas objects or .unstack().

benbovy · 2016-10-14T12:55:21Z

Oops, I missed .diff() that already solves my given example. Anyway, I can vaguely see other cases (e.g., map algebra) where we want to compare or combine image or raster subsets (i.e., after some indexing) that have the same shape but that don't overlap or that come from sources with different sizes. That can be an issue if we create dummy coordinates when indexing, though reset_index() would help and would certainly be enough here.

shoyer · 2016-10-17T15:40:00Z

New design question:

What should the new behavior for reindex_like be, if the argument has dimensions of different sizes but no dimension labels? Should we raise an error, or simply ignore these dimensions? e.g.,

array1 = xr.DataArray([1, 2], dims='x')
array2 = xr.DataArray([3, 4, 5], dims='x')
array1.reindex_like(array2)

array2.indexes will be an empty dict, which, given that reindex_like is basically an alias for .reindex(**other.indexes), suggests that we shouldn't raise an error. But align does currently raise an error in such cases, unless the dimension is explicitly excluded with exclude.

gdementen · 2016-10-19T06:57:23Z

I have never encountered this case yet but ignoring that dimension seems like a bad idea to me. When I use a.reindex_like(b), I usually mean "make a compatible with b", so I assume the resulting index and shape is the same as (or at least compatible with) b. More importantly, I expect to be able to do binary operations between the re-indexed a and b. Ignoring the index like you propose would break that.

Given that, I would go with either an error, or treat the missing index like it was range() in this case, i.e., fill the extra values with NaN. I think I would have a slight preference for the latter in my own code (wildcard axes), but in xarray I am unsure. The error might be more coherent.

shoyer · 2016-10-19T14:45:25Z

I have never encountered this case yet but ignoring that dimension seems like a bad idea to me. When I use a.reindex_like(b), I usually mean "make a compatible with b", so I assume the resulting index and shape is the same as (or at least compatible with) b. More importantly, I expect to be able to do binary operations between the re-indexed a and b.

@gdementen thanks for the input. I am inclined to agree. Even within the xarray codebase, we basically use x = y.reindex_like(z) as cleaner spelling for x, _ = align(y, z, join='left').

Given that, I would go with either an error, or treat the missing index like it was range() in this case, i.e., fill the extra values with NaN.

I think we want the error here, given that this is one of the major motivations for allowing missing coordinate labels (not assuming invalid labels).

gdementen · 2016-10-20T07:28:24Z

I think we want the error here, given that this is one of the major motivations for allowing missing coordinate labels (not assuming invalid labels).

Yes. But, in that case you need a way to do the "fill with NaN" option without having to jump through too many hoops. How would you do that?

benbovy · 2016-10-20T09:15:34Z

+1 for the align error.

Yes. But, in that case you need a way to do the "fill with NaN" option without having to jump through too many hoops. How would you do that?

Using .set_index() (#1028) before .reindex_like() seems fine in that case. It's not much longer and it's more explicit.
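
The two-step idea (make the labels explicit first, then reindex) can be sketched in pure Python with the arrays from the reindex_like example. The `reindex` helper here is hypothetical and stands in for the real methods; it only models "look up by label, fill what is missing".

```python
import math


def reindex(values, old_labels, new_labels, fill=math.nan):
    """Look up each new label among the old ones; use fill when it is missing."""
    lookup = dict(zip(old_labels, values))
    return [lookup.get(label, fill) for label in new_labels]


array1 = [1, 2]
array2 = [3, 4, 5]

# Step 1: make range labels explicit for both arrays.
labels1 = range(len(array1))
labels2 = range(len(array2))

# Step 2: reindex array1 onto array2's labels; the extra slot is filled.
result = reindex(array1, labels1, labels2)
print(result)  # [1, 2, nan]
```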

shoyer · 2016-11-07T05:50:06Z

This is ready for review. (The failing test is unrelated -- a regression in dask.array.)

shoyer · 2016-11-12T06:04:14Z

I've gone through the docs and updated every mention I could find of default indexes. So, I think this really is nearly ready to merge now -- review would be highly appreciated. I've added renderings of a few choice sections of the docs to the top post.

The last remaining design decision is how to handle the transition. I would really prefer to avoid a deprecation cycle involving issuing warnings in new users' first encounter with xarray. This means that dependent libraries will need to be updated if this causes them to break (which I think is likely).

My best idea is to issue a "beta" release, write a mailing list post and give people lots of time to test and update their packages (a few weeks to a month).

crusaderky · 2016-11-15T21:39:49Z

I've gone through it and it works great.
A couple of very minor grievances:

1
Could you change repr to highlight the dims without coords? It's very easy not to notice them as they exclusively appear in the list on top!

e.g. change this

<xarray.DataArray (x: 5, y: 3, z: 7)>
dask.array<xarray-..., shape=(5, 3, 7), dtype=float64, chunksize=(5, 3, 7)>
Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5

to:

<xarray.DataArray (x: 5, y: 3, z: 7)>
dask.array<xarray-..., shape=(5, 3, 7), dtype=float64, chunksize=(5, 3, 7)>
Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5
  * z        (z) -

2
Could you change DataArray.drop() and all other similar functions to silently do nothing when you try to drop something that is in the dims but not in the coords?
This caused breakages in my code BTW, as it was assuming that a dim always had a matching coord. Specifically, the code that broke was:

v2 = v1.sel(somedim=somevalue).drop('somedim')

I had to change it to:

v2 = v1.sel(somedim=somevalue)
if 'somedim' in v2.coords:
    v2 = v2.drop('somedim')

which is annoyingly ugly.
Silently skipping the missing coord is nicer, and makes a lot of sense. (you will still crash if there's no such dim though).

benbovy · 2016-11-15T22:21:58Z

I also understand (1), but renaming Coordinates to Dimensions would be a bit odd too, especially in the case of multiple coordinates for a given dimension or multi-dimensional coordinates.

Another suggestion may be to "highlight" in the top list the dimensions which have an index, using the same symbol as in the coordinate list:

<xarray.DataArray (*x: 5, *y: 3, z: 7)>
dask.array<xarray-..., shape=(5, 3, 7), dtype=float64, chunksize=(5, 3, 7)>
Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5

shoyer · 2016-11-15T23:20:53Z

@crusaderky thanks for giving this a try!

RE: missing dimension in the repr

I like the idea of mirroring * from the coordinate list in the list of dimensions.

I'm less sure about adding markers for empty dimensions to coordinates. That makes for a much longer repr for some simple examples, e.g.,

<xarray.DataArray (x: 3, y: 4)>
array([[ 0.8167696 ,  0.15151986,  0.81139993,  0.33878428],
       [ 0.96861902,  0.34231084,  0.55831466,  0.92723981],
       [ 0.16737575,  0.32391949,  0.39093643,  0.64267858]])

vs

<xarray.DataArray (x: 3, y: 4)>
array([[ 0.8167696 ,  0.15151986,  0.81139993,  0.33878428],
       [ 0.96861902,  0.34231084,  0.55831466,  0.92723981],
       [ 0.16737575,  0.32391949,  0.39093643,  0.64267858]])
Coordinates:
  * x        (x) -
  * y        (y) -

RE: drop

I understand the annoyance of v1.sel(somedim=somevalue).drop('somedim') no longer working. Unfortunately, I don't think we can make drop handle former dimensions differently, because after v1.sel(somedim=somevalue) we no longer have any representation of the fact that the dimension was indexed in the xarray data model.

What we could do is add a separate discard method (sort of like set.discard), which works like drop but ignores missing labels instead of raising an error.
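
For comparison, the difference between the two behaviors on a plain dict of coordinates might look like the sketch below. Both functions are hypothetical stand-ins written for this illustration, not xarray methods.

```python
def drop(coords, name):
    """Return a copy of coords without name; raise if name is missing."""
    if name not in coords:
        raise ValueError(f"{name!r} is not a coordinate")
    return {k: v for k, v in coords.items() if k != name}


def discard(coords, name):
    """Like drop, but a missing name is ignored (compare set.discard)."""
    return {k: v for k, v in coords.items() if k != name}


coords = {'y': ['a', 'b']}

print(discard(coords, 'somedim'))  # {'y': ['a', 'b']}
try:
    drop(coords, 'somedim')
except ValueError as err:
    print(err)
```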

crusaderky · 2016-11-16T00:38:43Z

@shoyer:
maybe you could print the dummy coord (as in my example) if there's one or more real coord, and don't print the coords block at all if there isn't any (as in your example)?
The problem of readability only happens when there's some coords - so one needs to go look at the dims and notice that there's more than meets the eye. When there's no coords at all, the only place to look at is the dims, so I think it's fairly readable.

shoyer · 2016-11-19T02:50:12Z

I tried adding * to the repr for dimension names with coordinates. This doesn't work as well for Dataset, because the column with the dimensions no longer lines up:

        <xarray.Dataset>
        Dimensions:  (*dim2: 9, *dim3: 10, *time: 20)
        Coordinates:
          * time     (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
          * dim2     (dim2) float64 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
          * dim3     (dim3) %s 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j'
            numbers  (dim3) int64 0 1 2 0 0 1 1 2 2 3
        Data variables:
            var3     (dim3, dim1) float64 0.5565 -0.2121 0.4563 1.545 -0.2397 0.1433 ...

shoyer · 2016-11-19T03:07:40Z

drop in pandas has a keyword argument errors='raise' (or errors='ignore') that lets it switch to work like discard. A separate discard method feels cleaner to me, but I can understand an argument for consistency with pandas. Thoughts?

max-sixty · 2016-11-19T05:55:18Z

drop in pandas has a keyword argument errors='raise' (or errors='ignore') that lets it switch to work like discard. A separate discard method feels cleaner to me, but I can understand an argument for consistency with pandas. Thoughts?

In Python, sets have a remove and a discard - the first raises if not found, the second doesn't.
Any thoughts on being consistent with that? Probably not worth the change cost tbh.

Overall no strong view; I'm probably a weak vote against adding discard without changing drop more thoroughly.

shoyer · 2016-11-20T02:29:20Z

Another option: finally add a boolean drop keyword argument to isel/sel/squeeze (#242). Then the original example becomes v1.sel(somedim=somevalue, drop=True), which we can make work regardless of whether or not a coordinate value for somedim exists.
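
A rough pure-Python model of what drop=True would mean for a 1-D selection. The function `sel_scalar` and its return shape are assumptions for illustration; the real keyword lives on isel/sel/squeeze.

```python
def sel_scalar(values, coords, dim, label, drop=False):
    """Select one entry along a 1-D dimension.

    Returns the selected value and the leftover scalar coordinates.
    coords maps the dimension name to its labels and may be empty.
    """
    if dim in coords:
        position = coords[dim].index(label)
        scalar_coords = {dim: label}
    else:
        position = label  # no index: treat the key as a position
        scalar_coords = {}
    if drop:
        scalar_coords = {}
    return values[position], scalar_coords


with_coord = {'somedim': ['a', 'b', 'c']}

print(sel_scalar([7, 8, 9], with_coord, 'somedim', 'b'))             # (8, {'somedim': 'b'})
print(sel_scalar([7, 8, 9], with_coord, 'somedim', 'b', drop=True))  # (8, {})
print(sel_scalar([7, 8, 9], {}, 'somedim', 1, drop=True))            # (8, {})
```

The last two calls give the same result whether or not a coordinate exists, which is the property the chained .sel(...).drop(...) spelling lost.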

Fixes GH242. This is useful for getting rid of extraneous scalar variables that arise from indexing, and in particular will resolve an issue for optional indexes: pydata#1017 (comment)

shoyer · 2016-12-05T11:02:34Z

Another option: finally add a boolean drop keyword argument to isel/sel/squeeze (#242). Then the original example becomes v1.sel(somedim=somevalue, drop=True), which we can make work regardless of whether or not a coordinate value for somedim exists.

See #1153

shoyer · 2016-12-08T10:07:02Z

maybe you could print the dummy coord (as in my example) if there's one or more real coord, and don't print the coords block at all if there isn't any (as in your example)?
The problem of readability only happens when there's some coords - so one needs to go look at the dims and notice that there's more than meets the eye. When there's no coords at all, the only place to look at is the dims, so I think it's fairly readable.

Done -- missing coordinates are marked by - for the dtype/values, as long as there is at least one.

Are there any other concerns before I merge this?

benbovy · 2016-12-09T10:57:06Z

The only concern I have for the repr changes is using the symbol * for missing coordinates. To me * really means that it is an index. Maybe just remove it? e.g.,

Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5
    z        (z) -

or use another symbol like o that means 'no index'?

Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5
  o z        (z) -

and/or even remove the coordinate name (because there is no coordinate)

Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5
  o          (z) -

Coordinates:
  * y        (y) |S1 'a' 'b' 'c'
  * x        (x) int64 1 2 3 4 5
             (z) -

shoyer · 2016-12-09T13:36:12Z

or use another symbol like o that means 'no index'?

This seems like the best alternative to me. I don't like omitting the variable name because it seems that it might fall under the previous row, like a level in the MultiIndex repr.
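
A sketch of the formatting rule being discussed: '*' for dimensions with an index, 'o' for dimensions without one, and no block at all when nothing has coordinates. It is pure Python with a hypothetical function name and input format, not the code in formatting.py.

```python
def coordinate_lines(dims, coords):
    """Format the coordinate section of a repr.

    dims is an ordered list of dimension names; coords maps a name to a
    (dtype, values) pair for dimensions that have an index.
    """
    if not coords:
        return []  # nothing to show when no dimension has coordinates
    lines = ["Coordinates:"]
    for name, (dtype, values) in coords.items():
        shown = " ".join(repr(v) if isinstance(v, str) else str(v) for v in values)
        lines.append(f"  * {name:<8} ({name}) {dtype} {shown}")
    for name in dims:
        if name not in coords:
            lines.append(f"  o {name:<8} ({name}) -")
    return lines


coords = {'y': ('|S1', ['a', 'b', 'c']), 'x': ('int64', [1, 2, 3, 4, 5])}
print("\n".join(coordinate_lines(['x', 'y', 'z'], coords)))
```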

shoyer · 2016-12-15T02:08:42Z

or use another symbol like o that means 'no index'?

This seems like the best alternative to me. I don't like omitting the variable name because it seems that it might fall under the previous row, like a level in the MultiIndex repr.

Done. Any further concerns? I'd really like to merge this and then get the 0.9 release out shortly after.

crusaderky · 2016-12-15T02:30:24Z

Go on :)


shoyer · 2016-12-15T02:40:42Z

OK, in it goes!

* Add drop=True argument to isel, sel and squeeze. Fixes GH242. This is useful for getting rid of extraneous scalar variables that arise from indexing, and in particular will resolve an issue for optional indexes: #1017 (comment)
* More tests for Dataset.squeeze(drop=True)
* Add two more tests, for drop=True without coords

shoyer · 2017-01-15T03:09:05Z

@fmaussion has raised some concerns about the new repr in #1199

shoyer mentioned this pull request Sep 26, 2016

Optional indexes wesm/pandas2#17

Open

shoyer mentioned this pull request Oct 3, 2016

Add set_index, reset_index and reorder_levels methods #1028

Merged

shoyer mentioned this pull request Oct 14, 2016

to_dataset lossy #1047

Closed

This was referenced Oct 22, 2016

Inconsistency between the types of Dataset.dims and DataArray.dims #921

Closed

shallow copies become deep copies when pickling #1058

Closed

shoyer mentioned this pull request Nov 4, 2016

MultiIndex serialization to NetCDF #1077

Closed

shoyer force-pushed the optional-indexes branch from 6a0a1f4 to fefd741 Compare November 5, 2016 04:44

Indexes are now optional

eb70506

shoyer force-pushed the optional-indexes branch from a1a2dbf to eb70506 Compare November 12, 2016 05:50

add issue link on optional-indexes to what's new

8226d12

Fix test failure on windows

554f007

use shared dimension summary in formatting.py

cfca6e5

shoyer mentioned this pull request Dec 5, 2016

Add drop=True argument to isel, sel and squeeze #1153

Merged

shoyer added 2 commits December 8, 2016 10:59

missing coordinates appear in the repr

4e0319d

Merge branch 'master' into optional-indexes

905fb63

shoyer added 2 commits December 14, 2016 17:57

Mark missing coords with "o" in the repr

fd4ae6c

Merge branch 'master' into optional-indexes

a8dab93

shoyer merged commit 7084df0 into pydata:master Dec 15, 2016

jhamman mentioned this pull request Dec 29, 2016

All dimmensions become coordinates? #722

Closed

fmaussion mentioned this pull request Jan 11, 2017

Document the new __repr__ #1199

Closed

fmaussion mentioned this pull request Jan 19, 2017

to_netcdf() fails to append to an existing file #1215

Closed

fmaussion mentioned this pull request Oct 23, 2017

Low memory/out-of-core index? #1650

Open

spencerkclark mentioned this pull request Jan 2, 2020

Feature/coarsen on pressure ai2cm/fv3net#87

Merged

mroeschke mentioned this pull request Oct 3, 2022

API Should Index be made opt-in? pandas-dev/pandas#48880

Closed
