ensure rename does not change index type #3532

mathause · 2019-11-14T14:12:26Z

Closes CFTimeIndex changes to normal Index after renaming #3522
Tests added
Passes black . && mypy . && flake8
Fully documented, including whats-new.rst for all changes and api.rst for new API

mathause · 2019-11-14T14:16:37Z

xarray/core/dataset.py

+                index = v
+                index.name = new_name


Alternatively there is a index = v._shallow_copy(name=new_name) method. It is "an internal non-public method" in pandas. Should I use it anyway?

While I think it's fairly stable, you're right—best practice is indeed to avoid.

I think the current code will mutate the existing index name, is that correct? We should at least test to ensure that doesn't happen. I think .rename rather than direct assignment will avoid this (and calls _shallow_copy from within pandas so we're safe)

FWIW I think we can replace https://github.com/pydata/xarray/pull/3532/files#diff-921db548d18a549f6381818ed08298c9R2662-R2667 also with .rename, but it's minor and doesn't need to be in this PR

Oh, there is even rename - I did not see that, will switch to it.

mathause · 2019-11-14T16:05:59Z

FWIW I think we can replace https://github.com/pydata/xarray/pull/3532/files#diff-921db548d18a549f6381818ed08298c9R2662-R2667 also with .rename, but it's minor and doesn't need to be in this PR

I have to look how this is done in the MultiIndex.

mathause · 2019-11-14T16:11:28Z

I think the current code will mutate the existing index name, is that correct?

Currently the index is recreated with the same name (even if it does not change it's name). We could change the following lines

xarray/xarray/core/dataset.py

Lines 2649 to 2651 in 40588dc

    
           new_name = name_dict.get(k, k) 
        
           if new_name not in dims_set: 
        
               continue

 new_name = name_dict.get(k, None) 
 if new_name is None or new_name not in dims_set: 
     continue

to skip doing this (not sure if there is an edge cas I am missing here).

mathause · 2019-11-14T16:17:13Z

No, this probably does not work if only one of the names of a MultiIndex changes...

max-sixty · 2019-11-14T17:22:39Z

Currently the index is recreated with the same name (even if it does not change it's name). We could change the following lines

I was unclear when I wrote "current code"; I was referring to proposed code in this PR. I don't think the code on master mutates the existing index, because it's making a copy. But I think this does:

                index = v
                index.name = new_name

mathause · 2019-11-15T14:14:33Z

Thanks, yes my approach was indeed wrong. I updated the code and the tests. I was confused for a while because orig.rename_dims(time="time_new") drops indexes while orig.rename_dims() does not.

max-sixty

Super! Any other comments before merge?

mathause · 2019-11-15T14:49:26Z

Fine for me. I think MultiIndex could be treated the same. However, it does conserve CFTimeIndex already, and the test would require a bit of thinking, so let's leave that for now.

max-sixty · 2019-11-15T19:49:48Z

Perfect, thanks again @mathause

* upstream/master: Added fill_value for unstack (pydata#3541) Add DatasetGroupBy.quantile (pydata#3527) ensure rename does not change index type (pydata#3532) Leave empty slot when not using accessors interpolate_na: Add max_gap support. (pydata#3302) units & deprecation merge (pydata#3530) Fix set_index when an existing dimension becomes a level (pydata#3520) add Variable._replace (pydata#3528) Tests for module-level functions with units (pydata#3493) Harmonize `FillValue` and `missing_value` during encoding and decoding steps (pydata#3502) FUNDING.yml (pydata#3523) Allow appending datetime & boolean variables to zarr stores (pydata#3504) warn if dim is passed to rolling operations. (pydata#3513) Deprecate allow_lazy (pydata#3435) Recursive tokenization (pydata#3515)

* upstream/master: (22 commits) Added fill_value for unstack (pydata#3541) Add DatasetGroupBy.quantile (pydata#3527) ensure rename does not change index type (pydata#3532) Leave empty slot when not using accessors interpolate_na: Add max_gap support. (pydata#3302) units & deprecation merge (pydata#3530) Fix set_index when an existing dimension becomes a level (pydata#3520) add Variable._replace (pydata#3528) Tests for module-level functions with units (pydata#3493) Harmonize `FillValue` and `missing_value` during encoding and decoding steps (pydata#3502) FUNDING.yml (pydata#3523) Allow appending datetime & boolean variables to zarr stores (pydata#3504) warn if dim is passed to rolling operations. (pydata#3513) Deprecate allow_lazy (pydata#3435) Recursive tokenization (pydata#3515) format indexing.rst code with black (pydata#3511) add missing pint integration tests (pydata#3508) DOC: update bottleneck repo url (pydata#3507) add drop_sel, drop_vars, map to api.rst (pydata#3506) remove syntax warning (pydata#3505) ...

* master: (24 commits) Tweaks to release instructions (pydata#3555) Clarify conda environments for new contributors (pydata#3551) Revert to dev version 0.14.1 whatsnew (pydata#3547) sparse option to reindex and unstack (pydata#3542) Silence sphinx warnings (pydata#3516) Numpy 1.18 support (pydata#3537) tweak whats-new. (pydata#3540) small simplification of rename from pydata#3532 (pydata#3539) Added fill_value for unstack (pydata#3541) Add DatasetGroupBy.quantile (pydata#3527) ensure rename does not change index type (pydata#3532) Leave empty slot when not using accessors interpolate_na: Add max_gap support. (pydata#3302) units & deprecation merge (pydata#3530) Fix set_index when an existing dimension becomes a level (pydata#3520) add Variable._replace (pydata#3528) Tests for module-level functions with units (pydata#3493) Harmonize `FillValue` and `missing_value` during encoding and decoding steps (pydata#3502) FUNDING.yml (pydata#3523) ...

ensure rename does not change index type

3e3f5ec

mathause commented Nov 14, 2019

View reviewed changes

test requires cftime

5064cd2

mathause added 2 commits November 15, 2019 15:06

test orig.indexes[time].name is conserved

3b85495

use index.rename()

ec76154

max-sixty approved these changes Nov 15, 2019

View reviewed changes

max-sixty merged commit 68b004f into pydata:master Nov 15, 2019

max-sixty added a commit to max-sixty/xarray that referenced this pull request Nov 16, 2019

small simplification of rename from pydata#3532

32d1ad9

max-sixty mentioned this pull request Nov 16, 2019

small simplification of rename from #3532 #3539

Merged

1 task

dcherian pushed a commit that referenced this pull request Nov 16, 2019

small simplification of rename from #3532 (#3539)

9755e3f

mathause deleted the fix/rename_changes_index branch August 19, 2020 13:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ensure rename does not change index type #3532

ensure rename does not change index type #3532

mathause commented Nov 14, 2019

mathause Nov 14, 2019 •

edited

Loading

max-sixty Nov 14, 2019

mathause Nov 14, 2019

mathause commented Nov 14, 2019

mathause commented Nov 14, 2019 •

edited

Loading

mathause commented Nov 14, 2019

max-sixty commented Nov 14, 2019

mathause commented Nov 15, 2019

max-sixty left a comment

mathause commented Nov 15, 2019

max-sixty commented Nov 15, 2019

ensure rename does not change index type #3532

ensure rename does not change index type #3532

Conversation

mathause commented Nov 14, 2019

mathause Nov 14, 2019 • edited Loading

Choose a reason for hiding this comment

max-sixty Nov 14, 2019

Choose a reason for hiding this comment

mathause Nov 14, 2019

Choose a reason for hiding this comment

mathause commented Nov 14, 2019

mathause commented Nov 14, 2019 • edited Loading

mathause commented Nov 14, 2019

max-sixty commented Nov 14, 2019

mathause commented Nov 15, 2019

max-sixty left a comment

Choose a reason for hiding this comment

mathause commented Nov 15, 2019

max-sixty commented Nov 15, 2019

mathause Nov 14, 2019 •

edited

Loading

mathause commented Nov 14, 2019 •

edited

Loading