Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure TreeNode doesn't copy in-place #9482

Merged
merged 13 commits into from
Sep 12, 2024

Conversation

TomNicholas
Copy link
Member

Solves #9478 by implementing steps 1 through 4 of #9478 (comment). Steps 5 and 6 were not necessary, so TreeNode and NamedNode are still separate.

Changes the in-place behaviour of TreeNode in a way that is now consistent with the changes made to DataTree in #9297. The tests for treenode are changed but as the tests for datatree are not changed, there are no changes to user API.

@TomNicholas TomNicholas added topic-internals topic-DataTree Related to the implementation of a DataTree class labels Sep 11, 2024
Comment on lines +452 to +453
# comes after setting node data as this will check for clashes between child names and existing variable names
super().__init__(name=name, children=children)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you try to swap these back around you get an error from trying to access ._node_coordinates before it has been set. I think this part of the code could be refactored a little (in a follow-up PR). Preferably we would set the data using a codepath that checks for name collisions between variables and children, instead of relying on the assignment of children to do that step.

@TomNicholas TomNicholas mentioned this pull request Sep 11, 2024
4 tasks
@TomNicholas
Copy link
Member Author

@shoyer do you mind if I merge this soon? I want to build atop it (in a few directions).

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great

Comment on lines 780 to 781
new_node = super()._copy_node()
new_node._name = self.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this redundant with the NamedNode base class?

@TomNicholas TomNicholas enabled auto-merge (squash) September 12, 2024 18:56
@TomNicholas TomNicholas merged commit 4323b19 into pydata:main Sep 12, 2024
27 checks passed
@TomNicholas TomNicholas deleted the treenode_copy branch September 12, 2024 21:46
dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main: (26 commits)
  Forbid modifying names of DataTree objects with parents (pydata#9494)
  DAS-2155 - Merge datatree documentation into main docs. (pydata#9033)
  Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378)
  Ensure TreeNode doesn't copy in-place (pydata#9482)
  `open_groups` for zarr backends (pydata#9469)
  Update pyproject.toml (pydata#9484)
  New whatsnew section (pydata#9483)
  Release notes for v2024.09.0 (pydata#9480)
  Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451)
  Rename DataTree's "ds" and "data" to "dataset" (pydata#9476)
  Update DataTree repr to indicate inheritance (pydata#9470)
  Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460)
  Repo checker (pydata#9450)
  Add days_in_year and decimal_year to dt accessor (pydata#9105)
  remove parent argument from DataTree.__init__ (pydata#9465)
  Fix inheritance in DataTree.copy() (pydata#9457)
  Implement `DataTree.__delitem__` (pydata#9453)
  Add ASV for datatree.from_dict (pydata#9459)
  Make the first argument in DataTree.from_dict positional only (pydata#9446)
  Fix typos across the code, doc and comments (pydata#9443)
  ...
dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main:
  Opt out of floor division for float dtype time encoding (pydata#9497)
  fixed formatting for whats-new (pydata#9493)
  Forbid modifying names of DataTree objects with parents (pydata#9494)
  DAS-2155 - Merge datatree documentation into main docs. (pydata#9033)
  Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378)
  Ensure TreeNode doesn't copy in-place (pydata#9482)
  `open_groups` for zarr backends (pydata#9469)
  Update pyproject.toml (pydata#9484)
  New whatsnew section (pydata#9483)
hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024
* test from pydata#9196 but on TreeNode

* move assignment and copying of children to TreeNode constructor

* move copy methods over to TreeNode

* change copying behaviour to be in line with pydata#9196

* explicitly test that ._copy_subtree works for TreeNode

* reimplement ._copy_subtree using recursion

* change treenode.py tests to match expected non-in-place behaviour

* fix but created in DataTree.__init__

* add type hints for Generic TreeNode back in

* update typing of ._copy_node

* remove redunant setting of _name
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-DataTree Related to the implementation of a DataTree class topic-internals
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TreeNode constructor should not modify children in-place
2 participants