Forbid modifying names of DataTree objects with parents #9494

shoyer · 2024-09-13T22:20:52Z

This ensures that the tree cannot be in an inconsistent state.

Closes Consistency of names of stored children / dataarrays with keys in wrapping object #9447
Tests added

This ensures that the tree cannot be in an inconsistent state. Fixes pydata#9447

TomNicholas

This PR has a lot in common with #9492, but I say just merge this one first and I will update the other one to match.

TomNicholas · 2024-09-14T00:43:56Z

xarray/core/treenode.py

+        _validate_name(name)  # is this check redundant?
+        self._name = name


Yes it's redundant, at least in the sense that the original code uses the .name property setter, which now calls _validate_name. Other than that it is necessary though, because you can get down to this method by following

xarray/xarray/core/treenode.py

Line 180 in 18e5c87

child._set_parent(new_parent=self, child_name=name)

which is in the .children setter.

Actually you could instead move this check to be inside the ._check_children staticmethod on TreeNode, which will get called first. I think that would be a clearer place to put the check, and you should then be able to remove the NamedNode._post_attach method entirely!

As an aside I think this brings us closer to seeing that NamedNode is almost entirely unnecessary, if it weren't for the possibility of a root node having a name. Using TreeNode alone nodes would still have names, but those names are stored only as child keys on the parent.

So it makes sense that here we are able to refactor such that every name validation is actually done on TreeNode, with the exceptions of inside NamedNode.__init__ and the .name setter, which are the only places we could possibly check the name of the root node (as it has no parent to perform the validation via).

hmm was I wrong about this?

it does seems so... honestly I find the various hooks on the base TreeNode pretty hard to reason about

Damn. I guess just put the validation check back in, merge, then we can refactor with that regression test to adhere to?

I find the various hooks on the base TreeNode pretty hard to reason about

I normally find them okay - I want to look at this issue more closely. Of all the hooks defined on TreeNode we only actually use like 2 of them, the rest are holdovers from the original design copying the anytree library.

xarray/core/treenode.py

Co-authored-by: Tom Nicholas <tom@cworthy.org>

* main: (26 commits) Forbid modifying names of DataTree objects with parents (pydata#9494) DAS-2155 - Merge datatree documentation into main docs. (pydata#9033) Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378) Ensure TreeNode doesn't copy in-place (pydata#9482) `open_groups` for zarr backends (pydata#9469) Update pyproject.toml (pydata#9484) New whatsnew section (pydata#9483) Release notes for v2024.09.0 (pydata#9480) Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451) Rename DataTree's "ds" and "data" to "dataset" (pydata#9476) Update DataTree repr to indicate inheritance (pydata#9470) Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460) Repo checker (pydata#9450) Add days_in_year and decimal_year to dt accessor (pydata#9105) remove parent argument from DataTree.__init__ (pydata#9465) Fix inheritance in DataTree.copy() (pydata#9457) Implement `DataTree.__delitem__` (pydata#9453) Add ASV for datatree.from_dict (pydata#9459) Make the first argument in DataTree.from_dict positional only (pydata#9446) Fix typos across the code, doc and comments (pydata#9443) ...

* main: Opt out of floor division for float dtype time encoding (pydata#9497) fixed formatting for whats-new (pydata#9493) Forbid modifying names of DataTree objects with parents (pydata#9494) DAS-2155 - Merge datatree documentation into main docs. (pydata#9033) Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378) Ensure TreeNode doesn't copy in-place (pydata#9482) `open_groups` for zarr backends (pydata#9469) Update pyproject.toml (pydata#9484) New whatsnew section (pydata#9483)

Forbid modifying names of DataTree objects with parents

7bdfb75

This ensures that the tree cannot be in an inconsistent state. Fixes pydata#9447

shoyer requested a review from TomNicholas September 13, 2024 22:20

TomNicholas approved these changes Sep 14, 2024

View reviewed changes

TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Sep 14, 2024

shoyer and others added 4 commits September 13, 2024 21:21

Apply suggestions from code review

c035f18

Co-authored-by: Tom Nicholas <tom@cworthy.org>

clean-ups per review

25cd44b

revert _post_attach

5ed32fd

fix assert message

8020901

shoyer merged commit 8db6bc9 into pydata:main Sep 14, 2024
28 checks passed

TomNicholas added a commit to TomNicholas/xarray that referenced this pull request Sep 16, 2024

use _validate_name function from pydata#9494

e37fc64

TomNicholas added a commit to TomNicholas/xarray that referenced this pull request Sep 16, 2024

add tests form pydata#9494 but at NamedNode level

9e2bce7

TomNicholas mentioned this pull request Sep 16, 2024

Forbid slashes in datatree coordinate names #9492

Open

4 tasks

hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024

Forbid modifying names of DataTree objects with parents (pydata#9494)

8e7d876

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Forbid modifying names of DataTree objects with parents #9494

Forbid modifying names of DataTree objects with parents #9494

shoyer commented Sep 13, 2024

TomNicholas left a comment

TomNicholas Sep 14, 2024 •

edited

Loading

TomNicholas Sep 14, 2024 •

edited

Loading

TomNicholas Sep 14, 2024

TomNicholas Sep 14, 2024

shoyer Sep 14, 2024

TomNicholas Sep 14, 2024

		_validate_name(name) # is this check redundant?
		self._name = name

Forbid modifying names of DataTree objects with parents #9494

Forbid modifying names of DataTree objects with parents #9494

Conversation

shoyer commented Sep 13, 2024

TomNicholas left a comment

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024 • edited Loading

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024 • edited Loading

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024

Choose a reason for hiding this comment

shoyer Sep 14, 2024

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024

Choose a reason for hiding this comment

TomNicholas Sep 14, 2024 •

edited

Loading

TomNicholas Sep 14, 2024 •

edited

Loading