Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add argument check_dims to assert_allclose to allow transposed inputs (#5733) #8991

Merged
merged 17 commits into from
May 5, 2024

Conversation

ignamv
Copy link
Contributor

@ignamv ignamv commented May 1, 2024

Copy link

welcome bot commented May 1, 2024

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

Copy link
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @ignamv !

doc/whats-new.rst Outdated Show resolved Hide resolved
xarray/testing/assertions.py Outdated Show resolved Hide resolved
xarray/testing/assertions.py Outdated Show resolved Hide resolved
xarray/testing/assertions.py Outdated Show resolved Hide resolved
Comment on lines 101 to 104
if not isinstance(a, (Variable, DataArray, Dataset)):
return b
if check_dims == "transpose":
assert set(a.dims) == set(b.dims), f"Dimensions differ: {a.dims} {b.dims}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that Dataset.dims returns an ordered type may no longer be true after #8980, but that's my problem, not yours 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This future change is also raising a warning during the tests, not sure if I should be suppressing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so we should deal with it now then. Basically Dataset objects don't really have ordered dimensions, and Dataset.dims will be changed to return a set type in future. So check_dim_order can't do anything when comparing dataset objects, the most you can do is check that all the same dims are present.

if check_dims == "transpose":
assert set(a.dims) == set(b.dims), f"Dimensions differ: {a.dims} {b.dims}"
return b.transpose(*a.dims)
assert a.dims == b.dims, f"Dimensions differ: {a.dims} {b.dims}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good - if the dimensions differ there is no point checking anything else.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure we don't have better diffing functions further down the stack? We should ensure the output of an error is no worse with this than with this commented out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usual diff would allow you to see if two datasets somehow got created with the same values but swapped dimensions. If we want this, I can skip the transpose when check_dim_order=False and the dims differ (so it doesn't fail on transpose) and just let it make the whole diff. And I'll always skip the dims check when check_dim_order=True.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would probably let it fall through, so we have a single place where we check & generate diffs; at least unless there's a good reason not to

(though @TomNicholas let us know if you disagree!)

Co-authored-by: Tom Nicholas <tom@cworthy.org>
@pytest.mark.parametrize(
"func", ["assert_equal", "assert_allclose", "assert_identical"]
)
def test_assert_allclose_equal_identical_transpose(func) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test the tests!

@@ -162,7 +176,7 @@ def assert_identical(a: DataTree, b: DataTree, from_root: bool = True): ...


@ensure_warnings
def assert_identical(a, b, from_root=True):
def assert_identical(a, b, from_root=True, check_dim_order: bool = True):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to have this arg — identical means identical :) — but happy to defer if someone does...

xarray/testing/assertions.py Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

Thanks a lot @ignamv !

@max-sixty max-sixty merged commit 8d728bf into pydata:main May 5, 2024
28 checks passed
Copy link

welcome bot commented May 5, 2024

Congratulations on completing your first pull request! Welcome to Xarray! We are proud of you, and hope to see you again! celebration gif

andersy005 pushed a commit that referenced this pull request May 10, 2024
…#5733) (#8991)

* Add argument check_dims to assert_allclose to allow transposed inputs

* Update whats-new.rst

* Add `check_dims` argument to assert_equal and assert_identical + tests

* Assert that dimensions match before transposing or comparing values

* Add docstring for check_dims to assert_equal and assert_identical

* Update doc/whats-new.rst

Co-authored-by: Tom Nicholas <tom@cworthy.org>

* Undo fat finger

Co-authored-by: Tom Nicholas <tom@cworthy.org>

* Add attribution to whats-new.rst

* Replace check_dims with bool argument check_dim_order, rename align_dims to maybe_transpose_dims

* Remove left-over half-made test

* Remove check_dim_order argument from assert_identical

* assert_allclose/equal: emit full diff if dimensions don't match

* Rename check_dim_order test, test Dataset with different dim orders

* Update whats-new.rst

* Hide maybe_transpose_dims from Pytest traceback

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>

* Ignore mypy error due to missing functools.partial.__name__

---------

Co-authored-by: Tom Nicholas <tom@cworthy.org>
Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
andersy005 added a commit that referenced this pull request May 10, 2024
* main:
  Avoid auto creation of indexes in concat (#8872)
  Fix benchmark CI (#9013)
  Avoid extra read from disk when creating Pandas Index. (#8893)
  Add a benchmark to monitor performance for large dataset indexing (#9012)
  Zarr: Optimize `region="auto"` detection (#8997)
  Trigger CI only if code files are modified. (#9006)
  Fix for ruff 0.4.3 (#9007)
  Port negative frequency fix for `pandas.date_range` to `cftime_range` (#8999)
  Bump codecov/codecov-action from 4.3.0 to 4.3.1 in the actions group (#9004)
  Speed up localize (#8536)
  Simplify fast path (#9001)
  Add argument check_dims to assert_allclose to allow transposed inputs (#5733) (#8991)
  Fix syntax error in test related to cupy (#9000)
andersy005 added a commit to hmaarrfk/xarray that referenced this pull request May 10, 2024
* backend-indexing:
  Trigger CI only if code files are modified. (pydata#9006)
  Enable explicit use of key tuples (instead of *Indexer objects) in indexing adapters and explicitly indexed arrays (pydata#8870)
  add `.oindex` and `.vindex` to `BackendArray` (pydata#8885)
  temporary enable CI triggers on feature branch
  Avoid auto creation of indexes in concat (pydata#8872)
  Fix benchmark CI (pydata#9013)
  Avoid extra read from disk when creating Pandas Index. (pydata#8893)
  Add a benchmark to monitor performance for large dataset indexing (pydata#9012)
  Zarr: Optimize `region="auto"` detection (pydata#8997)
  Trigger CI only if code files are modified. (pydata#9006)
  Fix for ruff 0.4.3 (pydata#9007)
  Port negative frequency fix for `pandas.date_range` to `cftime_range` (pydata#8999)
  Bump codecov/codecov-action from 4.3.0 to 4.3.1 in the actions group (pydata#9004)
  Speed up localize (pydata#8536)
  Simplify fast path (pydata#9001)
  Add argument check_dims to assert_allclose to allow transposed inputs (pydata#5733) (pydata#8991)
  Fix syntax error in test related to cupy (pydata#9000)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shoudn't assert_allclose transpose datasets?
4 participants