Fix Ragged tests #1022

ianthomas23 · 2021-09-14T10:55:21Z

There are consistently 14 test failures in CI in the various TestRagged tests in test_datatypes.py:

=========================== short test summary info ============================
FAILED datashader/tests/test_datatypes.py::TestRaggedGetitem::test_getitem_mask_raises
FAILED datashader/tests/test_datatypes.py::TestRaggedGetitem::test_getitem_ellipsis_and_slice
FAILED datashader/tests/test_datatypes.py::TestRaggedGetitem::test_take_empty
FAILED datashader/tests/test_datatypes.py::TestRaggedGroupby::test_groupby_extension_apply[scalar]
FAILED datashader/tests/test_datatypes.py::TestRaggedGroupby::test_groupby_extension_apply[list]
FAILED datashader/tests/test_datatypes.py::TestRaggedGroupby::test_groupby_extension_apply[object]
FAILED datashader/tests/test_datatypes.py::TestRaggedGroupby::test_groupby_extension_agg[False]
FAILED datashader/tests/test_datatypes.py::TestRaggedGroupby::test_groupby_extension_transform
FAILED datashader/tests/test_datatypes.py::TestRaggedMethods::test_sort_values_frame[True]
FAILED datashader/tests/test_datatypes.py::TestRaggedMethods::test_sort_values_frame[False]
FAILED datashader/tests/test_datatypes.py::TestRaggedMissing::test_fillna_limit_pad
FAILED datashader/tests/test_datatypes.py::TestRaggedMissing::test_fillna_limit_backfill
FAILED datashader/tests/test_datatypes.py::TestRaggedMissing::test_fillna_series_method[ffill]
FAILED datashader/tests/test_datatypes.py::TestRaggedMissing::test_fillna_series_method[bfill]
===== 14 failed, 763 passed, 39 skipped, 101 warnings in 372.30s (0:06:12) =====

The failures are due to a tightening-up of numpy and pandas APIs and increased extension testing in pandas.

This PR removes the test failures; there are 4 types of fixes/changes:

1. Changing the wording of exceptions.
2. Replacing use of the deprecated pandas.util.testing with the equivalent numpy assertion.
3. Skipping tests for functionality that is known not to be implemented for RaggedArray, such as construction using nested sequences and indexing using ellipsis.
4. Skipping other tests that I am not 100% sure of but that seem consistent with existing skipped tests.

An expert in RaggedArray and pandas extension types could no doubt do a better job with item 4, but in the meantime I would quite like the CI to pass so that I can experiment with adding new functionality.
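
A minimal sketch of the second and third kinds of change listed above (swapping the deprecated assertion helper for a numpy one, and skipping an inherited test that does not apply). This is not the actual diff from this PR: `BaseGetitemTests` is a hypothetical stand-in for a pandas base extension test class, and the test bodies and skip reason are illustrative assumptions.

```python
import numpy as np
import pytest


class BaseGetitemTests:
    """Hypothetical stand-in for a pandas base extension-array test class."""

    def test_getitem_ellipsis_and_slice(self):
        raise AssertionError("would fail: ellipsis indexing is not implemented")


class TestRaggedGetitem(BaseGetitemTests):
    # Override the inherited test and skip it, since the functionality
    # is known not to be implemented for RaggedArray.
    @pytest.mark.skip(reason="RaggedArray does not support ellipsis indexing")
    def test_getitem_ellipsis_and_slice(self):
        pass


def test_flat_values_match():
    # numpy's own assertion is used in place of the deprecated
    # pandas.util.testing helper.
    expected = np.array([1.0, 2.0, 3.0])
    actual = np.array([1.0, 2.0, 3.0])
    np.testing.assert_array_equal(actual, expected)
```

Overriding the inherited method keeps the rest of the base class's tests running while making the skipped case, and the reason for it, visible in the test report.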

ianthomas23 · 2021-09-29T09:46:34Z

I think I've dealt with the final test failure. This PR needs approval again for the CI to run...

philippjfr · 2021-09-29T10:25:57Z

@ianthomas23 It seems like we are endlessly chasing pandas' expanding test suite for ExtensionArrays. Do you have any thoughts on how we can isolate ourselves from this? It seems kind of silly to me for pandas to keep adding to the base extension array tests, knowing that many of the tests they add won't be applicable to the various subclasses of ExtensionArray that other people might implement.

ianthomas23 · 2021-09-29T10:54:48Z

@philippjfr We could pin the version of pandas in setup.py (and maybe other places?) to an upper version that we know the CI passes with. Then contributors' PRs won't fail because of pandas changes. But it would mean that whenever pandas makes a new release, someone would have to check that the tests pass locally with that new version before raising the upper version allowed in datashader. It looks like pandas releases about once a month.

There has been a discussion in numpy/scipy for such upper version pinning in the last year.
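
As a rough illustration of the upper-bound idea, the check below tests whether an installed version falls inside a pinned range. The bounds `1.0` and `1.4`, the `PANDAS_PIN` string, and the helper names are all hypothetical, chosen only for the example; they are not datashader's actual requirements. The comparison assumes plain numeric `major.minor` components.

```python
# Hypothetical pin as it might appear in setup.py's install_requires;
# the bounds are illustrative only.
PANDAS_PIN = "pandas >=1.0,<1.4"


def major_minor(version):
    """Return the (major, minor) part of a version string such as '1.3.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def within_pin(installed, lower="1.0", upper="1.4"):
    """True if lower <= installed < upper, comparing major.minor only."""
    return major_minor(lower) <= major_minor(installed) < major_minor(upper)


print(within_pin("1.3.3"))  # True
print(within_pin("1.4.0"))  # False
```

In practice pip enforces such a range at install time; the helper only makes explicit which releases the pin would admit.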

ianthomas23 · 2021-09-29T16:03:56Z

Remaining test failures are

TypeError: 'coroutine' object is not subscriptable.

This seems to be a known problem with some combinations of jupyter_client and nbconvert (jupyter/jupyter_client#637). It looks like you have seen this recently on colorcet too: holoviz/colorcet#66.

The solution seems to be to pin the versions of one or the other. There might need to be some experimentation here to find a working combination.
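
For context, this message is what Python raises when the result of an `async` function is indexed without being awaited. The snippet below is a generic reproduction of the message under that assumption, not the actual jupyter_client or nbconvert code path; `get_msg` is a hypothetical function.

```python
import asyncio


async def get_msg():
    """Hypothetical async API returning a message dictionary."""
    return {"content": "ok"}


def call_without_await():
    coro = get_msg()  # calling an async function only creates a coroutine
    try:
        return coro["content"]  # indexing the coroutine itself fails
    except TypeError as exc:
        return str(exc)
    finally:
        coro.close()  # avoid a "never awaited" RuntimeWarning


def call_with_await():
    # Running the coroutine to completion yields the dictionary.
    return asyncio.run(get_msg())["content"]


print(call_without_await())  # 'coroutine' object is not subscriptable
print(call_with_await())     # ok
```

This is why a caller written for a synchronous API can break when the library it calls switches a method to `async`, and why pinning one side to a compatible version is a plausible workaround.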

How do you want to proceed? Would you be happy merging this PR and then I can try to deal with it in another PR? This will be easier than adding it to this one as I will no longer be a first-time contributor so I should be able to use github actions without any assistance.

jbednar

Looks good to me; I'll merge and other fixes can be made separately. Thanks, @ianthomas23, and sorry for the delay. Personally, I'd like to get all extension array code out of datashader and into e.g. SpatialPandas, but there are still a few important bits implemented only in Datashader. :-(

datashader/datatypes.py

philippjfr · 2021-09-29T17:58:45Z

> Personally, I'd like to get all extension array code out of datashader and into e.g. SpatialPandas, but there are still a few important bits implemented only in Datashader. :-(

The same issues apply there. I don't think pinning pandas in setup.py is a great option, although it might be a good idea to do that in CI, because 99% of the time there's zero issue with the new pandas. It's just that various new tests which don't apply to the ragged array and geometry arrays have to be skipped.

ianthomas23 · 2021-09-30T09:48:44Z

@philippjfr I don't have any other cunning ideas for avoiding problems with future pandas releases. Maybe we just leave it as it is but try to check CI failures more frequently.

Here at makepath we intend to use datashader a lot, so maybe we can help with future maintenance workload. I have a bunch of PR ideas lined up so that it is a bit more polished for our use, so I should start to get familiar with the code base.

Fix Ragged tests

1e818af

jbednar mentioned this pull request Sep 29, 2021

handle cupy array in spread and dynspread functions #1015

Merged

Remove duplicated test function

4de94d8

jbednar approved these changes Sep 29, 2021

Update datashader/datatypes.py

d20b606

jbednar merged commit aea4959 into holoviz:master Sep 29, 2021

ianthomas23 mentioned this pull request Oct 18, 2021

Fix CI part 2 #1025

Merged

bnavigator mentioned this pull request Feb 1, 2022

More TesteRagged failures with pandas 1.4 #1043

Closed

ianthomas23 deleted the fix_ragged_tests branch November 17, 2022 14:32