Test suite improvements #6364

Closed · 5 of 8 tasks
stinodego opened this issue Jan 22, 2023 · 5 comments
Labels: test (Related to the test suite)

@stinodego (Member) commented Jan 22, 2023

I am planning to do some work on improving the test suite. I will track my plans and progress in this issue, so that others may comment.

  • Use pytest-xdist to parallelize running our tests
    • We run a lot of tests; this will help keep run times manageable.
    • This currently crashes on some IO tests. I plan to move those tests out of the 'regular' flow, so that we can run make test in parallel.
  • Use pytest custom markers where appropriate
    • This will make it easier to run specific parts of the test suite, for example "everything except parametric tests".
    • Unfortunately, I haven't found a way to simply mark all tests in a directory, so we'll have to see if this really adds utility (one possible approach is sketched after this list).
  • Rethink the setup of IO tests
    • I think we can improve here by using TemporaryDirectory (or pytest's tmp_path fixture) and otherwise making sure tests can run independently; a sketch follows after this list.
  • Get rid of verify_series_and_expr_api. All Series methods that have a matching Expr method already dispatch to it - no need to run these tests twice.
  • Profile tests using pytest --durations X and mark the slowest tests as 'slow' to remove them from the regular run.
  • Rethink the folder/file structure
    • I think we can do a bit better here; the current setup is not too intuitive.
  • Better utilize pytest tools like parametrize and fixture.
  • Better utilize our own testing tools
    • We have things like assert_frame_equal and assert_series_equal, but don't use them consistently.
    • We often check only shapes or rows because specifying a full frame can be cumbersome. If we formalize this somehow, it will make writing our tests a lot easier (an example combining parametrize with assert_frame_equal follows after this list).
  • Get rid of the doctest: +IGNORE_RESULT directive
    • My main motivation for doing this was to be able to run doctests within the pytest suite. While this is possible, the configuration options are limited/cumbersome. For example, we'd need a conftest.py in the py-polars directory to inject import polars as pl for each test. I don't like tests leaking out into the main folder structure, so I think we are better off with our current approach.
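
As a sketch of the markers idea (not the actual py-polars setup): custom markers can be registered from a conftest.py, and a collection hook can mark every test under a given directory. The marker names and the tests/io path below are assumptions for illustration only.

```python
# conftest.py - a sketch, not the actual py-polars configuration
from pathlib import Path

import pytest


def pytest_configure(config):
    # Register custom markers so selections like `pytest -m "not slow"` work
    # without warnings about unknown marks.
    config.addinivalue_line("markers", "slow: tests that take a long time to run")
    config.addinivalue_line("markers", "io: tests that read from or write to disk")


def pytest_collection_modifyitems(config, items):
    # Mark every collected test that lives under a (hypothetical) tests/io directory.
    for item in items:
        if "io" in Path(str(item.fspath)).parts:
            item.add_marker(pytest.mark.io)
```

With markers in place, the slowest tests found via pytest --durations 10 could be tagged and excluded with pytest -m "not slow", and pytest-xdist's -n auto flag distributes the remaining tests over all available cores.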
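For the IO tests, a minimal sketch of an independent round-trip test using pytest's built-in tmp_path fixture (a per-test temporary directory, so TemporaryDirectory does not have to be managed by hand); the test name and frame contents are made up:

```python
import polars as pl
from polars.testing import assert_frame_equal


def test_csv_round_trip(tmp_path):
    # tmp_path is a unique temporary directory per test, so nothing is shared
    # between tests and nothing is left behind on disk.
    df = pl.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    file = tmp_path / "round_trip.csv"
    df.write_csv(file)
    assert_frame_equal(pl.read_csv(file), df)
```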
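And a sketch combining parametrize with assert_frame_equal, to illustrate checking full frames (values and dtypes) instead of only shapes; the test cases are illustrative only:

```python
import polars as pl
import pytest
from polars.testing import assert_frame_equal


@pytest.mark.parametrize(
    ("values", "expected_sum"),
    [
        ([1, 2, 3], 6),
        ([None, 2, 3], 5),
    ],
)
def test_sum(values, expected_sum):
    result = pl.DataFrame({"a": values}).select(pl.col("a").sum())
    # assert_frame_equal verifies dtypes as well as values, unlike a bare
    # shape or row comparison.
    assert_frame_equal(result, pl.DataFrame({"a": [expected_sum]}))
```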

Input is welcome!

@zundertj (Collaborator)

On the doctests: it would be nice to have it as part of the same test suite, but I don't think there is much added value in doing so? In particular, I like the split as it is because the temptation will otherwise be to use the doctests as the primary tool to drive test coverage and skip on the unit test altogether. Instead, the doc examples should be geared towards documenting the function and giving examples of where a function may be useful. That partly overlaps with the goals of a unit test, but not completely?

@ghuls (Collaborator) commented Jan 22, 2023

> We often check only shapes or rows because specifying a full frame can be cumbersome. If we formalize this somehow, it will make writing our tests a lot easier.

This has hidden quite a few bugs in the past, as in a lot of cases dtypes were not checked.
I think it would also be a good idea to have one or more standard test dataframes covering all currently supported dtypes, and to run tests against all of them where it makes sense. Most tests just use dataframes with very simple dtypes.

And for properly testing the streaming operations, much bigger dataframes are required to trigger all codepaths, I assume.
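
A sketch of what such a shared multi-dtype fixture could look like; the fixture name, column names, and dtype selection here are only illustrative, not a proposed canonical frame:

```python
from datetime import date, datetime

import polars as pl
import pytest


@pytest.fixture
def df_all_dtypes() -> pl.DataFrame:
    # A small frame covering several dtypes (including nulls), so operations
    # are exercised against more than just simple int/str columns.
    return pl.DataFrame(
        {
            "int": [1, 2, 3],
            "float": [1.0, 2.5, None],
            "bool": [True, False, True],
            "str": ["a", "b", None],
            "date": [date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 3)],
            "datetime": [datetime(2023, 1, 1), None, datetime(2023, 1, 3)],
        }
    )
```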

@stinodego (Member, Author)

> On the doctests: it would be nice to have it as part of the same test suite, but I don't think there is much added value in doing so? In particular, I like the split as it is because the temptation will otherwise be to use the doctests as the primary tool to drive test coverage and skip on the unit test altogether. Instead, the doc examples should be geared towards documenting the function and giving examples of where a function may be useful. That partly overlaps with the goals of a unit test, but not completely?

I agree that test coverage should come from 'actual' tests in the tests folder, not from doc examples. That's exactly why I would like to get rid of the +IGNORE_RESULT directive; the method should already be tested in the unit tests, so if the doc example gives inconsistent results, we might as well just skip it.

I think it'll just be a bit cleaner to have pytest handle all the testing; configuration can then be in the pyproject.toml instead of in a random Python file in the tests folder. We can just exclude doctests from the make coverage command, and we're good to go.

@zundertj (Collaborator)

IGNORE_RESULT allows us to check that the code does not error, which was, at least at the time, happening for a number of examples. Skipping altogether means there is no validation that the examples even run. We should probably try to minimize the number of examples where this directive is needed in the first place, to limit its impact.
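
For context on the mechanism, a sketch of how a custom IGNORE_RESULT directive can be wired up with the standard library doctest module; the actual py-polars doctest runner may be organised differently:

```python
import doctest

# Register the custom option flag so `# doctest: +IGNORE_RESULT` is recognised.
IGNORE_RESULT = doctest.register_optionflag("IGNORE_RESULT")


class IgnoreResultOutputChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        # With the flag set, the example only has to run without raising;
        # its printed output is not compared against the docstring.
        if optionflags & IGNORE_RESULT:
            return True
        return super().check_output(want, got, optionflags)
```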

@stinodego (Member, Author)

There is still work to be done, but I'll close this for now as the most pressing concerns have been addressed.
