Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Converting to Pytest Idiom #15990

Closed
9 tasks done
gfyoung opened this issue Apr 13, 2017 · 20 comments
Closed
9 tasks done

TST: Converting to Pytest Idiom #15990

gfyoung opened this issue Apr 13, 2017 · 20 comments
Labels
Docs Testing pandas testing functions or related to the test suite

Comments

@gfyoung
Copy link
Member

gfyoung commented Apr 13, 2017

Our testing framework is a literal hodgepodge of different 3rd-party framework standards (i.e. nosetest, pytest, unittest) that needs to be cleaned up.

A couple of points are clear:

  • Use assert instead of self.assert*...
  • When possible, replace for loops with pytest.mark.parametrize
  • All new test classes should inherit from object instead of tm.TestCase

However, we also have this massive testing module helper testing.py that is populated with tons of useful comparison methods as well as wrappers around basic assert methods that are used inconsistently throughout the code-base. Thus, the question is whether or not we want to keep using those OR remove them entirely and replace them with basic assert statements.

The advantage with using tm.assert statements is that they come directly with error messages instead of an empty AssertionError, which is what I don't like about raw assert statements that are considered more idiomatic with pytest.

Thus, it's a question of greater clarity (i.e. our error messages will be more useful by default when we get test failures) vs. idiomatic (raw assert with no wrapping is the idiomatic way to go but risks presenting less helpful failure messages).

Another point I wanted to bring up is with regards to the introduction of decorators in #15952 to capture stdout and stderr. pytest has fixtures to capture those, but incorporating it into the code-base is very difficult due to our heavy reliance still on unittest.TestCase, which causes incorporation to fail. The question is: do we eventually want to use the pytest fixture or continue using the decorator?

Thoughts?


From the discussion that follows, here are the functions that should be removed and replaced:

  • self (tm).assertRaises --> pytest.raises
  • (NO ACTION) self (tm).assertRaisesRegexp -->
with pytest.raises(cls) as exc_info:
   ...
exc_info.match(regex)

Per discussion in #16089 (comment), we are leaving as is.

  • self.assert* (examples follow)
    • self.assertIs
    • self.assertEqual
    • ...and many others
  • self.assert_* (examples follow)
    • self.assert_numpy_array_equal
    • self.assert_series_equal
    • self.assert_frame_equal
    • self.assert_index_equal
    • ...and many others
  • tm.assert_equal
  • tm.assertIs(Not)
  • tm.assert(Not)In
  • tm.assertIs(Not)None
  • tm.assert(Not)IsInstance
@jreback
Copy link
Contributor

jreback commented Apr 13, 2017

this is already indicated in #15341 and documented. this is orthogonal to #15962 which is fine.

Thus, the question is whether or not we want to keep using those OR remove them entirely and replace them with basic assert statements.

absolutely NOT. not sure where you got this notion.

sure self.assertTrue, self.assertFalse, self.assertRaises, and self.assertEqual when its a scalar should be replaced, but these are just old nose idioms.

Generally replacing test code should be done in independent PRs/commits.

@TomAugspurger TomAugspurger mentioned this issue Apr 13, 2017
17 tasks
@jreback jreback added the Testing pandas testing functions or related to the test suite label Apr 13, 2017
@jreback
Copy link
Contributor

jreback commented Apr 13, 2017

actually another section after http://pandas-docs.github.io/pandas-docs-travis/contributing.html#how-to-use-parametrize would make sense to explain how to write / when to use which assertions, focused on asserting pandas objects (assert_series_equal) and scalars, regexes etc. almost listing what are the 'common' ones would be enough.

@jreback jreback added the Docs label Apr 13, 2017
@jreback jreback added this to the Next Major Release milestone Apr 13, 2017
@TomAugspurger
Copy link
Contributor

TomAugspurger commented Apr 13, 2017

Thus, the question is whether or not we want to keep using those OR remove them entirely and replace them with basic assert statements

Since all the aseert_*_equal methods use assert condition, message, pytest's assertion rewriting is disabled (4th paragraph here)

which is what I don't like about raw assert statements that are considered more idiomatic with pytest.

Pytest has special comparison reprs for builtin data structues, and you can define your own so this could be solved. But we don't want to. We need all the arguments to assert_*_equal like check_dtypes, that you can't specify with ==. We shouldn't ever have bare asserts that aren't being rewritten by pytest.

I can writeup a summary of the comparisons and assertions for the documentation.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 13, 2017

@jreback , @TomAugspurger : So if I'm understanding this properly, we are sticking with raw assert statements when possible because the wrappers will not be properly handled by pytest ?

If that is the case, yes, absolutely, then we should replace using util/testing comparison methods with assert when appropriate in separate PR's (of course, not something like assert_frame_equal but something like assert_equal can then be removed).

@TomAugspurger
Copy link
Contributor

@jreback , @TomAugspurger : So if I'm understanding this properly, we are sticking with raw assert statements when possible because the wrappers will not be properly handled by pytest ?

I think you misunderstood.

For builtins, scalars, etc. we'll recommend assert {'a': 1, 'b': 2} == {'a': 1, 'b': 3} over self.assertDictEqual.

For DataFrames, Series, etc., we'll recommend assert_frame_equal(df1, df2).

I don't think we need to change any of the asserts in util/testing.py. They're already doing the right thing.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 13, 2017

@TomAugspurger : No, I do understand in fact. I am referring to chunks like this and something like this.

@jorisvandenbossche
Copy link
Member

I don't think we need to change any of the asserts in util/testing.py. They're already doing the right thing.

I think @gfyoung meant the assert_equal, assertIsInstance, .. wrappers we have in pandas.util.testing. And for those, yes, I think we should clean that up and remove those.

@max-sixty
Copy link
Contributor

In xarray we built xarray.test.assert_equal and then delegated from there based on type - rather than specific assert_X_equals, which seem redundant

But probably not worth changing now

@gfyoung
Copy link
Member Author

gfyoung commented Apr 13, 2017

wrappers we have in pandas.util.testing. And for those, yes, I think we should clean that up and remove those.

@jorisvandenbossche : Yes, that is correct. Okay, that's what I was hoping to get clarification on.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 13, 2017

Also, do we want to use pytest fixtures to capture stdout and stdin instead of method decorators that were introduced in one of my earlier PR's? OR are we okay with that (no one is commenting on this, so it seems like we are okay with this)?

@jorisvandenbossche
Copy link
Member

Regarding the wrappers in util.testing, note that some of them are already hardly / not used (like assertIs, assertIs) so will be easy to clean up (others like assert_equal are used more).

@jreback
Copy link
Contributor

jreback commented Apr 13, 2017

yes the only things would change are

  • assert_equal (this is generally for scalar / lists /dicts)
  • self.assertEqual
  • self.assertTrue/False
  • assertDictEqual
  • assertIs
  • assertIsInstance

and use pytest.raises

@gfyoung
Copy link
Member Author

gfyoung commented Apr 13, 2017

I would also add the following (static tm. methods):

  • assertIs
  • assertIsNot
  • assertIn
  • assertNotIn
  • assertIsNone
  • assertIsNotNone

@gfyoung
Copy link
Member Author

gfyoung commented Apr 14, 2017

One thing I might also add is that in the spirit of functional, test classes should probably be avoided unless there is a good organization reason for doing so. Otherwise, test functions should be used.

@jreback
Copy link
Contributor

jreback commented Apr 14, 2017

@gfyoung ideally yes, but they are nice for organization, so I wouldn't necessarily change the structure if its meant for organization (and it almost always is).

@gfyoung
Copy link
Member Author

gfyoung commented Apr 14, 2017

@jreback : Yes, of course. That's why I wasn't as definitive about this (also, I added documentation in a PR as you know (#15989) to explicitly to address that caveat).

@gfyoung
Copy link
Member Author

gfyoung commented Apr 26, 2017

To everyone here, what is your opinion about pytest.warns vs. tm.assert_produces_warning ? I know that we have decided to not use pytest.raises in place of tm.assertRaisesRegexp (soon to be tm.assert_raises_regex). I am more inclined to use this former over the latter because it also does message checking which our implementation does not do currently. Thoughts?

gfyoung added a commit to forking-repos/pandas that referenced this issue May 2, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 2, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 2, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 2, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this issue May 2, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this issue May 3, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 3, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 3, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 3, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 3, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
gfyoung added a commit to forking-repos/pandas that referenced this issue May 4, 2017
tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.
jreback pushed a commit that referenced this issue May 4, 2017
* MAINT: Convert test setup/teardown to pytest idiom

* tm.TestCase now just inherits from object
* setUpClass renamed to setup_class
* tearDownClass renamed to teardown_class
* setUp renamed to setup_method
* tearDown renamed to teardown_method

* MAINT: Remove unittest.TestCase from testing

* DOC: Update documentation for TestCase usage

tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes gh-15990.

* TST: Patch Circle matplotlib failure

The tm.mplskip decorator was breaking on Circle,
so this commit removes the decorator and replaces it
with direct function calls to check for matplotlib.

* TST: Replace yield-based tests in test_query_eval
pcluo pushed a commit to pcluo/pandas that referenced this issue May 22, 2017
* MAINT: Convert test setup/teardown to pytest idiom

* tm.TestCase now just inherits from object
* setUpClass renamed to setup_class
* tearDownClass renamed to teardown_class
* setUp renamed to setup_method
* tearDown renamed to teardown_method

* MAINT: Remove unittest.TestCase from testing

* DOC: Update documentation for TestCase usage

tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.

* TST: Patch Circle matplotlib failure

The tm.mplskip decorator was breaking on Circle,
so this commit removes the decorator and replaces it
with direct function calls to check for matplotlib.

* TST: Replace yield-based tests in test_query_eval
stangirala pushed a commit to stangirala/pandas that referenced this issue Jun 11, 2017
* MAINT: Convert test setup/teardown to pytest idiom

* tm.TestCase now just inherits from object
* setUpClass renamed to setup_class
* tearDownClass renamed to teardown_class
* setUp renamed to setup_method
* tearDown renamed to teardown_method

* MAINT: Remove unittest.TestCase from testing

* DOC: Update documentation for TestCase usage

tm.TestCase no longer follows the nosetest idiom,
so it is here to stay, so update the documentation
to say that we are using it still.

Closes pandas-devgh-15990.

* TST: Patch Circle matplotlib failure

The tm.mplskip decorator was breaking on Circle,
so this commit removes the decorator and replaces it
with direct function calls to check for matplotlib.

* TST: Replace yield-based tests in test_query_eval
@jorisvandenbossche
Copy link
Member

@gfyoung BTW, I just saw that the latest pytest actually supports a pytest.warns(..., match=..) to potentially have the behaviour of tm.assert_raises_regex (one of our custom methods we decided to keep because of this limitation in pytest.warns) (pytest-dev/pytest#2227).

@gfyoung
Copy link
Member Author

gfyoung commented Aug 16, 2017

The PR you pointed to seems to referring to pytest.raises, though this is good to know nevertheless.

@jorisvandenbossche
Copy link
Member

Ah, sorry, yes, that is what I meant :-)
(as otherwise it wouldn't help replacing tm.assert_raises_regex)

But indeed, we also have out own assert_produces_warning, but that is still something else (I think our implementation also change the filter and checks that stacklevel, compared to pytest one)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Testing pandas testing functions or related to the test suite
Projects
None yet
Development

No branches or pull requests

5 participants