Refactor index-as-string groupby tests and fix spurious warning (Bug 17383) #17843

jonmmease · 2017-10-10T21:06:15Z

closes Groupby with matching column and index name emits spurious warning #17383
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Test case refactoring:

Moved the existing index-as-string test cases out of test_groupby.py and into a new test_index_as_string.py file
Extracted test data generation functions and parameterized existing test cases to clean them up and shorten them.
Added a new parameterized test case on a Series
Updated test_grouper_column_index_level_precedence to reproduce false warning problem as described in Groupby with matching column and index name emits spurious warning #17383
Updated test_grouper_column_index_level_precedence to verify when warnings should NOT be raised (Results in a test failure due to Groupby with matching column and index name emits spurious warning #17383 without this fix)

- Extract to separate file (test_index_as_string.py)
- Parameterize over test DataFrames
- Add series test case
- Update test_grouper_column_index_level_precedence to reproduce false warning problem as described in GH17383
- Update test_grouper_column_index_level_precedence to verify when warning shouldn't be raised (Results in test failure due to GH17383)

codecov · 2017-10-10T23:23:10Z

Codecov Report

Merging #17843 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17843      +/-   ##
==========================================
- Coverage   91.22%   91.22%   -0.01%     
==========================================
  Files         163      163              
  Lines       50014    50075      +61     
==========================================
+ Hits        45627    45679      +52     
- Misses       4387     4396       +9

| Flag | Coverage Δ | |
| --- | --- | --- |
| #multiple | `89.03% <100%> (+0.01%)` | ⬆️ |
| #single | `40.32% <0%> (+0.01%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| pandas/core/groupby.py | `91.98% <100%> (-0.02%)` | ⬇️ |
| pandas/io/gbq.py | `25% <0%> (-58.34%)` | ⬇️ |
| pandas/compat/numpy/function.py | `92.12% <0%> (-1.22%)` | ⬇️ |
| pandas/core/indexing.py | `92.82% <0%> (-0.19%)` | ⬇️ |
| pandas/io/formats/format.py | `95.94% <0%> (-0.13%)` | ⬇️ |
| pandas/core/frame.py | `97.75% <0%> (-0.12%)` | ⬇️ |
| pandas/core/computation/align.py | `97.89% <0%> (-0.05%)` | ⬇️ |
| pandas/core/reshape/concat.py | `97.57% <0%> (-0.04%)` | ⬇️ |
| pandas/core/indexes/base.py | `96.47% <0%> (-0.01%)` | ⬇️ |
... and 16 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 727ea20...3945107. Read the comment docs.

jonmmease · 2017-10-11T16:02:39Z

pandas/tests/groupby/test_groupby.py

-        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
-            result = df_multi_both.groupby('inner').mean()
-
-        expected = df_multi_both.groupby([pd.Grouper(key='inner')]).mean()


The pd.Grouper object in this expected expression shouldn't have been wrapped in a list. If it had not been, the spurious warning would have been raised in this test. This is corrected in the new test below.

jonmmease · 2017-10-11T16:02:58Z

pandas/tests/groupby/test_index_as_string.py

+            result = frame.groupby('inner').mean()
+
+        with tm.assert_produces_warning(False):
+            expected = frame.groupby(pd.Grouper(key='inner')).mean()


Note that the pd.Grouper object is no longer wrapped in a list and that we now assert that no warning is raised. This is the test case that would have failed without the fix in this PR.
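The "assert that no warning is raised" pattern used in the new test can be sketched with only the standard library. This is a simplified stand-in for what `tm.assert_produces_warning(False)` checks, not pandas's implementation; `assert_no_warning`, `quiet`, and `noisy` are hypothetical names introduced for illustration.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def assert_no_warning():
    """Fail if the wrapped block emits any warning (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones a filter would normally hide.
        warnings.simplefilter("always")
        yield
    if caught:
        messages = ", ".join(str(w.message) for w in caught)
        raise AssertionError("unexpected warning(s): " + messages)


def quiet():
    """Stand-in for a call that should not warn."""
    return 1


def noisy():
    """Stand-in for a call that emits a spurious FutureWarning."""
    warnings.warn("spurious", FutureWarning)
    return 1


# Passes silently: the block emits no warning.
with assert_no_warning():
    quiet()
```

Wrapping `noisy()` in the same context manager raises `AssertionError`, which is how a test written this way would fail when a spurious warning is emitted.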

jreback

some comments

jreback · 2017-10-13T11:12:13Z

pandas/core/groupby.py

@@ -2704,7 +2704,7 @@ def _get_grouper(obj, key=None, axis=0, level=None, sort=True,

    # a passed-in Grouper, directly convert
    if isinstance(key, Grouper):
-        binner, grouper, obj = key._get_grouper(obj)
+        binner, grouper, obj = key._get_grouper(obj, validate=False)


Note that I had to add this flag to 'fix' this warning issue elsewhere. I don't really like it, but making this cleaner would require more refactoring.
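To illustrate the effect of such a flag, here is a toy model of an ambiguity check that can be switched off. It is a sketch based on a reading of the diff above, not the code in `pandas/core/groupby.py`; `resolve_key` and its parameters other than `validate` are hypothetical.

```python
import warnings


def resolve_key(key, columns, index_names, validate=True):
    """Say where `key` resolves; warn on ambiguity only when validating.

    Toy model only: the real lookup logic is considerably more involved.
    """
    if key in columns:
        if validate and key in index_names:
            # The name matches both a column and an index level.
            warnings.warn(
                f"'{key}' is both a column name and an index level; "
                "using the column.",
                FutureWarning,
                stacklevel=2,
            )
        return "column"
    if key in index_names:
        return "index level"
    raise KeyError(key)


columns = ["inner", "A"]
index_names = ["outer", "inner"]

# Plain string key: ambiguous, so a FutureWarning is emitted.
resolve_key("inner", columns, index_names)

# Caller that has already stated its intent: the check is skipped.
resolve_key("inner", columns, index_names, validate=False)
```

In this model both calls resolve to the column; the only difference is whether the ambiguity warning fires.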

jreback · 2017-10-13T11:13:06Z

pandas/tests/groupby/test_index_as_string.py

+
+
+def build_df_multi():
+    idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 3),


these should just be fixtures

jreback · 2017-10-13T11:13:49Z

pandas/tests/groupby/test_index_as_string.py

+    return series_multi
+
+
+class TestGroupByIndexAsString(object):


no real need to make this a class, that is really a leftover from nose, just make these functions (of course a class is good for grouping generally)

jreback · 2017-10-13T11:14:32Z

pandas/tests/groupby/test_index_as_string.py

+            expected = frame.groupby(pd.Grouper(key='inner')).mean()
+
+        assert_frame_equal(result, expected)
+


happy to have even another level of parameterization to make these shorter here (if that's possible)
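The three suggestions above (fixtures instead of builder calls, plain functions instead of a class, and more parameterization) might look roughly like the following. This is a minimal sketch and not the code that was merged: `make_example_frame`, the `df_multi` fixture, and the data values are made up for illustration, and the test deliberately avoids the ambiguous column/index-name case.

```python
import pandas as pd
import pytest
from pandas.testing import assert_frame_equal


def make_example_frame():
    """Build a small frame with index levels named 'outer' and 'inner'."""
    idx = pd.MultiIndex.from_tuples(
        [("a", 1), ("a", 2), ("b", 1), ("b", 2)],
        names=["outer", "inner"],
    )
    return pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]}, index=idx)


@pytest.fixture
def df_multi():
    # Each test receives a fresh frame, so tests cannot affect one another.
    return make_example_frame()


# A module-level function; no test class is needed for pytest to collect it.
@pytest.mark.parametrize("level_name", ["outer", "inner"])
def test_groupby_index_level_by_name(df_multi, level_name):
    # Grouping by the level's name should match an explicit level Grouper.
    result = df_multi.groupby(level_name).mean()
    expected = df_multi.groupby(pd.Grouper(level=level_name)).mean()
    assert_frame_equal(result, expected)
```

With this layout pytest generates one test per `level_name` value and injects the fixture by argument name, so adding a case means adding one entry to the parameter list.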

jonmmease · 2017-10-13T14:06:47Z

Thanks for the feedback @jreback. I think I've made all the changes you requested and I learned some things about pytest along the way.

jreback · 2017-10-14T14:54:50Z

thanks @jmmease nice patch! keep em coming!


Jon M. Mease added 4 commits October 10, 2017 16:43

Fix for GH17383

571a462

Added whatsnew entry

f9ae19a

Missed comment update during refactor

edfbc3f

This was referenced Oct 10, 2017

Groupby with matching column and index name emits spurious warning #17383

Closed

Support merging DataFrames on a combo of columns and index levels (GH 14355) #17484

Merged


gfyoung added API Design Groupby labels Oct 12, 2017


Parameterize and fixturize tests

3945107

jreback added this to the 0.21.0 milestone Oct 14, 2017

jreback merged commit e001500 into pandas-dev:master Oct 14, 2017

