Bug groupby quantile listlike q and int columns #30485

@fujiaxiang fujiaxiang commented Dec 26, 2019

When the column labels are integers, df.groupby(label).quantile(<arraylike>) fails; the linked issue reports an AssertionError.
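A minimal reproduction sketch, assuming a small made-up frame whose column labels are the default integers 0, 1, 2 (the data are illustrative and not taken from the linked issue). On a pandas version that includes this fix, the call is expected to return a frame indexed by group key and quantile:

```python
import pandas as pd

# Hypothetical data: no column names are given, so the labels are the integers 0, 1, 2.
df = pd.DataFrame(
    [
        [1, 10, 100],
        [1, 20, 200],
        [2, 30, 300],
        [2, 40, 400],
    ]
)

# Group by the integer-labelled column 0 and request several quantiles at once.
# This list-like q combined with integer column labels is the case the PR targets.
result = df.groupby(0).quantile([0.25, 0.75])

# The row index has two levels: the group key (outer) and the quantile (inner).
print(result)
```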

@fujiaxiang
Member Author

@jreback I know this isn't exactly what you had in mind in our discussion in #30462, but it is by far the cleanest implementation I can think of, for two reasons:

  1. I'm hoping to keep the call to _get_cythonized_result the same whether q is a scalar or not. Hence we end up with a list of DataFrames, one per quantile. Each may carry an Index or a MultiIndex, so building the final index for them one by one does not feel clean.
  2. If we concatenate them first, reorder_levels seems the most natural way to finish the job. There's no clean way (that I know of) to place the quantile level innermost in the index before concatenating.
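The two points above can be sketched with public pandas calls. This is an illustration only, not the code in the PR: the per-quantile groupby call stands in for the internal _get_cythonized_result, and the names df, qs, per_q, and stacked are made up for the example.

```python
import pandas as pd

# Hypothetical input data.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "val": [1.0, 3.0, 10.0, 30.0]}
)
qs = [0.25, 0.75]

# Point 1: the same call for every quantile, giving a list of DataFrames.
per_q = [df.groupby("key").quantile(q) for q in qs]

# Point 2: concatenate first; `keys` adds the quantile as the OUTER index level.
stacked = pd.concat(per_q, axis=0, keys=qs)

# Move the quantile level (position 0) to the innermost position.
# Positions are used here rather than names.
order = list(range(1, stacked.index.nlevels)) + [0]
result = stacked.reorder_levels(order)

# Sorting is only for a deterministic row order in this sketch.
result = result.sort_index()
print(result)
```

Addressing levels by position rather than by name matters for the bug at hand: when index names are themselves integers, a name can be mistaken for a position.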

On a side note, I replaced the original for-loop, which was slow and harder to read, with a numpy expression that constructs the same array (the indices variable).
Let me know what you think, and I will adjust accordingly!
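A sketch of the kind of refactor described, under stated assumptions: the loop below is a guess at the general shape of such a loop, not a quote from pandas, and n_quantiles and n_groups are hypothetical sizes. After concatenation the rows are ordered quantile by quantile; the index array picks them out group by group instead.

```python
import numpy as np

n_quantiles, n_groups = 3, 2  # hypothetical sizes
total = n_quantiles * n_groups

# Loop version: for each group, collect that group's row position
# within every per-quantile block.
base = np.arange(0, total, n_groups)
pieces = []
for i in range(n_groups):
    pieces.append(base + i)
loop_indices = np.concatenate(pieces)

# Vectorised version: lay the positions out as (quantile, group),
# transpose to (group, quantile), then flatten row by row.
vec_indices = np.arange(total).reshape(n_quantiles, n_groups).T.flatten()

print(loop_indices, vec_indices)
```

Both arrays can then be passed to something like DataFrame.take to reorder the concatenated rows.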

@jreback jreback left a comment

@fujiaxiang looks pretty good; nice that you didn't have to do a major refactor

doc/source/whatsnew/v1.0.0.rst (outdated)
pandas/core/groupby/groupby.py
pandas/core/groupby/groupby.py
pandas/core/groupby/groupby.py (outdated)
@jreback jreback added Bug Groupby MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Dec 26, 2019
pandas/core/groupby/groupby.py
pandas/tests/groupby/test_function.py (outdated)
@jreback jreback added this to the 1.0 milestone Dec 26, 2019
@jreback jreback merged commit 8e9b3ee into pandas-dev:master Dec 27, 2019

jreback commented Dec 27, 2019

thanks @fujiaxiang nicely done! keep em coming!

@fujiaxiang fujiaxiang deleted the bug_groupby_quantile_listlike_q_and_int_columns branch December 27, 2019 16:42
AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019
keechongtan added a commit to keechongtan/pandas that referenced this pull request Dec 29, 2019
Successfully merging this pull request may close these issues.

groupby.quantile(<arraylike>) fails with AssertionError
2 participants