ENH: Inconsistency when string refers both to index level and column label #34791

Closed
ChrisStuff opened this issue Jun 15, 2020 · 2 comments
Labels: Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)

ChrisStuff commented Jun 15, 2020

While groupby throws a ValueError when the string parameter refers both to an index level and column name, query gives the column precedence in such a case:

>>> df = pd.DataFrame({'a':[1,2,3,4,5], 'b':[3,3,3,3,3]})
>>> df.index.name = 'a'
>>> df.query('a < b')
   a  b
a
0  1  3
1  2  3
>>> df.groupby('a') 
Traceback (most recent call last):
ValueError: 'a' is both an index level and a column label, which is ambiguous.

There are probably many other places where a string can refer to both an index level and a column name. #27652 and #8162 appear to be related.

Even though those issues might not be solved, it would be desirable to handle the above ambiguity consistently in places where both index levels and column labels can be referenced by a string argument.

ChrisStuff added the Enhancement and Needs Triage labels Jun 15, 2020
Contributor

jreback commented Jun 15, 2020

Please provide an example.

@TomAugspurger
Contributor

I think this one is not ambiguous, since df.query('a < b') is equivalent to df[df['a'] < df['b']], and there's no ambiguity about what df['a'] refers to.

The groupby case is ambiguous since by='a' can be either the column or index level.
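
The distinction above can be checked with a small sketch that reuses the frame from the original report. The variable names (via_query, via_mask, groupby_error, by_column, by_level) are illustrative only, and the two disambiguation calls at the end are one possible workaround, not something proposed in this thread.

```python
import pandas as pd

# Same frame as in the report: "a" is both a column and the index name.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [3, 3, 3, 3, 3]})
df.index.name = "a"

# query resolves "a" to the column, so it matches boolean indexing on columns.
via_query = df.query("a < b")
via_mask = df[df["a"] < df["b"]]

# groupby cannot tell whether "a" means the column or the index level.
try:
    df.groupby("a")
    groupby_error = None
except ValueError as exc:
    groupby_error = str(exc)

# One way to say explicitly which "a" is meant (an assumption of this
# sketch, not a recommendation made in the thread):
by_column = df.groupby(df["a"]).size()    # group by the column values
by_level = df.groupby(level="a").size()   # group by the index level
```

Passing the column as a Series, or using the level keyword, avoids the string lookup that triggers the ambiguity check.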
