Feature request: Access multiindex by name #14152

Closed
dov opened this issue Sep 5, 2016 · 4 comments

@dov

dov commented Sep 5, 2016

Code Sample, a copy-pastable example if possible

I have not found an easy way to access a level of a MultiIndex by name. Consider the following code:

import pandas as pd
df = pd.DataFrame([[1,1,10,200],
                   [1,2,11,201],
                   [1,3,12,202],
                   [2,1,13,210],
                   [2,2,14,230]],
                  columns=list('ABCD')).set_index(['A','B'])
ii = df.index

To access the values of level B, I can do, for example:

df.reset_index('B').B.tolist()

or

[v[1] for v in ii.tolist()]

Both of these are somewhat cumbersome. What I'm proposing is to add shortcut access to a MultiIndex, similar to what a DataFrame offers for its columns. E.g.

ii.B

or

ii['B']

that would return the same list as in the two examples above.

output of pd.show_versions()

0.18.1

@jorisvandenbossche
Member

Related issue / possibly duplicate: #10816

@jreback
Contributor

jreback commented Sep 5, 2016

This is the idiom, which is actually not bad:

In [3]: df.index.get_level_values('B')
Out[3]: Int64Index([1, 2, 3, 1, 2], dtype='int64', name=u'B')

Returning a list is non-performant / not-idiomatic.
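As a minimal sketch of the idiom above, using the same `df` as the original report: `get_level_values` accepts either the level name or its integer position, and the result can be converted with `.tolist()` at the point where a plain list is really needed. The variable names below are illustrative and not part of the thread.

```python
import pandas as pd

# Same frame as in the original report.
df = pd.DataFrame(
    [[1, 1, 10, 200],
     [1, 2, 11, 201],
     [1, 3, 12, 202],
     [2, 1, 13, 210],
     [2, 2, 14, 230]],
    columns=list("ABCD"),
).set_index(["A", "B"])

# Look up the level by name; the result is an Index, not a list.
b_by_name = df.index.get_level_values("B")

# The same level addressed by its integer position.
b_by_position = df.index.get_level_values(1)

# Convert only where a plain Python list is actually required.
b_list = b_by_name.tolist()
print(b_list)
```

Keeping the result as an Index until the last step preserves the level name and avoids building an intermediate Python list.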

@jreback jreback closed this as completed Sep 5, 2016
@jreback jreback added this to the No action milestone Sep 5, 2016
@dov
Author

dov commented Sep 6, 2016

Nice! I missed that idiom.

But, still, is there any reason that:

df.index.B 

is not a shortcut for df.index.get_level_values('B')? What makes the dataframe special?

@jreback
Contributor

jreback commented Sep 6, 2016

Doing df.index.B looks simple, and generally it is; however, there are some edge cases that make it untenable:

For a smooth user experience, df.index['B'] would need to work the same way. That is in direct conflict with what [] already means on an index: positional (or boolean-mask) selection of entries, both for a regular Index and for a MultiIndex:

In [7]: df.index[0]
Out[7]: (1, 1)

In [8]: df.index[[True]*len(df.index)]
Out[8]: 
MultiIndex(levels=[[1, 2], [1, 2, 3]],
           labels=[[0, 0, 0, 1, 1], [0, 1, 2, 0, 1]],
           names=[u'A', u'B'])

You might think that is OK, until you realize that 0 could also mean level=0. Then what do you do?
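The conflict can be sketched as follows. This is only an illustration of the behaviour described in the comment above, not a statement about how pandas is implemented; the index is rebuilt with `MultiIndex.from_tuples` and the variable names are made up for the example.

```python
import pandas as pd

# Same index contents as df.index in the original report.
mi = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)], names=["A", "B"]
)

# An integer inside [] is positional: it selects one entry (a tuple).
first_entry = mi[0]

# A boolean mask inside [] filters entries and keeps the MultiIndex type.
mask = [True, False, True, False, True]
filtered = mi[mask]

# Level access goes through its own method, so 0 here can only mean "level 0".
level_zero = mi.get_level_values(0)

print(first_entry)
print(filtered.tolist())
print(level_zero.tolist())
```

Because `[]` is already taken by entry selection, overloading it (or attribute access) for level lookup would leave `mi[0]` with two plausible meanings, whereas `get_level_values` keeps the two operations separate.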
