concat over MultiIndex does not hierarchify further #689

Closed
twiecki opened this issue Jan 26, 2012 · 5 comments

@twiecki
Contributor

twiecki commented Jan 26, 2012

df2
Out[62]: 
          A   B      C       D       
level1a 0 foo one   -0.3364 -1.212780
        1 bar one    1.5044  0.832076
level1b 2 foo two    0.1210 -0.547690
        3 bar three -0.6163  0.948797
level1c 4 foo two    0.4783 -1.295089
        5 bar two   -0.6558 -1.515122
level1d 6 foo one   -1.1923 -0.884387
        7 foo three -0.6004  0.003952

pd.concat([df2[0:4], df2[4:8]], keys=['level0a', 'level0b'])
Out[66]: 
                  A   B      C       D       
level0a level1a 0 foo one   -0.3364 -1.212780
                1 bar one    1.5044  0.832076
level0a level1b 2 foo two    0.1210 -0.547690
                3 bar three -0.6163  0.948797
level0b level1c 4 foo two    0.4783 -1.295089
                5 bar two   -0.6558 -1.515122
level0b level1d 6 foo one   -1.1923 -0.884387
                7 foo three -0.6004  0.003952

I would instead expect level0a and level0b to each appear only once at the first level.
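For readers who want to reproduce the report, here is a minimal sketch that rebuilds a frame shaped like df2 from the values printed above and concatenates its two halves with keys. It assumes a current pandas release rather than the 2012 version under discussion, and it uses `.iloc` for the row slices instead of the plain `df2[0:4]` from the report.

```python
import pandas as pd

# Two-level index copied from the printed df2 above.
index = pd.MultiIndex.from_arrays(
    [
        ["level1a", "level1a", "level1b", "level1b",
         "level1c", "level1c", "level1d", "level1d"],
        list(range(8)),
    ]
)
df2 = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
        "C": [-0.3364, 1.5044, 0.1210, -0.6163,
              0.4783, -0.6558, -1.1923, -0.6004],
        "D": [-1.212780, 0.832076, -0.547690, 0.948797,
              -1.295089, -1.515122, -0.884387, 0.003952],
    },
    index=index,
)

# keys= adds a new outermost index level on top of the two existing ones.
result = pd.concat([df2.iloc[0:4], df2.iloc[4:8]], keys=["level0a", "level0b"])
print(result.index.nlevels)
print(result)
```

How the outer level is printed depends on the pandas version and display options; the index itself has three levels either way.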

@wesm
Member

wesm commented Jan 26, 2012

The output is exactly what I would expect:

In [8]: concat([df2[0:4], df2[4:8]], keys=['level0a', 'level0b']).index
Out[8]: 
MultiIndex([('level0a', 'level1a', 0), ('level0a', 'level1a', 1),
       ('level0a', 'level1b', 2), ('level0a', 'level1b', 3),
       ('level0b', 'level1c', 4), ('level0b', 'level1c', 5),
       ('level0b', 'level1d', 6), ('level0b', 'level1d', 7)], dtype=object)

Could you explain more what you mean / what is the issue?
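A small sketch of the same check on a cut-down frame, assuming a current pandas release: the index tuples carry the outer key on every row regardless of how the frame is printed, so selection by the outer key works as expected. The frame `part` and its single column are illustrative stand-ins for df2.

```python
import pandas as pd

# Hypothetical four-row stand-in for df2, with the same two-level index style.
inner = pd.MultiIndex.from_tuples(
    [("level1a", 0), ("level1a", 1), ("level1b", 2), ("level1b", 3)]
)
part = pd.DataFrame({"C": [-0.3364, 1.5044, 0.1210, -0.6163]}, index=inner)

result = pd.concat([part.iloc[0:2], part.iloc[2:4]], keys=["level0a", "level0b"])

# Every row carries its full (outer, inner, position) label.
tuples = result.index.tolist()
outer = result.index.get_level_values(0).tolist()

# Selecting on the outer key returns the rows under it, indexed by the
# remaining two levels.
subset = result.loc["level0a"]

print(tuples)
print(subset)
```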

@twiecki
Contributor Author

twiecki commented Jan 26, 2012

I would expect the following output:

                  A   B      C       D       
level0a level1a 0 foo one   -0.3364 -1.212780
                1 bar one    1.5044  0.832076
        level1b 2 foo two    0.1210 -0.547690
                3 bar three -0.6163  0.948797
level0b level1c 4 foo two    0.4783 -1.295089
                5 bar two   -0.6558 -1.515122
        level1d 6 foo one   -1.1923 -0.884387
                7 foo three -0.6004  0.003952

The index values are correct; I just wonder why the topmost level is displayed redundantly.

@wesm
Member

wesm commented Jan 26, 2012

Ah. That is just how it's outputted. Initially I thought that displaying all the levels at every "tick" outside the lowest level would aid readability. But maybe not. I don't have a strong opinion on the matter, honestly.
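In current pandas releases the two display styles discussed here can be compared with the `display.multi_sparse` option. This is a sketch under that assumption; the option postdates the 2012 code in this thread, and the small frame below is illustrative rather than the original df2.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("level0a", "level1a", 0),
        ("level0a", "level1a", 1),
        ("level0a", "level1b", 2),
        ("level0a", "level1b", 3),
    ]
)
df = pd.DataFrame({"C": [-0.3364, 1.5044, 0.1210, -0.6163]}, index=index)

# Sparse display: repeated outer labels are blanked out (tree-like output).
with pd.option_context("display.multi_sparse", True):
    sparse_text = df.to_string()

# Dense display: every row prints all of its index labels.
with pd.option_context("display.multi_sparse", False):
    dense_text = df.to_string()

print(sparse_text)
print(dense_text)
```

Only the printed form differs between the two; the underlying index is identical.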

@twiecki
Contributor Author

twiecki commented Jan 26, 2012

For what it's worth, I'd personally prefer it the other way around. I think repeating the labels clutters up the display, and you additionally have to work out where the next item starts by comparing them. I also think it would be more consistent.


@twiecki
Contributor Author

twiecki commented Jan 27, 2012

Thanks for making the change, looks much better now I think.

twiecki closed this as completed Jan 27, 2012
yarikoptic added a commit to neurodebian/pandas that referenced this issue Feb 10, 2012
* commit 'v0.7.0rc1-73-g69d5bd8': (44 commits)
  BUG: integer slices should never access label-indexing, GH pandas-dev#700
  BUG: pandas-dev#680 clean up with check for py3compat
  BUG: pandas-dev#680 rears again. cut off another hydra head
  ENH: change to tree-like MultiIndex output with > 2 levels, GH pandas-dev#689
  TST: added a test related to pandas-dev#680
  BUG: related to closes pandas-dev#691, removed cruft
  BUG: closes pandas-dev#691, assignment with ix and mixed dtypes
  BUG: handle incomparable values when creating Factor, caused bug in py3
  TST: Fixes for tests on Python 3.
  BUG: pandas-dev#680, print consistently when dataframe is empty
  TST: unit test for PR pandas-dev#684
  ENH: Allow Series.to_csv to ignore the index.
  BUG: raise exception in DateRange with MonthEnd(0) instead of infinite loop, GH pandas-dev#683
  BUG: unbox 0-dimensional arrays in map_infer, GH pandas-dev#690
  updated license and credits for overview
  ENH: cythonize timestamp conversion in HDFStore
  TST: ok, this appears to work GH pandas-dev#680
  TST: even more woes GH pandas-dev#680
  TST: unicode woes on windoze GH pandas-dev#680
  TST: unicode codec test issue, GH pandas-dev#680
  ...