DataFrame.xs() throws value error with non-unique multi index if level kwarg is called. #13719

Closed
lucashtnguyen opened this issue Jul 20, 2016 · 3 comments · Fixed by #22294
Labels: Bug, good first issue, MultiIndex, Testing

@lucashtnguyen

DataFrame.xs() raises a ValueError if a non-unique MultiIndex is used and the level kwarg is passed. However, if the level kwarg is not passed, or a tuple is passed as the key, the function behaves normally.

# StringIO lives in a different module on Python 2 and Python 3
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

import pandas as pd

sstring = """\
m,p,tstep,value,jday,normed_value,datetime
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
7,407,0,1,564.5,5.75,1964-07-18 12:00:00
8,408,0,1,564.5,6.75,1964-07-18 12:00:00
"""

subset = pd.read_csv(StringIO(sstring),
                     index_col=['m', 'p'],
                     parse_dates=['datetime'])

Example output of various use cases

Working example 1
subset.xs(6)

     tstep  value   jday  normed_value            datetime
p                                                         
407      0      1  564.5          5.75 1964-07-18 12:00:00
407      0      1  564.5          5.75 1964-07-18 12:00:00

Working example 2
subset.xs((6,), level=['m'])

     tstep  value   jday  normed_value            datetime
p                                                         
407      0      1  564.5          5.75 1964-07-18 12:00:00
407      0      1  564.5          5.75 1964-07-18 12:00:00

Working example 3
subset.xs((6, 407), level=['m', 'p'])

       tstep  value   jday  normed_value            datetime
m p                                                         
6 407      0      1  564.5          5.75 1964-07-18 12:00:00
  407      0      1  564.5          5.75 1964-07-18 12:00:00

Value error

I would expect this to return the same as example 1 or 2

subset.xs(6, level='m')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-1d10441983d6> in <module>()
     20                      parse_dates=['datetime'])
     21 
---> 22 subset.xs(6, level='m')

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, copy, drop_level)
   1734 
   1735             result = self.ix[indexer]
-> 1736             setattr(result, result._get_axis_name(axis), new_ax)
   1737             return result
   1738 

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in __setattr__(self, name, value)
   2683         try:
   2684             object.__getattribute__(self, name)
-> 2685             return object.__setattr__(self, name, value)
   2686         except AttributeError:
   2687             pass

pandas\src\properties.pyx in pandas.lib.AxisProperty.__set__ (pandas\lib.c:44748)()

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in _set_axis(self, axis, labels)
    426 
    427     def _set_axis(self, axis, labels):
--> 428         self._data.set_axis(axis, labels)
    429         self._clear_item_cache()
    430 

C:\Anaconda3\lib\site-packages\pandas\core\internals.py in set_axis(self, axis, new_labels)
   2633             raise ValueError('Length mismatch: Expected axis has %d elements, '
   2634                              'new values have %d elements' %
-> 2635                              (old_len, new_len))
   2636 
   2637         self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 4 elements, new values have 2 elements
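
A possible workaround until this is fixed, sketched under the assumption that boolean indexing on the level values does not go through the failing code path in xs() (not verified against 0.18.1). It builds a mask from the 'm' level and then drops that level, which should give the same rows as working examples 1 and 2. The names csv_text, mask and selected are only illustrative.

```python
from io import StringIO

import pandas as pd

csv_text = """\
m,p,tstep,value,jday,normed_value,datetime
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
7,407,0,1,564.5,5.75,1964-07-18 12:00:00
8,408,0,1,564.5,6.75,1964-07-18 12:00:00
"""

subset = pd.read_csv(StringIO(csv_text),
                     index_col=['m', 'p'],
                     parse_dates=['datetime'])

# Select rows where level 'm' equals 6 without calling xs()
mask = subset.index.get_level_values('m') == 6

# Drop the 'm' level so only 'p' remains in the index, as in example 1
selected = subset[mask].reset_index(level='m', drop=True)
print(selected)
```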

output of pd.show_versions()

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

@jorisvandenbossche
Member

Thanks for reporting!

jorisvandenbossche added this to the Next Major Release milestone Jul 20, 2016
@phobson

phobson commented Apr 4, 2017

Just an FYI...I recently ran into this.

@jorisvandenbossche
Member

This is actually now working for me on master. So if someone wants to add a test for this?
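
A regression test for this could look roughly like the sketch below. It reuses the data from the original report and checks that passing level='m' gives the same frame as the plain xs(6) call from working example 1. The test name and the csv_text variable are hypothetical, and this is not necessarily the test that ended up in the pandas test suite.

```python
from io import StringIO

import pandas as pd
import pandas.testing as tm

csv_text = """\
m,p,tstep,value,jday,normed_value,datetime
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
7,407,0,1,564.5,5.75,1964-07-18 12:00:00
8,408,0,1,564.5,6.75,1964-07-18 12:00:00
"""


def test_xs_level_kwarg_non_unique_multiindex():
    # GH 13719: xs() with the level kwarg on a non-unique MultiIndex
    subset = pd.read_csv(StringIO(csv_text),
                         index_col=['m', 'p'],
                         parse_dates=['datetime'])

    result = subset.xs(6, level='m')

    # Expected: same rows as the call without the level kwarg
    expected = subset.xs(6)
    tm.assert_frame_equal(result, expected)
```

On a version that still has the bug, the xs(6, level='m') call should raise the ValueError from the report, so the test fails before reaching the comparison.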

jorisvandenbossche added the Difficulty Novice and Testing labels Apr 5, 2017
jreback modified the milestones: 0.20.1, Next Major Release, Next Minor Release Apr 20, 2017
jreback modified the milestones: Interesting Issues, Next Major Release Nov 26, 2017
nmusolino pushed a commit to nmusolino/pandas that referenced this issue Aug 12, 2018
nmusolino pushed a commit to nmusolino/pandas that referenced this issue Aug 12, 2018
jreback modified the milestones: Contributions Welcome, 0.24.0 Sep 15, 2018