BUG: KeyError is raised on .loc for MultiIndex if single keys exist but not the combination #50153

Closed
2 of 3 tasks
fwitte opened this issue Dec 9, 2022 · 2 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member


fwitte commented Dec 9, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd


idx = pd.IndexSlice

index = pd.MultiIndex.from_tuples(
    [("a", "b", "c"), ("a", "a", "c"), ("a", "c", "b")],
    names=["col1", "col2", "col3"]
)

df = pd.DataFrame(index=index, data={"value": [1, 2, 3]})

# Works and returns an empty DataFrame (for any key that is not present in level "col3").
df.loc[idx[:, ["a"], ["somekey"]]]
# Raises KeyError (for any key given for level "col3" that exists in that level,
# but not in combination with the keys given for the other levels).
# If the above is intended (returning an empty DataFrame), then this should return an empty DataFrame as well.
df.loc[idx[:, ["a"], ["b"]]]

Issue Description

See reproducible example.

To fix this, the lines

if not np.any(indexer) and np.any(lvl_indexer):
    raise KeyError(seq)

should not raise an exception but instead return

np.array([], dtype=np.intp)

as is already done when an item cannot be found in the current index level:

if lvl_indexer is None:
    # no matches we are done
    # test_loc_getitem_duplicates_multiindex_empty_indexer
    return np.array([], dtype=np.intp)
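
The proposed behavior can be illustrated with a minimal NumPy sketch. It is not the pandas implementation: `combine_level_masks` is a hypothetical helper that takes one boolean mask per index level, intersects them, and returns an empty `np.intp` array as soon as no row satisfies all levels so far, instead of raising `KeyError`.

```python
import numpy as np


def combine_level_masks(level_masks):
    """Intersect per-level boolean masks and return matching row positions.

    Hypothetical helper for illustration only; not the pandas implementation.
    """
    indexer = None
    for lvl_mask in level_masks:
        lvl_mask = np.asarray(lvl_mask, dtype=bool)
        indexer = lvl_mask if indexer is None else indexer & lvl_mask
        if not indexer.any():
            # Proposed behavior: no row satisfies every level so far,
            # so return an empty indexer instead of raising KeyError.
            return np.array([], dtype=np.intp)
    if indexer is None:
        # No masks were given at all.
        return np.array([], dtype=np.intp)
    return np.flatnonzero(indexer)


# Level values of the rows ("a", "b", "c"), ("a", "a", "c"), ("a", "c", "b")
col2 = np.array(["b", "a", "c"])
col3 = np.array(["c", "c", "b"])

# "a" exists in col2 and "b" exists in col3, but never in the same row.
missing_combo = combine_level_masks([col2 == "a", col3 == "b"])
# "a" in col2 together with "c" in col3 matches the second row.
existing_combo = combine_level_masks([col2 == "a", col3 == "c"])
```

Here `missing_combo` is an empty indexer, while `existing_combo` holds the position of the single matching row.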

Expected Behavior

Return an empty DataFrame if the combination of keys across the levels does not exist.

>>> df.loc[idx[:, ["a"], ["b"]]]
Empty DataFrame
Columns: [value]
Index: []
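
Regardless of how `.loc` handles this case in a given pandas version, the expected result can be obtained with boolean masks built from the index levels. The sketch below assumes the `df` from the reproducible example; `select_by_levels` is a hypothetical helper, not a pandas API.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "b", "c"), ("a", "a", "c"), ("a", "c", "b")],
    names=["col1", "col2", "col3"],
)
df = pd.DataFrame(index=index, data={"value": [1, 2, 3]})


def select_by_levels(frame, **level_keys):
    """Keep rows whose index level values are in the given key lists.

    Hypothetical helper: a missing key or a missing combination of keys
    yields an empty DataFrame rather than a KeyError.
    """
    mask = np.ones(len(frame), dtype=bool)
    for level, keys in level_keys.items():
        # Index.isin returns a boolean array, one entry per row.
        mask &= frame.index.get_level_values(level).isin(keys)
    return frame[mask]


# Both keys exist in their levels, but not in the same row.
empty = select_by_levels(df, col2=["a"], col3=["b"])
# This combination exists in the row ("a", "a", "c").
found = select_by_levels(df, col2=["a"], col3=["c"])
```

Because the masks are combined with a logical AND, a combination that does not occur simply leaves no rows selected.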

Installed Versions

INSTALLED VERSIONS

commit : 8dab54d
python : 3.9.13.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.cp1252

pandas : 1.5.2
numpy : 1.23.2
pytz : 2022.2.1
dateutil : 2.8.2
setuptools : 65.3.0
pip : 22.2.2
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : 5.1.1
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.3
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.1
snappy : None
sqlalchemy : None
tables : 3.7.0
tabulate : 0.8.10
xarray : 2022.6.0
xlrd : None
xlwt : None
zstandard : None
tzdata : None

@fwitte fwitte added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 9, 2022
fwitte added a commit to fwitte/pandas that referenced this issue Dec 9, 2022
Instead of raising a KeyError when a key is available in a level
but not found in combination with the previous keys, an empty array
is returned.
@fwitte fwitte mentioned this issue Dec 9, 2022
5 tasks
fwitte (Author) commented Dec 9, 2022

My bad, I mixed up the pandas installation in my environment with the cloned repository. I guess #42351 dealt with it, although I'd have preferred the other way round :(.

Thanks and have a nice weekend!

phofl (Member) commented Dec 9, 2022

Closing then

@phofl phofl closed this as completed Dec 9, 2022