Using the latest version of pandas v0.25.1
    import numpy as np
    import pandas as pd

    np.random.seed(12345)
    df1 = pd.DataFrame({
        'category': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
        'value': np.random.randint(1, 10, 8)
    })
    df1.groupby("category").value.quantile([0.25, 0.75])
produces
    category
    A         0.25    2.75
              0.75    5.25
    B         0.25    2.75
              0.75    6.25
    Name: value, dtype: float64
as expected. However, running this
    np.random.seed(12345)
    df2 = pd.DataFrame({
        'category': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
        'value': np.random.randint(1, 10, 12)
    })
    df2.groupby("category").value.quantile([0.25, 0.75])
produces this error instead:
    IndexError                                Traceback (most recent call last)
    <ipython-input-60-12c4dbb665fc> in <module>
          8 })
          9
    ---> 10 df2.groupby("category").value.quantile([0.25, 0.75])

    ~/anaconda3/envs/dabest-dev-py3.7/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in quantile(self, q, interpolation)
       1951 indices = np.concatenate(arrays)
       1952 assert len(indices) == len(result)
    -> 1953 return result.take(indices)
       1954
       1955 @Substitution(name="groupby")

    ~/anaconda3/envs/dabest-dev-py3.7/lib/python3.7/site-packages/pandas/core/series.py in take(self, indices, axis, is_copy, **kwargs)
       4430
       4431 indices = ensure_platform_int(indices)
    -> 4432 new_index = self.index.take(indices)
       4433
       4434 if is_categorical_dtype(self):

    ~/anaconda3/envs/dabest-dev-py3.7/lib/python3.7/site-packages/pandas/core/indexes/multi.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
       2030 allow_fill=allow_fill,
       2031 fill_value=fill_value,
    -> 2032 na_value=-1,
       2033 )
       2034 return MultiIndex(

    ~/anaconda3/envs/dabest-dev-py3.7/lib/python3.7/site-packages/pandas/core/indexes/multi.py in _assert_take_fillable(self, values, indices, allow_fill, fill_value, na_value)
       2058 taken = masked
       2059 else:
    -> 2060 taken = [lab.take(indices) for lab in self.codes]
       2061 return taken
       2062

    ~/anaconda3/envs/dabest-dev-py3.7/lib/python3.7/site-packages/pandas/core/indexes/multi.py in <listcomp>(.0)
       2058 taken = masked
       2059 else:
    -> 2060 taken = [lab.take(indices) for lab in self.codes]
       2061 return taken
       2062

    IndexError: index 6 is out of bounds for size 6
The expected output is produced with pandas=0.24:

    df2.groupby("category").value.quantile([0.25, 0.75])

    category
    A         0.25    2.75
              0.75    5.25
    B         0.25    2.75
              0.75    6.25
    C         0.25    1.75
              0.75    7.25
I'm not sure how to mitigate this.
I understand a related bug was patched with #28285 and #27526.
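One possible mitigation, sketched here as an assumption rather than a confirmed fix, is to compute the quantiles per group with `Series.quantile` through `groupby(...).apply(...)`, which should not go through the `GroupBy.quantile` code path shown in the traceback. The helper name `quantiles_per_group` is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd


def quantiles_per_group(df, by, col, qs):
    """Hypothetical helper: call Series.quantile once per group.

    This sidesteps GroupBy.quantile by applying Series.quantile to each
    group's values. It is expected to be slower on large frames.
    """
    return df.groupby(by)[col].apply(lambda s: s.quantile(qs))


# Same data as the failing example in the report.
np.random.seed(12345)
df2 = pd.DataFrame({
    'category': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
    'value': np.random.randint(1, 10, 12)
})

result = quantiles_per_group(df2, "category", "value", [0.25, 0.75])
print(result)
```

The result is a Series indexed by category and quantile, the same shape as the pandas 0.24 output above. Whether it avoids the error on 0.25.1 specifically has not been verified here.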
Output of pd.show_versions():

    INSTALLED VERSIONS
    commit : None
    pandas : 0.25.1
    numpy : 1.16.2
    pytz : 2019.2
    dateutil : 2.8.0
    pip : 19.2.3
    setuptools : 41.2.0
    Cython : None
    pytest : 4.3.0
    hypothesis : None
    sphinx : 2.2.0
    blosc : None
    feather : None
    xlsxwriter : None
    lxml.etree : None
    html5lib : None
    pymysql : None
    psycopg2 : None
    jinja2 : 2.10.1
    IPython : 7.2.0
    pandas_datareader: None
    bs4 : None
    bottleneck : None
    fastparquet : None
    gcsfs : None
    lxml.etree : None
    matplotlib : 3.1.1
    numexpr : None
    odfpy : None
    openpyxl : None
    pandas_gbq : None
    pyarrow : None
    pytables : None
    s3fs : None
    scipy : 1.2.1
    sqlalchemy : None
    tables : None
    xarray : None
    xlrd : 1.2.0
    xlwt : None
    xlsxwriter : None
Fixed in #28113 I think (0.25.2)
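If the fix did land in 0.25.2 as suggested (the comment is not certain about it), a quick check of the installed version could look like the sketch below. `ASSUMED_FIXED_IN`, `version_tuple`, and `at_least_assumed_fix_release` are hypothetical names introduced for illustration:

```python
import re

import pandas as pd

# Assumption: the thread suggests, without confirming, that the fix
# shipped in pandas 0.25.2.
ASSUMED_FIXED_IN = (0, 25, 2)


def version_tuple(version):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string into ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def at_least_assumed_fix_release(version=pd.__version__):
    """Return True if `version` is at or above the assumed fix release."""
    return version_tuple(version) >= ASSUMED_FIXED_IN


print(pd.__version__, at_least_assumed_fix_release())
```

This only compares version numbers. It does not prove the bug is absent, and per the report pandas 0.24 was not affected even though it sorts below the assumed fix release.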