IndexError: tuple index out of range after upgrade to 0.25 #27775
Comments
pls update the top section with a reproducible example; links to additional material are fine but the source material and versions should be here
@jreback I have actually downgraded Pandas from
The reproducible example is actually very simple:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
headers = ['fx', 'fy', 'fz', 'tx', 'ty', 'tz', 'currentr',
           'time', 'theta', 'omegay', 'currenty', 'pr', 'Dc', 'Fr', 'Fl']
df = pd.read_csv('data.csv', names=headers)
fig3 = plt.figure()
plt.plot(df.index, df['time'])
plt.show()
Nothing particularly specific. More details, including the CSV file, are here. Please let me know if this is satisfactory. Thanks in advance for your support.
pls try to reduce this to a copy pastable example w/o any external links
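A copy-pastable variant of the snippet above could replace the external `data.csv` with an in-memory stand-in. This is only a sketch: the random values and the 5-row size are assumptions made for illustration, not the reporter's data, and the plotting call is left as a comment so that the block needs only numpy and pandas.

```python
import io

import numpy as np
import pandas as pd

headers = ['fx', 'fy', 'fz', 'tx', 'ty', 'tz', 'currentr',
           'time', 'theta', 'omegay', 'currenty', 'pr', 'Dc', 'Fr', 'Fl']

# Assumption: synthetic numbers stand in for the real data.csv.
rng = np.random.default_rng(0)
csv_text = pd.DataFrame(rng.random((5, len(headers)))).to_csv(
    header=False, index=False)

df = pd.read_csv(io.StringIO(csv_text), names=headers)

# Converting to plain ndarrays means a plotting library never has to
# index the pandas Index object itself.
x = df.index.to_numpy()
y = df['time'].to_numpy()

# plt.plot(x, y) would then receive two 1D numpy arrays.
print(x.shape, y.shape)
```

Passing `df.index.to_numpy()` instead of `df.index` is a possible workaround on the user's side, not a fix for the underlying behaviour discussed below.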
Dear @jreback, @anntzer has provided a small example showing the difference between
@Foadsf I updated the top post with that example
So the root cause is that we don't handle a 2D indexer on an Index class well. The source of pandas/pandas/core/indexes/base.py Lines 4241 to 4242 in 640d9e1
but that clearly does not happen (anymore).
Though I don't think returning an ndarray is appropriate, right? I'd be surprised to have What's the best path forward? IMO raising is the most correct thing to do. But is it worth changing?
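To make the "raising" option concrete, here is a minimal sketch of what rejecting a multi-dimensional result could look like. `getitem_1d_only` is a hypothetical helper written for this illustration; it is not pandas code and does not reflect how `Index.__getitem__` is implemented.

```python
import numpy as np


def getitem_1d_only(values, key):
    """Hypothetical helper: index `values`, refusing results with ndim > 1."""
    result = np.asarray(values)[key]
    if np.ndim(result) > 1:
        # A 1D container cannot represent this result consistently.
        raise ValueError(
            "multi-dimensional indexing is not supported here; "
            "convert to a numpy array before indexing"
        )
    return result


print(getitem_1d_only([1, 2, 3], slice(None)))  # ordinary 1D slice is fine

try:
    getitem_1d_only([1, 2, 3], (slice(None), None))  # same as [:, None]
except ValueError as exc:
    print("raised:", exc)
```

The trade-off is the one raised in the comment: raising is the strictest behaviour, but it breaks callers that relied on getting a 2D result back.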
This was "caused" by #27384, which optimized But of course the bottom line is still that an Index with 2D values is an invalid index object:
I think short term, the easiest option is to revert the But longer term this is not really a good solution. I suppose the reason that it returned a 2D array before might have been that it was an ndarray subclass, and in general it might be useful to have the Index be seen as an array-like that behaves well in code that expects a numpy-like array. BTW, Series actually does this:
The Series case only works for actual numpy dtypes. E.g. for categorical it returns a Series but goes wrong in all kinds of ways:
From Matplotlib's point of view, returning a numpy array is just fine (as we are trying to duck-type as a
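The failure mode can be reproduced without pandas or Matplotlib. The sketch below is an assumption-laden illustration: `FlatIndexLike` is a hypothetical stand-in for an object that stays 1D when indexed with `[:, None]`, and `n_columns` is a hypothetical consumer that promotes its input to 2D and then reads the second dimension. Neither is actual Matplotlib or pandas code.

```python
import numpy as np


class FlatIndexLike:
    """Hypothetical container that ignores a new axis and stays 1D."""

    def __init__(self, values):
        self._values = np.asarray(values)

    @property
    def shape(self):
        return self._values.shape

    def __getitem__(self, key):
        # Deliberately drop the requested extra dimension.
        return FlatIndexLike(self._values)


def n_columns(x):
    """Hypothetical consumer: promote 1D input to 2D, then count columns."""
    if len(x.shape) == 1:
        x = x[:, np.newaxis]
    return x.shape[1]


print(n_columns(np.arange(3)))  # a real ndarray becomes shape (3, 1)

try:
    n_columns(FlatIndexLike([0, 1, 2]))
except IndexError as exc:
    print("IndexError:", exc)
```

With a real ndarray the promotion works; with the 1D-only stand-in, `x.shape[1]` fails with the same "tuple index out of range" message as in the issue title.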
This is also related to #27125 (the fact that we can create an Index with a >1 dimensional array). For a 0.25.1 bugfix release, I would propose to again start returning the 2D shape.
Root cause (in both cases using df = pd.DataFrame({'a': [1, 2, 3]})):
vs
So before, indexing with [:, None] (in numpy, a way to add a dimension to get a 2D array) actually resulted in an Index with ndim of 2 (which is of course an inconsistent state for the Index object).
Matplotlib relied on this fact when an Index is passed to plt.plot, as reported in matplotlib/matplotlib#14992.
I have explained the issue here and here in detail. Basically, after upgrading to version 0.25 I got the error:
while attempting to plot a CSV file.
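For readers unfamiliar with the `[:, None]` idiom, the following sketch shows what it does on a plain numpy array, and how converting the Index to an ndarray first gives the same numpy semantics. It deliberately avoids calling `df.index[:, None]` directly, since the result of that expression is exactly what differs between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

# Plain numpy: [:, None] appends a new axis, so shape (3,) becomes (3, 1).
arr = np.array([1, 2, 3])
col = arr[:, None]
print(arr.shape, col.shape)

# Converting the Index to an ndarray first makes the indexing pure numpy,
# independent of how Index.__getitem__ treats a 2D indexer.
idx_col = np.asarray(df.index)[:, None]
print(idx_col.shape, idx_col.ndim)
```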