API: better error-handling for df.set_index #22484

Closed
h-vetinari opened this issue Aug 23, 2018 · 0 comments · Fixed by #22486
Labels: Error Reporting (incorrect or improved errors from pandas), Indexing (related to indexing on series/frames, not to indexes themselves)
@h-vetinari
Contributor

Split off from #22236.

Take the following frame as an example (assuming numpy is imported as np and pandas as pd):
df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))

The error handling of df.set_index can be improved in at least three cases:

  1. df.set_index(['A', 'A'], drop=False) works, while
    df.set_index(['A', 'A'], drop=True) yields
    KeyError: 'A'
  2. Objects of an unsupported type raise KeyError instead of TypeError:
    df.set_index(map(str, df.A))
    KeyError: "None of [Index([...], dtype='object')] are in the [columns]"
  3. df.set_index(['foo', 'bar', 'baz']) reports only the first missing key:
    KeyError: 'foo' (buried in a very long stack trace)
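
To check the three cases above against an installed pandas, a small script along these lines may help. It is only a sketch: the helper name outcome and the case labels are made up for illustration, and the results depend on the pandas version, so no expected output is given.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 5), columns=list("ABCDE"))


def outcome(make_keys, **kwargs):
    """Call df.set_index and return "ok" or the name of the exception raised.

    Hypothetical helper for this reproduction only. The keys are built by
    a callable so that one-shot iterators such as map() are created fresh.
    """
    try:
        df.set_index(make_keys(), **kwargs)
    except Exception as exc:  # broad on purpose: only the exception type matters
        return type(exc).__name__
    return "ok"


outcomes = {
    "duplicate keys, drop=False": outcome(lambda: ["A", "A"], drop=False),
    "duplicate keys, drop=True": outcome(lambda: ["A", "A"], drop=True),
    "map object as keys": outcome(lambda: map(str, df.A)),
    "missing keys": outcome(lambda: ["foo", "bar", "baz"]),
}

for case, result in outcomes.items():
    print(f"{case}: {result}")
```

Because set_index returns a new frame by default, df itself is left unchanged between the calls.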

Better behavior would be:

  1. Gracefully handle duplicate column names when drop=True
  2. Raise a clearer error for unsupported types, e.g. TypeError: only allowed types are: ...
  3. Show all missing keys: KeyError: "['foo', 'bar', 'baz']"
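
The three improvements listed above could be sketched in plain Python as follows. This is not the pandas implementation and not what #22486 does. The function check_set_index_keys is hypothetical, and for brevity it accepts only a list of hashable column labels, whereas the real set_index also accepts array-like keys.

```python
from collections.abc import Hashable


def check_set_index_keys(columns, keys):
    """Validate keys against columns and return the columns to drop.

    Hypothetical helper illustrating the proposal; not part of pandas.
    """
    columns = list(columns)

    # Point 2: reject unsupported containers and key types with a TypeError.
    if not isinstance(keys, list):
        raise TypeError(
            f"only allowed types are: list of column labels; got {type(keys).__name__}"
        )
    for key in keys:
        if not isinstance(key, Hashable):
            raise TypeError(f"column label must be hashable; got {type(key).__name__}")

    # Point 3: collect every missing key before raising, not just the first.
    missing = [key for key in keys if key not in columns]
    if missing:
        raise KeyError(str(missing))

    # Point 1: each column is dropped at most once, even if it is listed twice.
    return list(dict.fromkeys(keys))


to_drop = check_set_index_keys(list("ABCDE"), ["A", "A"])
print(to_drop)
```

Passing str(missing) to KeyError makes the message render as KeyError: "['foo', 'bar', 'baz']", the format suggested in the list.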
@h-vetinari h-vetinari changed the title API: improve warnings for df.set_index API: better error-handling for df.set_index Aug 23, 2018
@gfyoung gfyoung added Indexing Related to indexing on series/frames, not to indexes themselves Error Reporting Incorrect or improved errors from pandas labels Aug 25, 2018
@jreback jreback added this to the 0.24.0 milestone Sep 23, 2018
3 participants