Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index.name is not validatied to be Hashable #29069

Closed
TomAugspurger opened this issue Oct 18, 2019 · 5 comments · Fixed by #30335
Closed

Index.name is not validatied to be Hashable #29069

TomAugspurger opened this issue Oct 18, 2019 · 5 comments · Fixed by #30335
Labels
Bug Index Related to the Index class or subclasses
Milestone

Comments

@TomAugspurger
Copy link
Contributor

We require that Series.name be hashable. We don't do the same for Index.name. Should we?

In [4]: pd.Index([], name=[])
Out[4]: Index([], dtype='object', name=[])
In [5]: pd.Series([], name=[])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-2c73ddde103e> in <module>
----> 1 pd.Series([], name=[])

~/sandbox/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    326         generic.NDFrame.__init__(self, data, fastpath=True)
    327
--> 328         self.name = name
    329         self._set_axis(0, index, fastpath=True)
    330

~/sandbox/pandas/pandas/core/generic.py in __setattr__(self, name, value)
   5257             object.__setattr__(self, name, value)
   5258         elif name in self._metadata:
-> 5259             object.__setattr__(self, name, value)
   5260         else:
   5261             try:

~/sandbox/pandas/pandas/core/series.py in name(self, value)
    468     def name(self, value):
    469         if value is not None and not is_hashable(value):
--> 470             raise TypeError("Series.name must be a hashable type")
    471         object.__setattr__(self, "_name", value)
    472

TypeError: Series.name must be a hashable type
@TomAugspurger TomAugspurger added API Design Index Related to the Index class or subclasses labels Oct 18, 2019
@WillAyd
Copy link
Member

WillAyd commented Oct 18, 2019

Makes sense - I would think generically any labels should have this validation

@R1j1t
Copy link
Contributor

R1j1t commented Oct 22, 2019

I was looking at this issue and would it be fine if i update this line to

Current version

        if name is None and hasattr(data, "name"):
            name = data.name

Proposed solution

        if name is None and hasattr(data, "name"):
            if not is_hashable(data.name):
                raise TypeError("Index.name must be a hashable type")
            name = data.name

I am happy to raise a PR after adding appropriate tests.

any labels should have this validation

@WillAyd could you help me understand if you have any other function/labels in mind?

@TomAugspurger
Copy link
Contributor Author

I'd recommend moving the Series.name implementation to pandas.core.base.IndexOpsMixin, which Series and Index inherit from.

@R1j1t
Copy link
Contributor

R1j1t commented Oct 23, 2019

Okay, so if i understood correctly i thought something like this should work

    @property
    def name(self) -> Optional[Hashable]:
        return self.attrs.get("name", None)

    @name.setter
    def name(self, value: Optional[Hashable]) -> None:
        if not is_hashable(value):
            raise TypeError("name must be a hashable type")
        self.attrs["name"] = value

moving the above lines from pandas.core.Series.py to pandas.core.base.IndexOpsMixin should solve the problem but it only worked for series. I will try to look into your suggestion. Also my above thought didnt work.

@R1j1t
Copy link
Contributor

R1j1t commented Oct 29, 2019

I tried working with your suggestion of moving Series.name implementation to pandas.core.base.IndexOpsMixin but that would require a decent amount of changes (which I am not confident about).

Also, I looked at the code of pandas/core/indexes/base.py and here they validate name to be Hashable in Index class. Based on this I suggest the following:

Current version

        if name is None and hasattr(data, "name"):
            name = data.name

Proposed solution

        if name is None and hasattr(data, "name"):
            if not is_hashable(data.name):
                raise TypeError("Index.name must be a hashable type")
            name = data.name
        elif name is not None:
            if not is_hashable(name):
                raise TypeError("Index.name must be a hashable type")

If you still feel moving Series.name implementation to pandas.core.base.IndexOpsMixin would be better, then it is not straight forward to me and hence I am not proposing any solution for that. The problem I have, Series does the validation via generic.NDFrame and NDFrame is not inherited by Index. So I will have to duplicate the checking directly to IndexOpsMixin, which I wanted to avoid. Let me know if I am wrong.

jbrockmendel added a commit to jbrockmendel/pandas that referenced this issue Dec 18, 2019
@jreback jreback added this to the 1.0 milestone Dec 20, 2019
keechongtan added a commit to keechongtan/pandas that referenced this issue Dec 29, 2019
…ndexing-1row-df

* upstream/master: (333 commits)
  CI: troubleshoot Web_and_Docs failing (pandas-dev#30534)
  WARN: Ignore NumbaPerformanceWarning in test suite (pandas-dev#30525)
  DEPR: camelCase in offsets, get_offset (pandas-dev#30340)
  PERF: implement scalar ops blockwise (pandas-dev#29853)
  DEPR: Remove Series.compress (pandas-dev#30514)
  ENH: Add numba engine for rolling apply (pandas-dev#30151)
  [ENH] Add to_markdown method (pandas-dev#30350)
  DEPR: Deprecate pandas.np module (pandas-dev#30386)
  ENH: Add ignore_index for df.drop_duplicates (pandas-dev#30405)
  BUG: The setting xrot=0 in DataFrame.hist() doesn't work with by and subplots pandas-dev#30288 (pandas-dev#30491)
  CI: Fix GBQ Tests (pandas-dev#30478)
  Bug groupby quantile listlike q and int columns (pandas-dev#30485)
  ENH: Add ignore_index for df.sort_values and series.sort_values (pandas-dev#30402)
  TYP: Typing hints in pandas/io/formats/{css,csvs}.py (pandas-dev#30398)
  BUG: raise on non-hashable Index name, closes pandas-dev#29069 (pandas-dev#30335)
  Replace "foo!r" to "repr(foo)" syntax pandas-dev#29886 (pandas-dev#30502)
  BUG: preserve EA dtype in transpose (pandas-dev#30091)
  BLD: add check to prevent tempita name error, clsoes pandas-dev#28836 (pandas-dev#30498)
  REF/TST: method-specific files for test_append (pandas-dev#30503)
  marked unused parameters (pandas-dev#30504)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Index Related to the Index class or subclasses
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants