Python 3 hashing #1772
@@ -146,6 +146,9 @@ def __new__(cls, coord, dims):
                kwargs)
        return metadata

    def __hash__(self):
        return hash(tuple(self))
I'm really surprised a namedtuple isn't hashable by default...
They are: 😕
$ python2
Python 2.7.10 ...
>>> import collections
>>> hash(collections.namedtuple('Foo', 'foo bar')(1, 2))
3713081631934410656
$ python3
Python 3.4.3 ...
>>> import collections
>>> hash(collections.namedtuple('Foo', 'foo bar')(1, 2))
3713081631934410656
They are, but when you override __eq__, you must supply __hash__ as well. Unfortunately, you can't do something like __hash__ = namedtuple.__hash__ because namedtuple is actually a function and the real class is hidden. This implementation doesn't exactly match __eq__, but it should match Python 2 behaviour.
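As a rough illustration of the point above, here is a minimal sketch with a hypothetical namedtuple subclass (`Point` and its fields are made up for this example and are not classes from the PR). On Python 3, defining __eq__ alone would leave __hash__ set to None, so the hash is supplied explicitly in the same way as the diff:

```python
import collections


class Point(collections.namedtuple('Point', 'x y')):
    """Hypothetical namedtuple subclass; not a class from this PR."""

    def __eq__(self, other):
        # Custom equality. On Python 3, defining this without __hash__
        # would set Point.__hash__ to None.
        return isinstance(other, tuple) and tuple(self) == tuple(other)

    def __ne__(self, other):
        # Needed on Python 2, where __ne__ is not derived from __eq__.
        return not self == other

    def __hash__(self):
        # Supplied explicitly, mirroring the approach in the diff.
        return hash(tuple(self))


p = Point(1, 2)
lookup = {p: 'found'}
print(lookup[Point(1, 2)])
```

With __hash__ restored, equal instances can be used interchangeably as dictionary keys.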
> when you override __eq__, you must supply __hash__ as well
Seems to be a runtime check when you try to hash an object...
Python 3.4.3 ...
>>> import collections
>>> Foo = collections.namedtuple('Foo', 'foo bar')
>>> hash(Foo(1, 2))
3713081631934410656
>>> class Bar(Foo): __eq__ = lambda self, other: True
...
>>> hash(Bar(1, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'Bar'
> Seems to be a runtime check when you try to hash an object...
Or I could have just read @QuLogic's original commit message. 😒 Doh!
> With new-style objects, if __eq__ is overridden, then __hash__ is automatically undefined.
>>> print(Foo.__hash__)
<slot wrapper '__hash__' of 'tuple' objects>
>>> print(Bar.__hash__)
None
So how about: super(_CoordMetaData, self).__hash__()?
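A small sketch of what that suggestion could look like, assuming a hypothetical stand-in class (`Meta` and its fields are illustrative only; the real _CoordMetaData is not reproduced here). The inherited tuple hash is reached through super rather than by building a new tuple:

```python
import collections


class Meta(collections.namedtuple('Meta', 'name dims')):
    """Hypothetical stand-in for a namedtuple subclass like _CoordMetaData."""

    def __eq__(self, other):
        return isinstance(other, tuple) and tuple(self) == tuple(other)

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Delegate to the hash inherited from tuple.
        return super(Meta, self).__hash__()


m = Meta('time', (0,))
# Both spellings should give the same value within one interpreter run.
print(hash(m) == hash(tuple(m)))
```

Both forms end up in tuple's own hash, so the choice is mostly about readability.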
Also raised in #962.
@@ -210,6 +210,9 @@ def __add__(self, mod):
            bound = tuple([val + mod for val in bound])
        return Cell(point, bound)

    def __hash__(self):
        return hash(tuple(self))
Probably more efficient as super(Cell, self).__hash__()?
If so, the same would apply for the other namedtuple subclasses in this PR.
I'm not sure it's more efficient, but I think it might be clearer that way. Just running %timeit hash(b) with the class in your example above shows super to be maybe 50 ns slower, but the slowest time is anywhere from 8 to 20-25 times slower for either method. So there's some sort of caching, but for that first hit, it seems to be a toss-up.
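For anyone wanting to repeat the comparison outside IPython, a rough sketch using the standard timeit module might look like this. The two subclasses and the `best_time` helper are hypothetical and only approximate the `Bar` example from earlier in the thread; absolute numbers will vary by machine and Python version.

```python
import collections
import timeit

Foo = collections.namedtuple('Foo', 'foo bar')


class ViaTuple(Foo):
    """Hypothetical subclass that hashes a fresh tuple copy."""

    def __eq__(self, other):
        return True

    def __hash__(self):
        return hash(tuple(self))


class ViaSuper(Foo):
    """Hypothetical subclass that delegates to the inherited tuple hash."""

    def __eq__(self, other):
        return True

    def __hash__(self):
        return super(ViaSuper, self).__hash__()


def best_time(obj, number=100000):
    # Best of three runs, reported in seconds per hash() call.
    timer = timeit.Timer(lambda: hash(obj))
    return min(timer.repeat(repeat=3, number=number)) / number


a, b = ViaTuple(1, 2), ViaSuper(1, 2)
print('hash(tuple(self)):  %.1f ns' % (best_time(a) * 1e9))
print('super().__hash__(): %.1f ns' % (best_time(b) * 1e9))
```

Taking the minimum over several repeats is a common way to reduce the influence of a slow first run.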
Thanks for checking 👍
With new-style objects, if __eq__ is overridden, then __hash__ is automatically undefined.
This is simply based on the id, which is nearly equivalent to Python 2's method, but perhaps it could be made more general.
👍 - nice solution
Re-investigated: this is still an issue.
Updated code comments to reflect it in #2553.
Several classes must be hashable because they are used as dictionary keys. This is generally implemented as a hash of the same information used for __eq__, or if that's not defined, the id.
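The two cases described above can be sketched as follows. Both classes are hypothetical and exist only to illustrate the pattern; they are not taken from the PR. One hashes the same information that __eq__ compares, the other defines no __eq__ and falls back to the object's id:

```python
class Keyed(object):
    """Hypothetical class whose hash uses the same data as __eq__."""

    def __init__(self, name, units):
        self.name = name
        self.units = units

    def _key(self):
        # Single source of truth for both equality and hashing.
        return (self.name, self.units)

    def __eq__(self, other):
        if not isinstance(other, Keyed):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash(self._key())


class Identity(object):
    """Hypothetical class with no __eq__; the hash is based on id."""

    def __hash__(self):
        return hash(id(self))


table = {Keyed('height', 'm'): 1}
print(Keyed('height', 'm') in table)

first = Identity()
seen = {first: 'first'}
print(first in seen, Identity() in seen)
```

Keeping __eq__ and __hash__ tied to one key means equal objects always land on the same dictionary entry, while the id-based class only ever matches the exact same instance.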