Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3 hashing #1772

Merged
merged 3 commits into from
Sep 10, 2015
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions lib/iris/_concatenate.py
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,9 @@ def __new__(cls, coord, dims):
kwargs)
return metadata

def __hash__(self):
return super(_CoordMetaData, self).__hash__()

def __eq__(self, other):
result = NotImplemented
if isinstance(other, _CoordMetaData):
Expand Down
13 changes: 13 additions & 0 deletions lib/iris/coords.py
Original file line number Diff line number Diff line change
Expand Up @@ -210,6 +210,9 @@ def __add__(self, mod):
bound = tuple([val + mod for val in bound])
return Cell(point, bound)

def __hash__(self):
return super(Cell, self).__hash__()

def __eq__(self, other):
"""
Compares Cell equality depending on the type of the object to be
Expand Down Expand Up @@ -1391,6 +1394,11 @@ def __eq__(self, other):

# The __ne__ operator from Coord implements the not __eq__ method.

# This is necessary for merging, but probably shouldn't be used otherwise.
# See #962 and #1772.
def __hash__(self):
return hash(id(self))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that is the intended hash of a coordinate. The py docs state "Hashable objects which compare equal must have the same hash value.", but we have implemented an equality which contradicts this.

The problem of using anything other than id for the hash though is that our coordinates are mutable...

Essentially we are in a bit of a pickle, as coordinates are:

  • mutable
  • hashable
  • implement __eq__, which compares mutable attributes

In the medium term, we may need to rethink the use of coordinates as dictionary keys.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right it's probably inconsistent, but this is the simplest hash and it should be equivalent to what occurs on Python 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the "mutable-hashable" was originally a cheap/lazy hack to simplify the cube-merge code. During the merge process the cubes/coords don't change so they are effectively immutable. Unsurprisingly, usage of the "mutable-hashable" behaviour has proliferated (e.g. iris.iterate.izip).

Ideally we should get rid of mutable-hashable behaviour, but not in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a small comment about it.


def __getitem__(self, key):
coord = super(DimCoord, self).__getitem__(key)
coord.circular = self.circular and coord.shape == self.shape
Expand Down Expand Up @@ -1592,6 +1600,11 @@ def bounds(self, bounds):
"shape.")
self._bounds = bounds

# This is necessary for merging, but probably shouldn't be used otherwise.
# See #962 and #1772.
def __hash__(self):
return hash(id(self))


class CellMethod(iris.util._OrderedHashable):
"""
Expand Down
6 changes: 6 additions & 0 deletions lib/iris/cube.py
Original file line number Diff line number Diff line change
Expand Up @@ -2736,6 +2736,12 @@ def __ne__(self, other):
result = not result
return result

# Must supply __hash__, Python 3 does not enable it if __eq__ is defined
# This is necessary for merging, but probably shouldn't be used otherwise.
# See #962 and #1772.
def __hash__(self):
return hash(id(self))

def __add__(self, other):
return iris.analysis.maths.add(self, other, ignore=True)
__radd__ = __add__
Expand Down
3 changes: 3 additions & 0 deletions lib/iris/fileformats/_structured_array_identification.py
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,9 @@ def __new__(cls, stride, unique_ordered_values):
self.size = len(unique_ordered_values)
return self

def __hash__(self):
return super(ArrayStructure, self).__hash__()

def __eq__(self, other):
stride = getattr(other, 'stride', None)
arr = getattr(other, 'unique_ordered_values', None)
Expand Down
4 changes: 4 additions & 0 deletions lib/iris/fileformats/cf.py
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,10 @@ def __ne__(self, other):
# CF variable names are unique.
return self.cf_name != other.cf_name

def __hash__(self):
# CF variable names are unique.
return hash(self.cf_name)

def __getattr__(self, name):
# Accessing netCDF attributes is surprisingly slow. Since
# they're often read repeatedly, caching the values makes things
Expand Down
3 changes: 3 additions & 0 deletions lib/iris/fileformats/pp.py
Original file line number Diff line number Diff line change
Expand Up @@ -344,6 +344,9 @@ def lbuser6(self):
def is_valid(self):
return '?' not in str(self)

def __hash__(self):
return super(STASH, self).__hash__()

def __eq__(self, other):
if isinstance(other, six.string_types):
return super(STASH, self).__eq__(STASH.from_msi(other))
Expand Down
2 changes: 2 additions & 0 deletions lib/iris/unit.py
Original file line number Diff line number Diff line change
Expand Up @@ -1741,6 +1741,8 @@ def _identity(self):
# iris.util._OrderedHashable.
return (self.name, self.calendar)

__hash__ = iris.util._OrderedHashable.__hash__

def __eq__(self, other):
"""
Compare the two units for equality and return the boolean result.
Expand Down