Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coord points and bounds inheritance. #2553

Merged
merged 4 commits into from
May 17, 2017

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented May 15, 2017

This refactors the inheritance organisation of the points and bounds property code amongst Coord / AuxCoord / DimCoord.

At present, DimCoord and AuxCoord both provide their own independent implementations of points+bounds getters+setters, but there is quite a lot of duplication in that code.
This refactor removes the code duplication, by ...

  • most of the relevant code is relocated by "pushing it up" from AuxCoord into Coord.
  • AuxCoord then becomes an almost trivial descendant of Coord,
  • DimCoord is now a 'specialised' form that inherits + extends those implementations, adding realisation and DimCoord-specific validity testing.
  • the 'from_coord' routine has also been generalised in a similar way.

I think this structure makes more sense than the existing.
The knock-on effects are extremely minor : just that some error cases produce a slightly different message.
The recently-extended testing in iris.tests.unit.coords encourages me that nothing serious has been broken (!)

@DPeterK DPeterK self-requested a review May 15, 2017 16:06
@DPeterK DPeterK self-assigned this May 15, 2017
@QuLogic QuLogic added this to the dask milestone May 15, 2017
Copy link
Member

@DPeterK DPeterK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo this looks good but there are a few bits that I'm concerned about.

self._bounds_dm = DataManager(bounds)
else:
self._bounds_dm.data = bounds

# This is necessary for merging, but probably shouldn't be used otherwise.
# See #962 and #1772.
def __hash__(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's out of the current extent of this PR's changes, but could you (/did you) look into whether we still really need this? If we don't we could practically get rid of AuxCoord entirely.

Copy link
Member Author

@pp-mo pp-mo May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this.
The same code is present in Cube.__hash__ (and a couple of others).

In Python3, the hash operation is actually disabled for user classes that define __eq__, which means you must provide __hash__ to allow, for example, collecting the objects into sets, or using them as dictionary keys.
See: #1772

So it seems we do need this, as otherwise we can't collect Coords into sets etc._

The problem:
This violates a general principle that
"objects which compare equal have the same hash value".
See: https://docs.python.org/2.7/reference/datamodel.html#object.__hash__

So, we should really fix these,
but it probably means changing other areas such as merge, and possibly concatenate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. I see that the exact same code appears in DimCoord, so it's another thing that can be moved into Coord. I've now done that.


if self._points_dm is not None:
# Fetch the new core array, as the parent can change it.
points = self._points_dm._real_array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not at all a fan of accessing this private DataManager method. Surely there is a better way of doing this?


if self._bounds_dm is not None:
# Fetch the new core array, as the parent can change it.
bounds = self._bounds_dm._real_array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, accessing this private method is a bad idea.

self._bounds_dm = None
else:
def _bounds_setter(self, bounds):
if bounds is not None:
# Ensure we have a realised array of new bounds values.
copy = is_lazy_data(bounds)
bounds = as_concrete_data(bounds)
bounds = np.array(bounds, copy=copy, ndmin=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You no longer need to set ndmin here (it'll be done by _sanitise_points in the parent).

@lbdreyer lbdreyer added the dask label May 16, 2017
@pp-mo pp-mo mentioned this pull request May 17, 2017
Copy link
Member

@DPeterK DPeterK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pp-mo just a few more queries.

@@ -684,6 +766,13 @@ def _as_defn(self):
self.units, self.attributes, self.coord_system)
return defn

# Must supply __hash__ as Python 3 does not enable it if __eq__ is defined.
# NOTE: Violates "objects which compare equal must have the same hash".
# Currently needed, but not really correct and should be fixed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add "for merging" to this comment because otherwise it might not be clear why it is needed (that is, what's using it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in fact I tested what happens when you error its usage, and the resulting errors are not about merge after all.
See : #1772 (comment)

? I could say e.g. "needed so we can use Coords as set members and dictionary keys" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New commit coming that I hope makes this clearer...

# Make the array read-only.
self._points_dm.data.flags.writeable = False
# Make the array read-only.
points.flags.writeable = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will update the underlying array (i.e. self._points_dm.core_data()) as well, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we made sure we are dealing with our own view, as stated just above (L1661).

lib/iris/cube.py Outdated
# This is necessary for merging, but probably shouldn't be used otherwise.
# Must supply __hash__ as Python 3 does not enable it if __eq__ is defined.
# NOTE: Violates "objects which compare equal must have the same hash".
# Currently needed, but not really correct and should be fixed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, you could specify where it is needed.

# "If a class that overrides __eq__() needs to retain the
# implementation of __hash__() from a parent class, the interpreter
# must be told this explicitly".
__hash__ = Coord.__hash__
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use "super" here, if you think it would be preferable (it avoids saying who your parent is).
This way is more efficient though 😉

@DPeterK DPeterK merged commit def2753 into SciTools:dask May 17, 2017
@pp-mo pp-mo deleted the coords_inherit_refactor branch May 18, 2017 08:07
bjlittle pushed a commit to bjlittle/iris that referenced this pull request May 31, 2017
* Refactor Coord inheritance.

* Review changes.

* Review change -- reword comments for clarity.

* Fix Python3-specific problem.
@QuLogic QuLogic modified the milestones: dask, v2.0 Aug 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants