ENH: Remove deepcopies when slicing cubes and copying coords #37

cpelley · 2017-04-24T16:19:16Z

Replacement for SciTools#2261, which points to the dask branch.

When indexing a cube whose data is realised (not lazy), a view of the original array is returned where possible (subject to numpy's slicing rules).
When indexing a cube whose data is not realised (lazy), realising the data on one cube will still not realise the data on the other.
Coord copy is optimised when the points are being replaced: the points and bounds are shallow copied before being replaced, which avoids unnecessary copies.
Existing behaviour is that slicing coordinates returns views of the original points and bounds (where possible). This was likely chosen on the basis that DimCoords, at least, are not writeable. The same is not true of AuxCoords, however, which makes a case for this being a bug (i.e. one can modify AuxCoord points and bounds).
DimCoord slicing will now return views of its data, like AuxCoords. DimCoords will continue to realise their data, unlike AuxCoords, because of the validation needed to check that the points are monotonically increasing.
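
For anyone unsure what "where possible" means in practice, here is a minimal sketch of the numpy rules referred to above. It uses plain numpy arrays only (no iris objects), and the variable names are just for illustration: basic slicing gives a view, while indexing with a list of integers gives a copy.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

# Basic slicing (slices and integers only) returns a view of the original.
basic = data[1:, ::2]
# Indexing with a list of integers ("fancy" indexing) returns a copy.
fancy = data[[0, 2], :]

shares_basic = np.shares_memory(data, basic)
shares_fancy = np.shares_memory(data, fancy)

basic[0, 0] = -1   # writes through to data[1, 0]
fancy[0, 0] = 99   # leaves data[0, 0] untouched
```

So a cube indexed with a list of indices would still end up with its own data, even with the view behaviour in place.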

cpelley · 2017-04-24T16:20:28Z

@marqh
I have pulled in all the functional changes from SciTools#2261 (such that they work with dask). This includes the new unittests. I have not bothered converting the existing iris unittests at this stage.

cpelley · 2017-04-24T16:23:14Z

This PR currently changes the default behaviour so that indexing always returns views (i.e. there is no switch to turn this behaviour on or off). That is, here is the proposed API for returning a view or a copy:

view_cube = cube[slice]                             # CUBE/COORD VIEW
copy_cube = cube[slice].copy()                      # CUBE/COORD COPY
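
To make the consequence concrete, here is a rough sketch using a hypothetical `SimpleCube` class that just wraps a numpy array. It is not the iris Cube and it ignores coords entirely; it only shows how writes behave under the proposed API.

```python
import numpy as np


class SimpleCube:
    """Hypothetical stand-in for a cube; not the iris Cube class."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, keys):
        # No deepcopy: the result shares memory with the original where
        # numpy's slicing rules allow it.
        return SimpleCube(self.data[keys])

    def copy(self):
        # An explicit copy is the way to get independent data.
        return SimpleCube(self.data.copy())


cube = SimpleCube(np.zeros((2, 3)))
view_cube = cube[0:1]
copy_cube = cube[0:1].copy()

view_cube.data[...] = 1.0   # also changes the first row of cube.data
copy_cube.data[...] = 5.0   # cube.data is unaffected
```

Code that currently modifies a sliced cube and relies on the original staying unchanged would need the explicit `.copy()`.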

cpelley · 2017-04-24T16:28:16Z

If we want the ability to switch behaviours and/or do not want to change the default behaviour, then here are some options:

Setting an attribute: (akin to numpy.ndarray.flags)

copy_cube = cube[slice]                             # CUBE/COORD COPY
cube.flags.share_data = True
view_cube = cube[slice]                             # CUBE/COORD VIEW
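
A rough sketch of how the flag option could behave, again with a hypothetical wrapper class rather than the iris Cube. The `flags.share_data` name is taken from the snippet above; carrying the flag over to the sliced result is my assumption, not something decided here.

```python
from types import SimpleNamespace

import numpy as np


class FlaggedCube:
    """Hypothetical stand-in for a cube with a share_data flag."""

    def __init__(self, data, share_data=False):
        self.data = data
        self.flags = SimpleNamespace(share_data=share_data)

    def __getitem__(self, keys):
        sliced = self.data[keys]
        if not self.flags.share_data:
            # Default: keep the existing copy-on-slice behaviour.
            sliced = sliced.copy()
        # Assumption: the sliced cube inherits the flag from its parent.
        return FlaggedCube(sliced, share_data=self.flags.share_data)


cube = FlaggedCube(np.arange(6.0))
copy_cube = cube[1:4]            # copy
cube.flags.share_data = True
view_cube = cube[1:4]            # view
```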

Returning a cube that is shareable by some function/method: (perhaps returning a sub-class of Cube)

copy_cube = cube[slice]                             # CUBE/COORD COPY
new_cube = cube.as_shareable()      or     new_cube = util.as_shareable(cube)
view_cube = new_cube[slice]                         # CUBE/COORD VIEW
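
A rough sketch of the sub-class variant of this option. `PlainCube` and `ShareableCube` are hypothetical names and neither is an iris class; only `as_shareable` comes from the snippet above.

```python
import numpy as np


class PlainCube:
    """Hypothetical stand-in for a cube; slicing copies the data."""

    def __init__(self, data):
        self.data = data

    def _index(self, keys):
        return self.data[keys].copy()

    def __getitem__(self, keys):
        # type(self) keeps the result in the same class as the parent.
        return type(self)(self._index(keys))

    def as_shareable(self):
        # Wraps the same array; no data is copied here.
        return ShareableCube(self.data)


class ShareableCube(PlainCube):
    """Hypothetical sub-class whose slices are views."""

    def _index(self, keys):
        return self.data[keys]


cube = PlainCube(np.arange(6.0))
copy_cube = cube[1:4]             # copy
new_cube = cube.as_shareable()
view_cube = new_cube[1:4]         # view
```

Note that the behaviour is tied to the class here, so every further slice of `view_cube` stays a `ShareableCube`.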

cpelley · 2017-04-24T16:36:37Z

My preference would be for the cleaner approach of changing the behaviour with no switch, if we can agree to do this (with obvious changes seen by the user where they assume a copy on slicing).
However, if we go with a switchable approach, the flag approach looks like the most versatile.
Returning a sub-class assumes a hierarchy, and one can easily imagine having to break this assumed hierarchy when considering more than one such state in future.

cpelley · 2017-04-25T10:53:48Z

ping @pp-mo

ENH: Remove deepcopies when slicing cubes and copying coords

97aad87

cpelley force-pushed the DASK_AVOID_DATA_COPIES branch from 75e3aec to 97aad87 Compare May 12, 2017 10:58

cpelley pushed a commit that referenced this pull request Jun 16, 2017

Don't make lazy wrappers for cube shape and dtype. (#37)

051c81d
