Populate cube.fill_value thru cube.data setter. #2524

bjlittle · 2017-05-04T08:49:55Z

This PR is a follow-on action from #2492 and detailed in #2516 and allows the cube.fill_value to be set from the fill_value of concrete masked data being passed to cube.data setter.

All of the CML changes are due to either tests directly setting cube.data with a masked array, or tests that use testing stock cubes that set cube.data directly with a masked array ... thus populating the cube.fill_value and thus bleeding thru to the CML. This is the behaviour that we want to see.

If data passed to the cube.data setter is not masked data, then the cube.fill_value will be clear, as before.

This PR also neatly completes the circle with fill_value and masked arrays wrt the cube, in that the cube.data getter will set the fill_value of any masked array returned to the user with the cube.fill_value, and now this PR allows the fill_value of a masked array to set the cube.fill_value through the cube.data setter.

bjlittle · 2017-05-04T09:33:08Z

Ping @dkillick and @lbdreyer ...

lbdreyer · 2017-05-04T11:52:46Z

If you create a new masked array with a specific fill value, make it lazy, then make it a cube's data
cube.data = lazy_array_that_was_originally_masked_with_a_fill_value
the fill value will get lost, but I guess there's nothing we can do about that and probably not something we need to account for

lbdreyer · 2017-05-04T12:18:16Z

This looks good to me 👍 So unless there are any objections I'm going to merge this in!

lbdreyer · 2017-05-04T12:21:21Z

lib/iris/tests/unit/cube/test_Cube.py

+        cube = Cube(np.array(0))
+        self.assertIsNone(cube.fill_value)
+        cube.data = ma.masked_array(1, fill_value=fill_value)
+        self.assertEqual(cube.fill_value, fill_value)


This doesn't check the data type of the fill value is correct but presumably the testing of that will be in the cube.fill_value setter

My assumption is that unit testing elsewhere covers that, and it does 😉

bjlittle · 2017-05-04T12:33:11Z

@lbdreyer Awesome, thanks!

lbdreyer · 2017-05-04T16:42:49Z

I'm having second thoughts on this PR...

It has introduced a disparity between cube.fill_value behaviour depending on whether you instantiate a cube with a masked array or add it to a cube already there, i.e.:

>>> data = np.array([1, 2, 3])
>>> masked_data = ma.MaskedArray([1, 2, 3], mask=[True, False, False])
>>> 
>>> cube = Cube(masked_data)
>>> print cube.fill_value
None
>>> 
>>> cube = Cube(data)
>>> print cube.fill_value
None
>>> cube.data = masked_data
>>> print cube.fill_value
999999

I also noticed that it introduce some strange behaviour with one of our test files which has a dtype of unint8

>>> cube = iris.load_cube('iris-test-data/test_data/abfAVHRRBUVI01.1985apra.abf')
>>> print type(cube.data)
<class 'numpy.ma.core.MaskedArray'>
>>> print cube.fill_value
None
>>> cube.data = cube.data.copy(order='C')
>>> print cube.fill_value
63

This is because, regardless of what type of integer the masked array's dtype is, the fill_value of an integer masked array will always be 999999 (:warning:)
The cube's data in the above example is a masked array with a dtype of uint8 so the cube.fill_value setter is changing the default numpy fill value to be 63:

>>> np.array([999999], dtype=np.uint8)
array([63], dtype=uint8)

bjlittle · 2017-05-05T13:21:32Z

@lbdreyer Nice detective work! Let's discuss a way forward off-line and commit to a follow-on PR 👍

Populate cube.fill_value thru cube.data setter.

2c7d0b5

bjlittle added the dask label May 4, 2017

bjlittle added this to the dask milestone May 4, 2017

bjlittle added the Status: Work in Progress label May 4, 2017

bjlittle mentioned this pull request May 4, 2017

Outstanding - Integrate DataManager with cube #2516

Closed

5 tasks

lbdreyer self-assigned this May 4, 2017

lbdreyer approved these changes May 4, 2017

View reviewed changes

lbdreyer merged commit fe99864 into SciTools:dask May 4, 2017

QuLogic removed the Status: Work in Progress label May 4, 2017

bjlittle deleted the dask-assign-fill-value-thru-cube-data branch May 4, 2017 12:34

lbdreyer mentioned this pull request May 4, 2017

Get all Iris tests working again (5 man days' effort) #2398

Closed

bjlittle mentioned this pull request May 10, 2017

Follow-on fix for fill_value #2540

Closed

bjlittle mentioned this pull request May 23, 2017

Dask data manager propagate #2548

Merged

QuLogic modified the milestones: dask, v2.0 Aug 2, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Populate cube.fill_value thru cube.data setter. #2524

Populate cube.fill_value thru cube.data setter. #2524

bjlittle commented May 4, 2017 •

edited

Loading

bjlittle commented May 4, 2017

lbdreyer commented May 4, 2017

lbdreyer commented May 4, 2017 •

edited

Loading

lbdreyer May 4, 2017

bjlittle May 4, 2017 •

edited

Loading

bjlittle commented May 4, 2017

lbdreyer commented May 4, 2017 •

edited by DPeterK

Loading

bjlittle commented May 5, 2017

Populate cube.fill_value thru cube.data setter. #2524

Populate cube.fill_value thru cube.data setter. #2524

Conversation

bjlittle commented May 4, 2017 • edited Loading

bjlittle commented May 4, 2017

lbdreyer commented May 4, 2017

lbdreyer commented May 4, 2017 • edited Loading

lbdreyer May 4, 2017

Choose a reason for hiding this comment

bjlittle May 4, 2017 • edited Loading

Choose a reason for hiding this comment

bjlittle commented May 4, 2017

lbdreyer commented May 4, 2017 • edited by DPeterK Loading

bjlittle commented May 5, 2017

bjlittle commented May 4, 2017 •

edited

Loading

lbdreyer commented May 4, 2017 •

edited

Loading

bjlittle May 4, 2017 •

edited

Loading

lbdreyer commented May 4, 2017 •

edited by DPeterK

Loading