-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data first #2392
Data first #2392
Conversation
lib/iris/fileformats/netcdf.py
Outdated
data = iris._lazy_data.as_lazy_data(proxy) | ||
cube = iris.cube.Cube(data) | ||
data = da.from_array(proxy, chunks=100) | ||
cube = iris.cube.Cube(data, fill_value=fill_value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this pass the data type in as well?
lib/iris/fileformats/rules.py
Outdated
cube = iris.cube.Cube(data, | ||
attributes=metadata.attributes, | ||
cell_methods=metadata.cell_methods, | ||
dim_coords_and_dims=metadata.dim_coords_and_dims, | ||
aux_coords_and_dims=metadata.aux_coords_and_dims) | ||
aux_coords_and_dims=metadata.aux_coords_and_dims, | ||
fill_value=field.bmdi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this also set the data type
lib/iris/fileformats/rules.py
Outdated
@@ -901,15 +902,15 @@ def _make_cube(field, converter): | |||
metadata = converter(field) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please re-implement this, don't try/accept
lib/iris/cube.py
Outdated
@@ -1631,9 +1636,10 @@ def lazy_data(self, array=None): | |||
A lazy array, representing the Cube data array. | |||
|
|||
""" | |||
result = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't allow this, if there is not _numpy_array and no _dask_array raise an exception
self.create_lbpack(lbpack), | ||
None, | ||
mask.shape, np.dtype('>f4'), | ||
-999, mask=mask) | ||
return np.ma.masked_array(data, np.isnan(data), fill_value=-999) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.ma
-> ma
@@ -34,7 +34,7 @@ def test_masked(self): | |||
masked_array = np.ma.masked_array([[1.0, 2.0], [3.0, 4.0]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.ma
-> ma
@@ -34,7 +34,7 @@ def test_masked(self): | |||
masked_array = np.ma.masked_array([[1.0, 2.0], [3.0, 4.0]], | |||
mask=[[0, 1], [0, 0]]) | |||
|
|||
result = array_masked_to_nans(masked_array) | |||
result = array_masked_to_nans(masked_array).data | |||
|
|||
self.assertIsInstance(result, np.ndarray) | |||
self.assertFalse(isinstance(result, np.ma.MaskedArray)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.ma
-> ma
@@ -48,7 +48,7 @@ def test_masked(self): | |||
def test_empty_mask(self): | |||
masked_array = np.ma.masked_array([1.0, 2.0], mask=[0, 0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.ma
-> ma
@@ -48,7 +48,7 @@ def test_masked(self): | |||
def test_empty_mask(self): | |||
masked_array = np.ma.masked_array([1.0, 2.0], mask=[0, 0]) | |||
|
|||
result = array_masked_to_nans(masked_array) | |||
result = array_masked_to_nans(masked_array).data | |||
|
|||
self.assertIsInstance(result, np.ndarray) | |||
self.assertFalse(isinstance(result, np.ma.MaskedArray)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.ma
-> ma
lib/iris/_merge.py
Outdated
import iris.util | ||
from iris._lazy_data import is_lazy_data, array_masked_to_nans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marqh Minor point but from iris._lazy ...
sorts before alphabetical, and should be the first iris
import.
lib/iris/cube.py
Outdated
import iris._merge | ||
import iris.exceptions | ||
import iris.util | ||
from iris._lazy_data import is_lazy_data, array_masked_to_nans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marqh the import order is wrong
|
||
def has_lazy_data(self): | ||
return is_lazy_data(self._my_data) | ||
return True if self._numpy_array is None else False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marqh I'd prefer
return self._numpy_array is None
Also, isn't it the case the both cube._numpy_array
and cube._dask_array
could be populated? If so, are we saying that if that's the case, the cube isn't lazy?
Is there an explicit statement to define whether it's valid for both of these attributes to be populated? Sutely, it's totally invalid for both not to be populated ... do we have checks and balances for that to maintain that integrity within the cube?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, isn't it the case the both cube._numpy_array and cube._dask_array could be populated?
yes, at present this is the case
If so, are we saying that if that's the case, the cube isn't lazy?
yes, it is not lazy
ifcube._numpy_array
exists it is always used in preference
Is there an explicit statement to define whether it's valid for both of these attributes to be populated?
No, there is no explicit statement. It is an open question.
We should answer it. The question here is should we answer it on this PR or raise an issue and answer it?
The utility of having both is thecube._numpy_array
may be set to None, reverting the cube back to it's initally loaded state, with a file proxy. I don't know how valuable this may be.
Sutely, it's totally invalid for both not to be populated ...
Yes, that is invalid and exceptions are thrown
do we have checks and balances for that to maintain that integrity within the cube?
there are some, are there enough?
ok, further review actions pushed. hopefully i've maintained test success; we shall see please squash and merge at will once travis is content |
@marqh Agreed. If the tests pass, then let's bank this PR. It's a feature branch and we need to move forwards 👍 |
Ping @marqh you still have a couple of tests referencing the now nuked |
doh |
@bjlittle |
Woop! 🎉 🍰 🍷 |
Core refactor for dask usage
@@ -47,6 +47,8 @@ def _test_coord(self, cube, point, bounds=None, **kwargs): | |||
if bounds is not None: | |||
self.assertArrayEqual(coords[0].bounds, [bounds]) | |||
|
|||
# hits a segfault, very odd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hits a segfault
I think these may be fixed by : #2395
Short version: since we changed our test for lazy data, a simple Mock tests as lazy instead of real.
Then np.ma.asarray()
can hit an endless loop (whereas np.asarray
was ok).
Core refactor for dask usage
# The proxy supplies nan filled arrays and caches data. | ||
data = self._data[...] | ||
if data.dtype.kind == 'i' and self.bmdi == -1e30: | ||
self.bmdi = -9999 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic number ?
Core refactor for dask usage
for discussion regarding how to implement this interface