Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outstanding - Integrate DataManager with cube #2516

Closed
5 tasks done
bjlittle opened this issue Apr 27, 2017 · 8 comments
Closed
5 tasks done

Outstanding - Integrate DataManager with cube #2516

bjlittle opened this issue Apr 27, 2017 · 8 comments
Assignees
Milestone

Comments

@bjlittle
Copy link
Member

bjlittle commented Apr 27, 2017

Ref - #2492

The following outstanding issues still require to be investigated and/or addressed:

@bjlittle bjlittle added the dask label Apr 27, 2017
@bjlittle bjlittle added this to the dask milestone Apr 27, 2017
@bjlittle bjlittle self-assigned this Apr 28, 2017
@pp-mo
Copy link
Member

pp-mo commented Apr 28, 2017

need to adopt the concept of a cube with an empty fill_value of None matching a candidate cube with an existing fill_value - although two candidate cubes with difference non-None fill_values should never match (obviously)

I disagree with this.
I think, if you are including data with "no known suitable fill value", the result has ... "no known suitable fill value".
For instance,

>>> np.stack(
...     [ma.array([0, 1, 2], fill_value=-1),
...      ma.array([-1, -2, -3])])
masked_array(data =
 [[ 0  1  2]
 [-1 -2 -3]],
             mask =
 False,
       fill_value = 999999)

>>> 

That's equivalent to a merge. In this case, "-1" is not a suitable fill value for the result, as it has a valid -1 point in it.

@djkirkham
Copy link
Contributor

djkirkham commented May 2, 2017

Cube merging now (once #2492 is merged) allows cubes to merge if one of them has a fill value and the other doesn't. But this causes problems given a list where one of the cubes has no fill value and the others have differing fill values: the first cube matched with the None-fill-value cube will be merged with it, and subsequent cubes won't, making merging dependent on the order of the cubes in the list.

I'm not sure what the correct behaviour should be. If we want to keep fill value promotion, then it seems like we need logic along the lines of

For any cubes A, B and C in a list, if A can merge with each of B and C, then merging should fail unless B can merge with C.

In terms of fill values, that amounts to

For any cubes A, B and C in a list, if A has a fill value of None, then merging should fail unless B and C have the same fill value or one of them has a flll value of None.

The problem is that merging assumes the match method of _CubeSignature is transitive, which is no longer the case. With this assumption, the first cube in the list can be used as a reference for other cubes to match against without requiring comparisons with the remaining cubes.

Actually, the more I think about it the more it seems the best approach is to not allow fill_value promotion, at least not at this point. I think allowing it requires a significant amount of work which we don't have time for. But that's just my opinion.

@djkirkham
Copy link
Contributor

Cubes with lazy integer masked data will keep their integer dtype when operations are performed on them which should result in a change of dtype (e.g. log). For example:

from numpy import ma
from iris.cube import Cube
from iris.analysis.maths import log
from iris._lazy_data import as_lazy_data
a = ma.masked_array([10,20], [0,1])
la = as_lazy_data(a)
cube = Cube(la, dtype=int)
cube = log(cube)
print cube.dtype
print cube.data

yields:

int64
[2 --]

@bjlittle
Copy link
Member Author

bjlittle commented May 4, 2017

@djkirkham Can you create a separate issue for this or are you already addressing this somewhere?

@djkirkham
Copy link
Contributor

djkirkham commented May 4, 2017

@bjlittle I don't have permissions to create issues within any of the dask projects. I'm not currently working on this, but what I've added in #2526 could be used as part of the solution I think.

@bjlittle
Copy link
Member Author

bjlittle commented May 4, 2017

@djkirkham Let me check the perms ... or at least understand why.

@bjlittle
Copy link
Member Author

bjlittle commented May 4, 2017

@djkirkham But you can at least create an issue on iris, right?

And set the label and milestone to dask? ... if not, just quote the issue number as a comment here and I'll add it to the board.

I can't see how to set permissions for a user specifically for a project, which is a shame 😖

@djkirkham
Copy link
Contributor

djkirkham commented May 4, 2017

@bjlittle I can't set the milestone, but here's the issue: #2528

@bjlittle bjlittle closed this as completed May 4, 2017
@QuLogic QuLogic modified the milestones: dask, v2.0 Aug 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants