Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

precompute finite value mask for kdtree query #2201

Merged
merged 9 commits into from
Jul 14, 2023

Conversation

chrishavlin
Copy link
Contributor

Rationale

scipy>=1.11 now raises an error when querying a kdtree object with coordinate arrays that contain nan or inf values. This PR adds a mask for finite values before that query occurs.

Implications

Closes #2199

@@ -268,16 +268,24 @@ def regrid(array, source_x_coords, source_y_coords, source_proj, target_proj,
target_x_points.flatten(),
target_y_points.flatten())

# Find mask of valid points before querying kdtree: scipy >= 1.11 errors
# when querying nan points, might as well use for pykdtree too.
indices = np.zeros(target_xyz.shape[0], dtype=int)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was I right in having the indices array default to 0? I based this on the indices[mask]=0 line after this, wasn't entirely certain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine to use 0 here as you noted because I think it is just settting it to some valid index. But, I think we still need to add these non-finite locations you found here into the mask below as well? Something like mask = ~finite_xyz | (indices >= len(xyz))

@chrishavlin
Copy link
Contributor Author

chrishavlin commented Jul 3, 2023

@greglucas I think the actual code changes are ready for (final?) review.

The main outstanding question I have is how to handle tests. To actually test these changes with scipy, I'd need to refactor how the scipy/pykdtree choice happens. I'm happy to do that -- my initial inclination is to add a function to set which package to use (defaulting to pykdtree) so that there is a way to toggle between the two in the test environment. Is it OK to add that to this PR or would it be better as a followup PR?

Edit: it's also hard for me to tell if these test failures are from this PR, so any tips on checking that would be useful...

@rcomer
Copy link
Member

rcomer commented Jul 5, 2023

I may have misunderstood but, for testing, could you monkeypatch img_transform._is_pykdtree to False?

@chrishavlin
Copy link
Contributor Author

I may have misunderstood but, for testing, could you monkeypatch img_transform._is_pykdtree to False?

not exactly -- because the scipy kdtree import only happens on initial module import. So if you monkeypatch _is_pykdtree to False, you get an error cause the scipy kdtree hasn't been imported yet.

@chrishavlin
Copy link
Contributor Author

One option would be to store the kdtree function handle in a variable to allow monkey patching during tests.

In pseudo-code, I was thinking along the lines of:

def _get_kdtree_function(use_pykdtree = True):
    if use_pykdtree and pykdtree is available:
         return True, pykdtree.kdtree.KDTree
    elif use_pykdtree is False and scipy is available:
        return False, scipy.spatial.cKDTree
    else:
          error 

_is_pydktree, _kdtree_func = _get_kdtree_function()

and then using that _kdtree_func handle down in regrid. That would allow you to monkeypatch both the boolean and the function handle in the test environment.

@chrishavlin
Copy link
Contributor Author

chrishavlin commented Jul 7, 2023

OK, found a simple-ish solution. Realized that if the scipy and pykdtree kdtree object handles were imported into a module attribute similar to _is_pykdtree, the kdtree object could be monkeypatched in a simple way during testing (in addition to the the _is_pykdtree flag). I updated the new test to switch between using scipy and pykdtree.

So I don't see anything else to add here at the moment, let me know if anything more is needed (or if there's a better approach to testing both scipy and pykdtree here).

And from what I can tell, the failing tests look to be the same tests that are failing in other PRs, so I don't think this PR is causing any new failures.

@chrishavlin
Copy link
Contributor Author

oops, forgot to setup pre-commit locally... will fix that in a moment.

@chrishavlin
Copy link
Contributor Author

pre-commit.ci autofix

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go from a code perspective, I've just made a few minor nits that you can take or leave.

Comment on lines 132 to 136
_ = img_trans.regrid(data, lons, lats, data_trans, target_prj,
target_x, target_y)
else:
_ = img_trans.regrid(data, lons, lats, data_trans, target_prj,
target_x, target_y)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you dedent this and only keep one of them and only do the monkeypatching in the if-block, but the regridding outside?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@@ -13,11 +13,11 @@


try:
import pykdtree.kdtree
from pykdtree.kdtree import KDTree as _kdtree_handle
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from pykdtree.kdtree import KDTree as _kdtree_handle
from pykdtree.kdtree import KDTree as _kdtreeClass

Thoughts on making this explicit that it is a Class you're using below? I wasn't entirely clear what handle meant at first glance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also done!

lib/cartopy/img_transform.py Show resolved Hide resolved
@chrishavlin
Copy link
Contributor Author

chrishavlin commented Jul 12, 2023

Thanks for the review! Will address the changes in the next few days. (edit: done!)

@chrishavlin
Copy link
Contributor Author

Totally OK with keeping this open til #2209 is merged then merging here to make extra sure that I'm not actually breaking any tests here.

@greglucas greglucas merged commit 41880a8 into SciTools:main Jul 14, 2023
@greglucas
Copy link
Contributor

Thank you @chrishavlin! This was a very nice fix.

@greglucas greglucas added this to the 0.22 milestone Jul 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: incompatibility with scipy 1.11.0
3 participants