Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix geostationary bbox having inf values #493

Merged
merged 7 commits into from
Feb 7, 2023

Conversation

mraspaud
Copy link
Member

@mraspaud mraspaud commented Jan 26, 2023

As described in #492 , this PR fixes problems in geostationary bbox computation

@mraspaud mraspaud added the bug label Jan 26, 2023
@mraspaud mraspaud self-assigned this Jan 26, 2023
@mraspaud mraspaud requested a review from djhoese January 26, 2023 13:11
@codecov
Copy link

codecov bot commented Jan 26, 2023

Codecov Report

Merging #493 (b20f66c) into main (8393134) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #493   +/-   ##
=======================================
  Coverage   94.32%   94.33%           
=======================================
  Files          74       74           
  Lines       12889    12947   +58     
=======================================
+ Hits        12158    12214   +56     
- Misses        731      733    +2     
Flag Coverage Δ
unittests 94.33% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/geometry.py 87.46% <100.00%> (-0.05%) ⬇️
pyresample/gradient/__init__.py 95.14% <100.00%> (ø)
pyresample/test/test_geometry.py 99.55% <100.00%> (+0.01%) ⬆️
pyresample/test/test_gradient.py 98.98% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@coveralls
Copy link

coveralls commented Jan 26, 2023

Coverage Status

Coverage: 93.862% (+0.01%) from 93.85% when pulling b20f66c on mraspaud:fix-geos-bb into 8393134 on pytroll:main.

@mraspaud
Copy link
Member Author

@ghiggi I would just like to confirm with you that this is the correct fix. I looks like that for pyresample <=1.25.0, having np.inf values in spherical polygon was ok for doing intersections, while now only np.nan is working. Is it supposed to be like this, and is this documented somewhere?

Copy link
Contributor

@ghiggi ghiggi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spherical polygon should not receive np.nan or np.inf values.
This was occurring because of past bugs related to inplace modifications etc.
Here I would rather ask myself why Proj returns inf values ...

@@ -2818,6 +2818,8 @@ def get_geostationary_bounding_box_in_lonlats(geos_area, nb_points=50):
"""
x, y = get_geostationary_bounding_box_in_proj_coords(geos_area, nb_points)
lons, lats = Proj(geos_area.crs)(x, y, inverse=True)
lons[np.isinf(lons)] = np.nan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why here we have inf returned by Proj? I remember this function to return the coordinates of the GEO field of view (i.e the circle for FD) ... I would rather drop Inf/Nan values here !

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason some of the pixels must end up in space, in which case PROJ returns an infinite value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the whole point of the get_geostationary_bounding_box_in_proj_coords method so that it doesn't return space pixels? Is there a fix we can make to that function to clean this up? Any idea how far the (x, y) coordinates producing inf are from not producing an inf? I mean, how close are they to the on-earth disk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also say, I don't know if I've tested with the most recent code but I've definitely had cases where polygons with NaNs in the coordinates produced either NaN .area results or caused infinite loops when trying to do intersections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to have a look tomorrow.

Copy link
Member

@djhoese djhoese Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hhhmm the y values are definitely not right for a full disk geos area (if that's what this is). The second half are all the same y-position:

In [7]: x
Out[7]: 
array([-5.43062255e+06, -5.16482897e+06, -4.39346593e+06, -3.19203985e+06,
       -1.67815466e+06,  3.32529726e-10,  1.67815466e+06,  3.19203985e+06,
        4.39346593e+06,  5.16482897e+06,  5.43062255e+06,  5.16482897e+06,
        4.39346593e+06,  3.19203985e+06,  1.67815466e+06,  3.32529726e-10,
       -1.67815466e+06, -3.19203985e+06, -4.39346593e+06, -5.16482897e+06,
       -5.43062255e+06])

In [8]: y
Out[8]:
array([1393687.2705    , 1672427.79006384, 3181146.69554664,
       4378472.79811701, 5147203.47659387, 5412090.01610633,
       5147203.47659387, 4378472.79811701, 3181146.69554663,
       1672427.79006384, 1393687.2705    , 1393687.2705    ,
       1393687.2705    , 1393687.2705    , 1393687.2705    ,
       1393687.2705    , 1393687.2705    , 1393687.2705    ,
       1393687.2705    , 1393687.2705    , 1393687.2705    ])

Edit: Ah, this is the top half of the SEVIRI image isn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I have to give up. I'm lost in the trigonometry of get_geostationary_angle_extent (where is the 1 - coming from?). I've determined that the x coordinate needs to be moved over 178096 meters before it produces a valid lon/lat. However, if I go the other way and convert lon/lat to x/y, I can do p(-71.48, 14.83764943645431) and get -5252536.339579278 which is about 10 meters further west than I can go in the x/y -> lon/lat direction. Either way we're like 178km from the edge of the projection for some reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I can't have a look ... but I am overwhelmed by other tasks :-(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hhhmm the y values are definitely not right for a full disk geos area (if that's what this is). The second half are all the same y-position:

Edit: Ah, this is the top half of the SEVIRI image isn't it?

yes, so that's normal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I found the problem. Working on a fix.

@gerritholl
Copy link
Collaborator

gerritholl commented Jan 26, 2023

Have you tried if this by any chance also fixes #393?

@mraspaud
Copy link
Member Author

mraspaud commented Feb 3, 2023

Have you tried if this by any chance also fixes #393?

I did and it doesn't unfortunately.

@mraspaud
Copy link
Member Author

mraspaud commented Feb 3, 2023

Alright, I think this is ready for re-review @djhoese @ghiggi.

The problem was quite simple, but the solution is not as easy as I would have liked.
Some explanation:
To get the geostationary bounding box, that is a list of coordinates around the side of the earth disk, we draw a sampled ellipse that is the size of the earth. This was working without problem for the full earth disk, and still is, I have just refactors things a little and added a test.
The problem appeared when we just had an area describing a portion of the earth that did not cover the equator. In that case, the simple clipping we did to "cut" the disk was giving (unneeded) points outside the valid disk.
To prevent this, the easiest solution I could think of was to actually take the full earth bounding box and intersect it with the rectangle defined with the area extent of the area at hand. The obvious choice to do this, since we are in projection coordinates (no poles, no date shift line) is shapely.
So I used shapely here, because I didn't want to reinvent the wheel. I hope you are ok with that.
Question is now if we make shapely a hard dependency or not...

load_cf_area,
proj4_dict_to_str,
proj4_radius_parameters,
)
from pyresample.utils.proj4 import get_geostationary_height
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why the import change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more explicit this way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but maybe less consistent? Maybe all the utils should be explicit and we gret of the convenience import in utils/__init__.py?

Comment on lines +2791 to +2794
from shapely.geometry import Polygon
geo_bbox = Polygon(np.vstack((x, y)).T)
area_bbox = Polygon(((ll_x, ll_y), (ll_x, ur_y), (ur_x, ur_y), (ur_x, ll_y)))
intersection = area_bbox.intersection(geo_bbox)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very good chance I'm missing something, but could this be simplified (logically, maybe not in code) to min/max each of the coordinates in the full geostationary bounding box with the extents of the area? So if the full bbox says a point is at (10_000, -85_000_000) and the extents of the area are (-10_000, -60_000_000, 50_000, 10_000_000) then we know that the full bbox point's x coordinate is within the extent, but the y coordinate can be clipped to -60_000_000.

Not sure this is accurate, or makes sense, or is a good idea, but it is the first thing that comes to mind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's basically what we had before, in some way. The problem is to get the corner points of the intersection between the disc and the truncation, especially if only one corner of the truncated area is in space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so you're saying if the disk polygon was computed and has points partially in space that clipping to the extents of the area wouldn't effect some of those points? How does this current intersection do that?

I thought the problem in main was that the calculation of the full disk polygon coordinates were including points outside the full disk? I'll try reading the code to see if I can understand it more.

In your comment above you said:

The problem appeared when we just had an area describing a portion of the earth that did not cover the equator. In that case, the simple clipping we did to "cut" the disk was giving (unneeded) points outside the valid disk.

When you say "cut", do you mean clipping the disk coordinates to the extent of the area or do you mean how the coordinates were calculated for half the disk and then replicated for the other half?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so the polygon for the full disk looks like this:
full_disk

Now, if we just clip y at 2e6 like it is in main, we get this:
clipped_y

If you look carefully, you see the last point on each side of the y=2e6 line ends up in space, because we 'project' the extreme points of the equator onto a latitude where the sphere is not as wide.

So how do we avoid this?
One solution would be to not compute everything below y=2e6 and find the edge of the disk at that y position to close the polygon. It's not too hard, but there are many different case, whether the disk is cut horizontally, vertically, or both, includes the equator or prime meridian, or not.
A simpler geometrical approach in my opinion is just to find the intersection of the full disk polygon with the rectangle of the area extent. Since they are all in the same projection, we can see that as a simple 2d geometry problem.
full_and_box

Copy link
Member

@djhoese djhoese Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I got it. One more alternative solution: mask out (remove) any points not within the bbox. The main point here is that our extent bbox is not a complicated polygon so we only have two limits to check:

disk_points = ...
area_points = disk_points[
    (disk_points[:, 0] >= min(extent[0], extent[2])) &
    (disk_points[:, 0] <= max(extent[0], extent[2])) &
    (disk_points[:, 1] >= min(extent[1], extent[3])) &
    (disk_points[:, 1] <= min(extent[1], extent[3]))]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other thing, which is at this point getting ridiculous and we should just use shapely, would be to combine my two suggestions and every outer disk pixel gets clipped to the min/max of the inner disk pixels (the pixels inside the area's bbox).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind again, this last suggestion doesn't fix the issue. You still end up with the points being no closer to the extent of the area. Ok, shapely it is. Maybe at this point we say shapely is required? What is it being used for now? Did @ghiggi want to depend on it more heavily in the future for spherical stuff?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I plan to use it for several tasks. Could make sense to have shapely as a requirement.
Just check if we need to constrain to shapely < 2.0.0 or we ask for shapely > 2.0.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe at this point we say shapely is required? What is it being used for now?

Gradient search resampler is using shapely to find the overlapping source and target area chunk pairs so that only those are handled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made shapely a hard requirement

@mraspaud mraspaud requested review from djhoese and pnuu February 6, 2023 17:00
@djhoese
Copy link
Member

djhoese commented Feb 6, 2023

Looks like the wheel building is failing on 32-bit linux because it has to build shapely 2.x since shapely doesn't provide 32-bit wheels. This says to me we should get rid of 32-bit. I think we're already requiring ourselves to build numpy on 32-bit and if the maintainers of these packages aren't going to go through the work of building on 32-bit I don't think we can expect our selves to do it.

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for a couple questions/comments, looks good to me. Let's see what breaks. Feel free to merge when you want.

@@ -599,7 +599,7 @@ def test_check_overlap():
assert check_overlap(poly1, poly2) is False


@mock.patch('pyresample.gradient.get_geostationary_bounding_box')
@mock.patch('pyresample.gradient.get_geostationary_bounding_box_in_lonlats')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like how all these old tests mock the functions being called. I know this isn't what this PR is about, but mocking the bounding box function called by the border function in this test, what's being tested then? The same for the test below it, lots of mocks. In the future it'd be nice if we could construct an area definition that just produced the results we wanted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Let's aim to keep mocking at a minimum.

Comment on lines -2785 to -2787
Notes:
- The first and last element of the output vectors are equal.
- If nb_points is even, it will return x and y vectors of length nb_points + 1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a problem for any of the other interfaces (the bounds and polygon stuff) that the end points are not the same?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests didn't break, so I think we're good. Also I always thought that was the way we did it in pyresample.

@mraspaud
Copy link
Member Author

mraspaud commented Feb 7, 2023

Looks like the wheel building is failing on 32-bit linux because it has to build shapely 2.x since shapely doesn't provide 32-bit wheels. This says to me we should get rid of 32-bit. I think we're already requiring ourselves to build numpy on 32-bit and if the maintainers of these packages aren't going to go through the work of building on 32-bit I don't think we can expect our selves to do it.

I removed the 32 bit build now.

@mraspaud mraspaud merged commit 1e34790 into pytroll:main Feb 7, 2023
@mraspaud mraspaud deleted the fix-geos-bb branch February 7, 2023 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infinite values in geostationary bounding box crash intersection function
6 participants