-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix geostationary bbox having inf values #493
Conversation
Codecov Report
@@ Coverage Diff @@
## main #493 +/- ##
=======================================
Coverage 94.32% 94.33%
=======================================
Files 74 74
Lines 12889 12947 +58
=======================================
+ Hits 12158 12214 +56
- Misses 731 733 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@ghiggi I would just like to confirm with you that this is the correct fix. I looks like that for pyresample <=1.25.0, having |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spherical polygon should not receive np.nan or np.inf values.
This was occurring because of past bugs related to inplace modifications etc.
Here I would rather ask myself why Proj returns inf values ...
pyresample/geometry.py
Outdated
@@ -2818,6 +2818,8 @@ def get_geostationary_bounding_box_in_lonlats(geos_area, nb_points=50): | |||
""" | |||
x, y = get_geostationary_bounding_box_in_proj_coords(geos_area, nb_points) | |||
lons, lats = Proj(geos_area.crs)(x, y, inverse=True) | |||
lons[np.isinf(lons)] = np.nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why here we have inf returned by Proj? I remember this function to return the coordinates of the GEO field of view (i.e the circle for FD) ... I would rather drop Inf/Nan values here !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason some of the pixels must end up in space, in which case PROJ returns an infinite value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the whole point of the get_geostationary_bounding_box_in_proj_coords
method so that it doesn't return space pixels? Is there a fix we can make to that function to clean this up? Any idea how far the (x, y)
coordinates producing inf
are from not producing an inf
? I mean, how close are they to the on-earth disk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should also say, I don't know if I've tested with the most recent code but I've definitely had cases where polygons with NaNs in the coordinates produced either NaN
.area
results or caused infinite loops when trying to do intersections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to have a look tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hhhmm the y
values are definitely not right for a full disk geos area (if that's what this is). The second half are all the same y-position:
In [7]: x
Out[7]:
array([-5.43062255e+06, -5.16482897e+06, -4.39346593e+06, -3.19203985e+06,
-1.67815466e+06, 3.32529726e-10, 1.67815466e+06, 3.19203985e+06,
4.39346593e+06, 5.16482897e+06, 5.43062255e+06, 5.16482897e+06,
4.39346593e+06, 3.19203985e+06, 1.67815466e+06, 3.32529726e-10,
-1.67815466e+06, -3.19203985e+06, -4.39346593e+06, -5.16482897e+06,
-5.43062255e+06])
In [8]: y
Out[8]:
array([1393687.2705 , 1672427.79006384, 3181146.69554664,
4378472.79811701, 5147203.47659387, 5412090.01610633,
5147203.47659387, 4378472.79811701, 3181146.69554663,
1672427.79006384, 1393687.2705 , 1393687.2705 ,
1393687.2705 , 1393687.2705 , 1393687.2705 ,
1393687.2705 , 1393687.2705 , 1393687.2705 ,
1393687.2705 , 1393687.2705 , 1393687.2705 ])
Edit: Ah, this is the top half of the SEVIRI image isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I have to give up. I'm lost in the trigonometry of get_geostationary_angle_extent
(where is the 1 -
coming from?). I've determined that the x coordinate needs to be moved over 178096 meters before it produces a valid lon/lat. However, if I go the other way and convert lon/lat to x/y, I can do p(-71.48, 14.83764943645431)
and get -5252536.339579278
which is about 10 meters further west than I can go in the x/y -> lon/lat direction. Either way we're like 178km from the edge of the projection for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I can't have a look ... but I am overwhelmed by other tasks :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hhhmm the
y
values are definitely not right for a full disk geos area (if that's what this is). The second half are all the same y-position:Edit: Ah, this is the top half of the SEVIRI image isn't it?
yes, so that's normal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I found the problem. Working on a fix.
Have you tried if this by any chance also fixes #393? |
I did and it doesn't unfortunately. |
Alright, I think this is ready for re-review @djhoese @ghiggi. The problem was quite simple, but the solution is not as easy as I would have liked. |
pyresample/geometry.py
Outdated
load_cf_area, | ||
proj4_dict_to_str, | ||
proj4_radius_parameters, | ||
) | ||
from pyresample.utils.proj4 import get_geostationary_height |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why the import change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more explicit this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but maybe less consistent? Maybe all the utils
should be explicit and we gret of the convenience import in utils/__init__.py
?
from shapely.geometry import Polygon | ||
geo_bbox = Polygon(np.vstack((x, y)).T) | ||
area_bbox = Polygon(((ll_x, ll_y), (ll_x, ur_y), (ur_x, ur_y), (ur_x, ll_y))) | ||
intersection = area_bbox.intersection(geo_bbox) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a very good chance I'm missing something, but could this be simplified (logically, maybe not in code) to min/max each of the coordinates in the full geostationary bounding box with the extents of the area? So if the full bbox says a point is at (10_000, -85_000_000)
and the extents of the area are (-10_000, -60_000_000, 50_000, 10_000_000)
then we know that the full bbox point's x
coordinate is within the extent, but the y
coordinate can be clipped to -60_000_000
.
Not sure this is accurate, or makes sense, or is a good idea, but it is the first thing that comes to mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's basically what we had before, in some way. The problem is to get the corner points of the intersection between the disc and the truncation, especially if only one corner of the truncated area is in space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so you're saying if the disk polygon was computed and has points partially in space that clipping to the extents of the area wouldn't effect some of those points? How does this current intersection do that?
I thought the problem in main
was that the calculation of the full disk polygon coordinates were including points outside the full disk? I'll try reading the code to see if I can understand it more.
In your comment above you said:
The problem appeared when we just had an area describing a portion of the earth that did not cover the equator. In that case, the simple clipping we did to "cut" the disk was giving (unneeded) points outside the valid disk.
When you say "cut", do you mean clipping the disk coordinates to the extent of the area or do you mean how the coordinates were calculated for half the disk and then replicated for the other half?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so the polygon for the full disk looks like this:
Now, if we just clip y at 2e6 like it is in main, we get this:
If you look carefully, you see the last point on each side of the y=2e6 line ends up in space, because we 'project' the extreme points of the equator onto a latitude where the sphere is not as wide.
So how do we avoid this?
One solution would be to not compute everything below y=2e6 and find the edge of the disk at that y position to close the polygon. It's not too hard, but there are many different case, whether the disk is cut horizontally, vertically, or both, includes the equator or prime meridian, or not.
A simpler geometrical approach in my opinion is just to find the intersection of the full disk polygon with the rectangle of the area extent. Since they are all in the same projection, we can see that as a simple 2d geometry problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think I got it. One more alternative solution: mask out (remove) any points not within the bbox. The main point here is that our extent bbox is not a complicated polygon so we only have two limits to check:
disk_points = ...
area_points = disk_points[
(disk_points[:, 0] >= min(extent[0], extent[2])) &
(disk_points[:, 0] <= max(extent[0], extent[2])) &
(disk_points[:, 1] >= min(extent[1], extent[3])) &
(disk_points[:, 1] <= min(extent[1], extent[3]))]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only other thing, which is at this point getting ridiculous and we should just use shapely, would be to combine my two suggestions and every outer disk pixel gets clipped to the min/max of the inner disk pixels (the pixels inside the area's bbox).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind again, this last suggestion doesn't fix the issue. You still end up with the points being no closer to the extent of the area. Ok, shapely it is. Maybe at this point we say shapely is required? What is it being used for now? Did @ghiggi want to depend on it more heavily in the future for spherical stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I plan to use it for several tasks. Could make sense to have shapely as a requirement.
Just check if we need to constrain to shapely < 2.0.0 or we ask for shapely > 2.0.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe at this point we say shapely is required? What is it being used for now?
Gradient search resampler is using shapely
to find the overlapping source and target area chunk pairs so that only those are handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made shapely a hard requirement
Looks like the wheel building is failing on 32-bit linux because it has to build shapely 2.x since shapely doesn't provide 32-bit wheels. This says to me we should get rid of 32-bit. I think we're already requiring ourselves to build numpy on 32-bit and if the maintainers of these packages aren't going to go through the work of building on 32-bit I don't think we can expect our selves to do it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for a couple questions/comments, looks good to me. Let's see what breaks. Feel free to merge when you want.
@@ -599,7 +599,7 @@ def test_check_overlap(): | |||
assert check_overlap(poly1, poly2) is False | |||
|
|||
|
|||
@mock.patch('pyresample.gradient.get_geostationary_bounding_box') | |||
@mock.patch('pyresample.gradient.get_geostationary_bounding_box_in_lonlats') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like how all these old tests mock the functions being called. I know this isn't what this PR is about, but mocking the bounding box function called by the border function in this test, what's being tested then? The same for the test below it, lots of mocks. In the future it'd be nice if we could construct an area definition that just produced the results we wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Let's aim to keep mocking at a minimum.
Notes: | ||
- The first and last element of the output vectors are equal. | ||
- If nb_points is even, it will return x and y vectors of length nb_points + 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a problem for any of the other interfaces (the bounds and polygon stuff) that the end points are not the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests didn't break, so I think we're good. Also I always thought that was the way we did it in pyresample.
I removed the 32 bit build now. |
As described in #492 , this PR fixes problems in geostationary bbox computation
git diff origin/main **/*py | flake8 --diff