-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework affine transformation re-gridding (from 3D point cloud to 2.5D DEM) for accuracy and performance #531
Conversation
@erikmannerfelt @adehecq I want to check if you both agree on the way forward to finalize the In Instead, I think we should re-grid using And we would maintain the case of an horizontal shift separate for computational efficiency. The only delicate part of the new routine should be to re-create NaNs where we want to have NaNs after interpolation (any pixel that does not have a shifted data point), but I think I can figure that out. |
I totally agree that we should use an exact regridding for the |
So I looked into the issue in details over the weekend, and it looks like there is no "exact" solution for the reprojection to a regular 2D grid (for instance, rotating the DEM by 90°, so essentially turning it vertical along one axis, many possible elevation points could intersect a similar regular X/Y grid point). So it's not trivial... The longest computational aspect is the search for a nearest neighbour elevation to initialize the algorithm, but this should also be much faster than a triangulation. In the next days I will try to make a couple plots to characterize convergence/divergence for rotation amplitude + estimate the speed-up compared to I will also make a little schematic to explain the algorithm 😉 |
@erikmannerfelt @adehecq This PR is finished and ready for your review! 😄 (see updated description at the top) |
Alright, I had to fix a couple issues in the documentation + optimize the speed of some functions. Now all done and ready to merge! 😁 |
Fantastic !! Nice work ! 🙌 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to approve!
Glad you like it, thanks for the swift review! 😄 |
This PR updates
apply_matrix
to use efficient and accurate 3D affine transformation for rasters, for any transformation.It also modifies the exact affine transformation of a point cloud to use only NumPy instead of OpenCV, which brings us one step closer to not depend on OpenCV (which is a pretty heavy dependency to have). The only remaining functionality relying on OpenCV is the ICP algorithm, which will be addressed in #483.
Problem and summary of previous changes
In
apply_matrix
for rasters, after applying the affine transformation on the 3D point cloud converted from the DEM (which is an exact conversion, in this direction), we need to then approximate a re-gridded DEM from the transformed point cloud (for which the coordinates do not align with any grid anymore). This is an irregular gridding problem typically done using triangulation (Delauney triangles), and interpolating linearly in each (as done for point clouds by https://gdal.org/programs/gdal_grid.html#gdal-grid, PDAL through GDAL, and in Python byscipy.interpolate.griddata
).In #87 we removed
scipy.interpolate.griddata
as being too slow forapply_matrix
for rasters, and replaced it by a 2-step routine: (1) correct translations withscipy.interpolate.BivariateSpline
+ (2) correct small rotations with 2D deramping. This increased the speed a lot for everything, but deramping creates a lot of artefacts when approximating a rotation (as explained by @erikmannerfelt), and does not support scaling.New logic for affine transformation of rasters
This PR keeps the logic of #87 to differentiate the case of having only translations (to accelerate horizontal/vertical grid shifts), but implements a new iterative method to accurately and quickly resolve the regridding for small rotations, and re-introduces griddata (a bit slower but accurate) for the general case of any rotations/scaling. The artefacts @erikmannerfelt previously noticed from the
griddata
implementation were due to a bad propagation of NaNs during griddata (need to be derived as a function of the distance to the transformed points), so I also fixed this here by implementing a tested point cloud gridding function in GeoUtils: GlacioHack/geoutils#553.I also optimized the performance and added more flexibility to of a couple point cloud utilities: GlacioHack/geoutils#549, GlacioHack/geoutils#546, GlacioHack/geoutils#547, GlacioHack/geoutils#554.
Schematic of new implementation
The iterative method for small rotations aims to find the (unknown) elevation Z at (unknown) irregular coordinates X, Y which transform exactly onto the (known) regular grid X0, Y0 with (unknown = objective) elevation Z.
Because we know everything is linked by an exact affine transformation and that the initial DEM can be interpolated at any 2D coordinates X,Y to get Z, we can go faster than
griddata
. We search for coordinates X,Y derived from the inverse affine transform of X0, Y0 and an (unknown) Z, initialized with an initial closest-neighbor search, and then refined iteratively by interpolating the original DEM at coordinates X,Y (and we can re-use the same interpolator, which is quite fast).The algorithm typically converges (to X0,Y0 within 10e-5) independently for 99% of points in 4/5 iterations, and the last 1% of points take 4/5 more to reach the tolerance.
All grid points converges independently, and this works well even on a 1,000,000 point real-world DEM with outliers, noise, etc (our Longyearbyen example in this case).
Tests
This PR adds a lot of new tests for both implementations (griddata and the iterative one), some that live in GeoUtils with the underlying functionality.
Both methods are able to reproduce the real transformed elevations of the transformed DEM at a precision of 10e-5 for a constant slope DEM 🙂. For varying slopes, more differences arise as expected (near 90° slopes have multiple solutions), but mostly because we cannot check as robustly (2D linear interpolation is not equivalent before/after transformation anymore). The tests also check the spreading of NaNs through both methods to ensure they behave as expected.
Performance
Independently of the nearest neighbour search (detailed below) and data conversions (which we could try to minimize mroe generally at the package scale), the iterative algorithm is much faster than griddata, about 7 times faster for the Longyearbyen DEM! 🥳
Running a nearest neighbour search between two points clouds is necessary in both methods (to initialize the Z for the iterative affine method, and to get the locations of NaNs for the griddata method), and is currently one of the limiting factors (takes about 5 seconds on the Longyearbyen dataset, so almost as slow as
griddata
by just a factor 2).But using
scipy.KDTree
or refining themax_distance
parameter ofgeopandas.sjoin_nearest
could both improve speed a lot for this part of the methods: https://gis.stackexchange.com/questions/222315/finding-nearest-point-in-other-geodataframe-using-geopandas.UPDATE: Neither using the
max_distance
parameter ofsjoin_nearest
, nor usingscipy.spatial.KDTree
improved the speed at all, so I'm not including any additional changes on this in this PR. There might be a better way to initialize Z than with a nearest neighbour search (also utilizing knowledge from the affine transformation), which would make the "iterative" algorithm much faster overall, but I'm not inspired right now 🤔.UPDATE 2: Actually using the transformed elevation associated with the original point (even if not aligned very well in X/Y) works as well as a nearest neighbour to initialize, and only takes a couple more iterations (which are ~100x faster than a nearest neighbour search)! Problem solved, and we officially have a re-gridding for affine transforms around 10x faster than the usual irregular point re-gridding and as accurate! 🥳 🤟
Details
This PR:
to_pointcloud
forsubsample=1
, see Small fixes and optimizations forto_pointcloud
andinterp_points
geoutils#549,griddata
implementation for large rotations, see Add point cloud gridding geoutils#553,apply_matrix
.I will soon add a Python version of ICP, which will reduce even more the need for having OpenCV as an optional dependency (only if using ICP + asking for method="opencv").
Resolves #110
Resolves #499