Re-structure coregistration affine methods, point-raster pre-processing, binning and fitting, and add pixel interpretation support #530

rhugonnet · 2024-05-24T02:00:44Z

This PR restructures affine coregistration methods and many generic subfunctions to be consistent and modular.
Primarily, it makes 1/ "point-raster" subsampling, 2/ any binning and/or fitting, and 3/ iteration over methods, the same across all Affine and BiasCorr methods with extensive modularity.
Additionally, this PR updates horizontal shift reprojection from using a different implementation in every method (e.g. Nuth and Kääb used scipy.RectBivariateSpline) to using consistently interp from GeoUtils which supports any resampling method (for which details can be chosen in global config parameters), with consistent nodata propagation built on top, and preserves computational efficiency by allowing to return a single interpolator object re-used (GlacioHack/geoutils#560). This also allows to avoid sub-pixel errors of Rasterio reprojection.
It also re-structures the code of affine methods, defining core functions outside the classes' _fit_rst_rst or _fit_rst_pts, which allows to have consistent pre-processing steps, and clear input/outputs.
Finally, this PR adds support for pixel interpretation for coregistration methods (important for point-raster operations).

Resolves #234
Resolves #574
Resolves #561
Resolves #546
Resolves #573
Resolves #571
Resolves #547
Resolves #327 (except plot but those are mentioned in #139)
Big step forward for #435

Opens #572
Opens #575
Opens #579

Summary of changes

1/ New core processing functions

For consistent binning/fitting:
Restructures the coreg/affine module to support the fit_or_bin logic previously added in coreg/biascorr: However, as in the case of coregistration, the fit function is fixed (that of the method; for instance dh/slopetan = aspect for Nuth and Kääb), the "bin" option alone is impossible and only "fit" or "bin_and_fit" are supported. Therefore, the mirrored structure simply allows to pass a boolean argument bin_before_fit to perform binning before fitting, as well as any fit optimizer, bin statistic, and bin sizes.
To do this, this PR modifies the bin and fit of BiasCorr into a single _bin_or_and_fit_nd function in coreg.base used by all Affine and BiasCorr methods.

For consistent point-raster/raster-raster subsampling:
This pre-processing step is different from that running directly in Coreg.fit(), which simply extracts the transform, crs, and others. This step revolves around subsampling the point-raster (raster-raster, or point-raster) at the same points for a valid subsample (no NaNs) of a certain size. This cannot be done consistently in fit() before _fit_xxx, because auxiliary variables might need to be derived by the method on the full arrays (such as slope, aspect, etc), which propagate NaNs. In particular, the tricky aspect is the case of point-raster, where we need to first interpolate the raster at the coordinates of the points to get an idea of the available valid subsample (depending here also on NaN propagation).
To do this, this PR modifies the subsampling that existed in BiasCorr into a single _preprocess_pts_rst_subsample function in coreg.base relying on two steps: _get_subsample_mask_pts_rst and _subsample_on_mask. In order to allow to return a dh interpolator that can be re-used iteratively for some affine methods, another function _preprocess_pts_rst_subsample_with_dhinterpolator in coreg.affine relies on a different second step _subsample_on_mask_with_dhinterpolator.

For consistent iterative methods:
Methods such as ICP, Nuth and Kääb, etc, all work based on iterations. To have consistent treatment of inputs/outputs and iterative arguments (tolerance to reach, maximum number of iterations), those are now grouped through an _iterate_method function.

Ensuring consistent argument across these core functions and other parameters:
To ensure the argument passed to these three core functions (subsampling, binning/fitting, and iteration) are consistent, new dictionary types are defined for each which defines their key names and value types: InFitOrBinDict ("In" for input), InRandomDict and InIterativeDict. Additionally InAffineDict for affine-related parameters, and InSpecificDict for any method-specific parameters that does not fit other categories.
Same is done for outputs: OutRandomDict, etc.
See #573 for details on the new organization.

2/ Robust and consistent horizontal shift reprojection

The core of fit_ functions of horizontal methods such as NuthKaab and GradientDescending has been modify to rely on the _preprocess_pts_rst_subsample_with_dhinterpolator mentioned above.
This function relies on geoutils.interp_points (directly for point-raster, and through a new function _reproject_horizontal_shift_samecrs for raster-raster) to perform a consistent 2D grid interpolation with nodata propagation supporting any SciPy method, see details here: GlacioHack/geoutils#560.
This PR adds a test checking that the result of this functionality matches GDAL reprojection, which ensures Rasterio sub-pixel reprojection errors are gone.

3/ Re-structuration of affine classes

All code from affine classes is moved out of the _fit_xxx functions into functions that rely on the same steps where possible.
For instance, the NuthKaab method is divided into a _nuthkaab_fit_func to define the function to optimize, a _nuthkaab_fit which wraps _bin_or_and_fit_nd with some initial guess, a _nuthkaab_aux_vars that derives auxiliary variables required for the fit (slope, aspect), and a _nuthkaab_iteration_step that is passed to _iterate_method.

All the _apply_xxx affine functions are removed to avoid unnecessary redundancies/inconsistencies, as the apply_matrix is the same across all affine methods, and since #531 supports resample=False that modifies only the geotransform for horizontal shifts.

4/ Add support for pixel interpretation

Dealing with area_or_point was previously done arbitrarily in each subclass. Using the above interpolation/reprojection (through geoutils.interp_points), we can now simply pass the argument to GeoUtils who accounts for it when comparing point-raster, or raises a warning for raster-raster having a different interpretation.
It is only required for the fit part of coregistration, and thus added to all related functions.

5/ Removal of `Tilt`, and other subfunctions

With the above changes, the affine Tilt function corresponds exactly to Deramp(order=1) (instead of having its own, separate implementation). Additionally, it is not truly affine (cannot be modelled by an affine matrix), and thus has no place in affine methods. We can simply remove, and users can rely on Deramp instead.
All sub-functions that were created to perform some processing steps in coregistration modules (many times duplicated from another method, or from a GeoUtils functionality) for a specific affine method are now removed. This is largely thanks to GlacioHack/geoutils#560 which re-structures GeoUtils raster functionalities to be called without having to build a Raster object (simply passing array, transform, CRS, etc).
Until now, we were deconstructing/reconstructing Raster objects everywhere to call simple functions (interp_points, coords, etc), which made us lose performance. This is now largely solved too!
A couple exceptions remains: warp_dem used only by BlockwiseCoreg and _mask_as_array used only by the dDEM class, but their use seems more specific and I'm not sure it is covered by anything in GeoUtils right now.

Tests

I did not write new tests on purpose in this PR to ensure I could check that all these changes do not affect our current methods significantly. Improving tests (in particular for affine methods) will be addressed in a next PR.

And indeed, all tests are passing with minor modifications! 🥳

Additionally, I increased the speed of the tests following #574 (ICP tests were extremely slow and cumulated to 5min+, or half of xDEM's test length). Now nearly all the tests of test_coreg/ run on a cropped version of the reference/tba example DEMs.
And all of xDEM's test run in 5min instead of 12min+. 😃

To-do-list

Will do in separate PRs

Add robust tests as in BiasCorr, see list in Improve tests in test_affine once core class restructured with BiasCorr parameters #485,
Document in Update documentation and warnings before 0.1 release #502.

…roject

rhugonnet · 2024-08-28T01:38:10Z

@adehecq @erikmannerfelt This PR is ready for your review!
All tests passing locally with the GeoUtils branch GlacioHack/geoutils#560. 🥳

I have two main aspects on which I'm hesitating on how to move forward, and would especially like your opinion on:

Currently I defined a nuth_kaab main function called by NuthKaab._fit_xxx directly (and this for all methods), which is itself made of some of the new core subfunctions (listed in the main description, point 1/) and a method-specific one. This ensure things are consistent, but is annoying to save variables to the Coreg metadata that are only available at a given step. For instance, for an easy one: What is the final subsample size on valid values? Not available right now. A harder one: What is the binning/parameters at a given iteration step? (might not be necessary, but could become a need in the future for plotting or other)
I see two solutions here to make that easier:
a) Either we move the content of nuth_kaab to NuthKaab ._fit_xxx, and wrap the subfunctions called in nuth_kaab as methods of Coreg which always pass/save the metadata consistently when they run (you can see I already did this with _bin_or_and_fit_nd for BiasCorr, it is duplicated as a Coreg method in order to pass/write the metadata). It would make it impossible to call the same nuth_kaab separately than through the Coreg class (but I'm not sure that's a need?), but make those new class methods consistent.
b) Or we define new function inputs/outputs (dictionaries?) for each of these subfunctions, to pass metadata of interest through the functions back to the class (could become a hassle very fast).
The CoregDict is getting busy and all over the place. It seems like it could become a combination of the new dicts created in this PR: RandomDict, FitOrBinDict, IterativeDict (and maybe an output dict on the results?)...
How should we structure this? Having a nested dictionary would make it well organized, but maybe harder for users to access a specific key easily, even though it's all there. If we did this, we could add an info() method that prints the metadata nicely to partially remedy that. Any other ideas?

rhugonnet · 2024-09-05T00:49:16Z

Agreed on all.
Except maybe for "it is certainly more difficult to understand when reading the code, maintain and debug.". I think in time (getting used to the functions) and maybe with a Wiki page, it will be much easier to maintain and debug a couple consistent core functions, rather than each Coreg subclass doing its own thing 😉.

For computation times: Similar for Raster-Raster, and faster for Point-Raster (due to avoiding rebuilding a Raster to call interp_points, now this is done from the array/transform directly 😄).

Good idea, I will try to write a short Wiki page why this is fresh!

rhugonnet · 2024-09-05T01:10:44Z

Merging like this!
Want to know @erikmannerfelt's opinion before changing Coreg.meta into Coreg.inputs and Coreg.outputs. I actually don't dislike meta too much (data about other data: the coregistration algorithm in this case), as long as it's organized inside.

rhugonnet · 2024-09-05T22:39:57Z

FYI: I've added a Wiki page here on the Coreg structure and steps to implement a new method: https://github.com/GlacioHack/xdem/wiki/Things-to-know-on-coregistration-class-structure

adehecq · 2024-09-06T11:37:15Z

FYI: I've added a Wiki page here on the Coreg structure and steps to implement a new method: https://github.com/GlacioHack/xdem/wiki/Things-to-know-on-coregistration-class-structure

Perfect, thanks a lot ! That's super useful !

rhugonnet added 2 commits May 23, 2024 17:58

Incremental commit on affine reorg

53c34f3

Merge remote-tracking branch 'upstream/main' into reorg_affine

ea0d71c

rhugonnet marked this pull request as draft June 7, 2024 16:14

Incremental commit to reorg

77eabe1

rhugonnet mentioned this pull request Jun 17, 2024

Bug with raster-point coregistration #546

Closed

rhugonnet added 4 commits August 6, 2024 14:49

Fix resolution error

11449c1

Merge remote-tracking branch 'upstream/main' into reorg_affine

f2c5952

Finalize same CRS reprojection based on SciPy, add test with GDAL rep…

d23e3f1

…roject

Add further comments on tests

0a4cca4

rhugonnet mentioned this pull request Aug 7, 2024

Add a single consistent function for iterating methods to a tolerance #561

Closed

rhugonnet changed the title ~~Re-structure AffineCoreg subclasses to be more modular~~ Re-structure AffineCoreg subclasses to be more consistent and modular Aug 9, 2024

rhugonnet mentioned this pull request Aug 9, 2024

[POC] cli step 3 add coreg nuth&kaab #554

Open

7 tasks

rhugonnet added 3 commits August 22, 2024 17:23

Incremental commit to new affine coregs

7272267

Finalize new structure

4c1944f

Linting but not all mypy

530eaf5

rhugonnet mentioned this pull request Aug 27, 2024

Make mypy happy when initializing TypeDicts from coregistration methods #571

Closed

Linting

fad5e89

rhugonnet mentioned this pull request Aug 27, 2024

Add tests for pixel interpretation in coregistration methods #572

Open

rhugonnet changed the title ~~Re-structure AffineCoreg subclasses to be more consistent and modular~~ Consistent coregistration: re-structure affine methods, group point-raster pre-processing, binning a and pixel interpretation support Aug 28, 2024

rhugonnet requested review from adehecq and erikmannerfelt August 28, 2024 01:38

Finalize tests and linting

be3bd68

rhugonnet marked this pull request as ready for review August 28, 2024 21:56

rhugonnet mentioned this pull request Aug 28, 2024

Organize coregistration metadata #573

Closed

rhugonnet added 2 commits August 29, 2024 14:46

Implement new coreg nested dictionary structure

c907bbd

Linting and corrections

ba0a7e3

rhugonnet mentioned this pull request Aug 30, 2024

Speed-up ICP-related tests #574

Closed

rhugonnet mentioned this pull request Sep 3, 2024

Add options to customize print statements of iteration function #579

Open

Amaurys comments

b31acf9

rhugonnet dismissed adehecq’s stale review via b31acf9 September 4, 2024 00:24

rhugonnet added 12 commits September 3, 2024 17:33

Update geoutils' version

ed62d57

Try with lower threshold

6f62783

Fix tests

3d13cfa

Linting

1076a2b

Change default tolerance for tests

021d783

Try like this

b01bbf7

Linting

0c142f4

Add tests for Coreg.info()

4884a12

Linting

0a2d2b4

Add test for Coreg.info

8cdc72a

Enlarge tolerance now that shifts are not in pixels anymore

57d2055

Skip DirectionalBias test

bb0b6e3

rhugonnet merged commit 86f8a57 into GlacioHack:main Sep 5, 2024
21 checks passed

rhugonnet deleted the reorg_affine branch September 5, 2024 01:10

This was referenced Sep 5, 2024

Add ICP in Python #581

Draft

Optimize coregistration module performance for Xarray/Raster objects #543

Closed

[POC] Report #569

Open

adehecq mentioned this pull request Sep 17, 2024

[POC] cli 4 : adopt every coregistration method #556

Open

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-structure coregistration affine methods, point-raster pre-processing, binning and fitting, and add pixel interpretation support #530

Re-structure coregistration affine methods, point-raster pre-processing, binning and fitting, and add pixel interpretation support #530

rhugonnet commented May 24, 2024 •

edited

Loading

rhugonnet commented Aug 28, 2024 •

edited

Loading

rhugonnet commented Sep 5, 2024

rhugonnet commented Sep 5, 2024

rhugonnet commented Sep 5, 2024

adehecq commented Sep 6, 2024

Re-structure coregistration affine methods, point-raster pre-processing, binning and fitting, and add pixel interpretation support #530

Re-structure coregistration affine methods, point-raster pre-processing, binning and fitting, and add pixel interpretation support #530

Conversation

rhugonnet commented May 24, 2024 • edited Loading

Summary of changes

1/ New core processing functions

2/ Robust and consistent horizontal shift reprojection

3/ Re-structuration of affine classes

4/ Add support for pixel interpretation

5/ Removal of Tilt, and other subfunctions

Tests

To-do-list

Will do in separate PRs

rhugonnet commented Aug 28, 2024 • edited Loading

rhugonnet commented Sep 5, 2024

rhugonnet commented Sep 5, 2024

rhugonnet commented Sep 5, 2024

adehecq commented Sep 6, 2024

rhugonnet commented May 24, 2024 •

edited

Loading

5/ Removal of `Tilt`, and other subfunctions

rhugonnet commented Aug 28, 2024 •

edited

Loading