Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negative values predicted by interp_nd_binning problematic for error analysis #380

Closed
rhugonnet opened this issue Jun 28, 2023 · 2 comments · Fixed by #389 or #446
Closed

Negative values predicted by interp_nd_binning problematic for error analysis #380

rhugonnet opened this issue Jun 28, 2023 · 2 comments · Fixed by #389 or #446
Labels
priority Needs to be fixed rapidly

Comments

@rhugonnet
Copy link
Member

rhugonnet commented Jun 28, 2023

Opening after a discussion started by Enrico Mattea:

The linear extrapolation of interp_nd_binning can result in negative values (and, more generally, yield curious values) for points outside of the binning maximum bounds, especially as the "edge bins" tend to have less points to sample from, and thus more variability:

# RegularGridInterpolator to perform linear interpolation/extrapolation on the grid

It does happen that slope/curvature have very high values because of outliers, and those can be ignored during error prediction (min_vals argument of interp_nd_binning).
If those outliers remain unfiltered in the dDEM, some of the predicted errors could then be negative? (untested)

So: should we rather default to nearest neighbour outside of the bin bounds, and leave the linear extrapolation option to the user?

@adehecq
Copy link
Member

adehecq commented Jul 6, 2023

So interp_nd_bininng does linear extrapolation outside the bounds? I thought that it was doing nearest neighbor from looking at the docstring.
"Fills the no-data present on the regular N-D binning grid with nearest neighbour from scipy.griddata, then provides an interpolant function that linearly interpolates/extrapolates using scipy.RegularGridInterpolator."
So in which case does it use the nearest neighbor and which case the RegularGridInterpolator?

I would indeed put NN as the default and linear extrapolation as an option. Alternatively, one could also include an option to set the bounds of the extrapolated values.

@rhugonnet
Copy link
Member Author

The nearest neighbour was just to fill nodata values for points of the ND binning (not supported by interpolators). All other interpolation (between bins and outside bins) was linear.

I've fixed all and hopefully clarified the descriptions in #389

@rhugonnet rhugonnet added the priority Needs to be fixed rapidly label Aug 1, 2023
MatteaE added a commit to MatteaE/xdem that referenced this issue Nov 9, 2023
rhugonnet added a commit that referenced this issue Nov 16, 2023
)

* Fixed use of nearest-neighbour extrapolation when using interpolation_method='linear' in interp_nd_binning (#380)

* Added possibility in fit_sum_model_variogram to pass a value for maxfev, which is passed on to curve_fit

* Added possibility to save plots to file (variogram, 1d- and 2d-binning)

* Add pd.interval reformatting to nd_binning plotting functions

* Linting

* Add test for fix

* Tentative fix for plot1d and 2d binning with string pd.Interval

---------

Co-authored-by: Romain Hugonnet <romain.hugonnet@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority Needs to be fixed rapidly
Projects
None yet
2 participants