Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

follow upstream scipy interpolation improvements #7704

Closed
dcherian opened this issue Mar 31, 2023 · 12 comments · Fixed by #9599
Closed

follow upstream scipy interpolation improvements #7704

dcherian opened this issue Mar 31, 2023 · 12 comments · Fixed by #9599

Comments

@dcherian
Copy link
Contributor

dcherian commented Mar 31, 2023

Is your feature request related to a problem?

Scipy 1.10.0 has some great improvements to interpolation (release notes) particularly around the fancier methods like pchip.

It'd be good to see if we can simplify some of our code (or even enable using these options).

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@hollymandel
Copy link
Contributor

I would like to work on this. For clarity, the goal is to support tensor product interpolation wherever scipy interpn supports it, and remove any no-longer-necessary dimension checks (as in #9049)?

@dcherian
Copy link
Contributor Author

dcherian commented Sep 5, 2024

Nice!

the goal is to support tensor product interpolation wherever scipy interpn supports it, and remove any no-longer-necessary dimension checks

Yes, I believe so.

@dcherian
Copy link
Contributor Author

dcherian commented Sep 5, 2024

We may also want to replace interp1d with our own version that class make_interp_spline as mentioned here: #9404

@hollymandel
Copy link
Contributor

hollymandel commented Sep 12, 2024

The 1d interpolation situation seems to have a few additional inconsistencies, maybe due to updates to the scipy interface.

  1. da.interp() cannot access any of the methods in Interp1dOptions due to the vectorizeable_only argument, as raised by Cannot use documented interp() methods due to "vectorizeable_only" check and kwargs name clash #9049. However, these methods are accessible via da.interpolate_na(). I suppose this could provide a workaround if someone needed these arguments.
  2. I believe that "polynomial" interpolation (in _get_interpolator, ScipyInterpolator) is misnamed and actually refers to spline interpolation, since this is how SciPy interp1d will interpret it. Indeed polynomial interpolation does not take a degree argument. However polynomial interpolation is being implemented by the BarycentricInterpolator and KroghInterpolator which are supported.
  3. I think that the implementation of spline interpolation (e.g. SplineInterpolator, accessible via interpolate_na()) is deprecated and is now performing smoothing splines rather than spline interpolation, perhaps due to the deprecation of the argument nu.
  4. It seems better to me if interpolate_na() were accomplished via a call to da.interp(), or vice versa.

I think that these point (maybe except 4) will be resolved in the course of solving the original issue, just wanted to make sure I'm on the right wavelength.

@dcherian
Copy link
Contributor Author

dcherian commented Sep 16, 2024

This seems right. I recommend opening as small a PR as possible for easy review rather than a large one that solves many issues. Let us know if you need help.

Does (1) seems like an easier place to start?

@hollymandel
Copy link
Contributor

hollymandel commented Sep 17, 2024

I am having a hard time tracing the impact of the vectorizeability of _interp_func. I wonder if there is some decorator-type behavior about detecting the vectorizeability of functions that is hidden from the stack trace?

The vectorizeable_only flag is not explicitly being passed beyond the choice of interp_class, but the choice of interp function determines the shape of the var that is passed to _interpnd.

Thanks for any insight.

@dcherian
Copy link
Contributor Author

What happens if you set vectorizeable_only to False in interp_func?

@hollymandel
Copy link
Contributor

It fails test_interpolate_vectorize in test_interp.py. Because if vectorizeable_only is False then it will use numpy.interp to interpolate. This will eventually result in an object too deep for desired array error. The reason this is mysterious to me is that this change affects the input var to the interp_func to be two-dimensional, whereas if you leave the flag in its original position the input var is one-dimensional. I cannot find where the behavior diverges prior to the call of interp_func.

@dcherian
Copy link
Contributor Author

Sorry that is a bit gnarly, this module hasn't been touched in a while, so we lack some context.

Do (2) or (3) in your list above feel more approachable?

@hollymandel
Copy link
Contributor

hollymandel commented Sep 23, 2024

I have been unable to reproduce the strange behavior described in my previous comment so I think it's actually behaving reasonably. Thanks for the response. I have submitted a PR related to #9049 and will continue working on this.

@hollymandel
Copy link
Contributor

I've moved on to the implementation of tensor product interpolators via scipy.interpolate.interpn. This has raised the following design question:

To my understanding, a few of the new interpolators (cubic and quintic tensor product splines) are "genuinely multidimensional", so an equivalent result would not be produced by applying a lower-dimensional analogue along dimensions sequentially. However missing.interp has a built in optimization to decompose interpolations in this way when possible. The optimization seems to have been introduced in response to this issue. I'm confused by the fact that overriding scipy's internal logic results in a speedup but it seems true, and there may be users relying on this optimization.

One solution would be to disable this optimization when a "genuinely multidimensional" interpolator is encountered. This would solve the issue and be backwargs-compatible. The only issue is that it would require me to figure out which interpolators are genuinely multidimensional! But the worst case scenario here is just a missed optimization and perhaps some embarrassment. My real dream would be to "pass the buck to scipy"--write things in a way that does not require any understanding of the scipy interpolators.

@dcherian
Copy link
Contributor Author

The only issue is that it would require me to figure out which interpolators are genuinely multidimensional! But the worst case scenario here is just a missed optimization and perhaps some embarrassment.

This should only be true for "linear" and "nearest". A test that compares our output to scipy's output should confirm this (and would be a good test to have!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants