Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gradient search for single band data #414

Merged
merged 1 commit into from
Feb 4, 2022

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Jan 26, 2022

Helping out a colleague, I noticed that gradient search resampler doesn't work if a single-channel dataset (shape (1, y, x)) are used.

The error message was:

Traceback (most recent call last):
  File "/home/lahtinep/bin/test_akun_resamplays.py", line 40, in <module>
    main()
  File "/home/lahtinep/bin/test_akun_resamplays.py", line 31, in main
    scn2 = scn.resample(suomi, resampler='gradient_search', fill_value=0)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 906, in resample
    self._resampled_scene(new_scn, destination, resampler=resampler,
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 822, in _resampled_scene
    res = resample_dataset(dataset, destination_area, **kwargs)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/resample.py", line 1395, in resample_dataset
    new_data = resample(source_area, dataset, destination_area, fill_value=fill_value, **kwargs)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/resample.py", line 1358, in resample
    res = resampler_instance.resample(data, **kwargs)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/pyresample/pyresample/resampler.py", line 126, in resample
    return self.compute(data, cache_id=cache_id, **kwargs)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/pyresample/pyresample/gradient/__init__.py", line 274, in compute
    res = xr.DataArray(res, dims=data_dims, coords=coords)
  File "/home/lahtinep/mambaforge/envs/pytroll/lib/python3.9/site-packages/xarray/core/dataarray.py", line 406, in __init__
    coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
  File "/home/lahtinep/mambaforge/envs/pytroll/lib/python3.9/site-packages/xarray/core/dataarray.py", line 102, in _infer_coords_and_dims
    raise ValueError(
ValueError: coords is not dict-like, but it has 3 items, which does not match the 2 dimensions of the data

This PR fixes this by only applying .squeeze() in the case of result being of higher dimensionality than the dims/coords would indicate.

  • Closes #xxxx
  • Tests added
  • Tests passed
  • Passes git diff origin/main **/*py | flake8 --diff
  • Fully documented

@pnuu pnuu added the bug label Jan 26, 2022
@pnuu pnuu requested a review from mraspaud January 26, 2022 14:32
@pnuu pnuu changed the title Squeeze the result array only if the bands and dimensions don't match Fix gradient search for single band data Jan 26, 2022
@codecov
Copy link

codecov bot commented Jan 26, 2022

Codecov Report

Merging #414 (3929c75) into main (565966f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #414   +/-   ##
=======================================
  Coverage   93.86%   93.87%           
=======================================
  Files          65       65           
  Lines       11125    11131    +6     
=======================================
+ Hits        10443    10449    +6     
  Misses        682      682           
Flag Coverage Δ
unittests 93.87% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/gradient/__init__.py 97.95% <100.00%> (+0.01%) ⬆️
pyresample/test/test_gradient.py 98.00% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 565966f...3929c75. Read the comment docs.

@djhoese
Copy link
Member

djhoese commented Jan 26, 2022

Does this error suggest that gradient search can't handle 3D+ inputs at all?

@coveralls
Copy link

Coverage Status

Coverage increased (+0.003%) to 93.695% when pulling 3929c75 on pnuu:bugfix-gradient-single-channel into 565966f on pytroll:main.

@pnuu
Copy link
Member Author

pnuu commented Jan 26, 2022

I think it will handle N-D just fine, but in 2-D case with additioal 'bands' dimension the output array was .squeeze()'d and dims/coords didn't anymore match the array shape after resampling.

@djhoese
Copy link
Member

djhoese commented Jan 26, 2022

Sorry I'm a little slow today. Doesn't this mean that the input data should have the bands dimension removed before being resampled so then the res output array is only 2D?

@pnuu
Copy link
Member Author

pnuu commented Jan 27, 2022

That's an alternative, yes. But if the input has a shape (num_bands, y, x) I'd expect the output also to be (num_bands, y2, x2) instead of (y2, x2) for some cases (num_bands = 1).

@djhoese
Copy link
Member

djhoese commented Jan 27, 2022

I guess what I'm trying to say is why isn't the low level function getting a 2D array and then the higher level code is reconstructing a 3D array if needed?

@pnuu
Copy link
Member Author

pnuu commented Jan 27, 2022

So first squeezing and then then re-expanding if necessary? Nope. This "squeeze if necessary" is a one step way for the same result. If the input data already have .ndims = 2 (no 'bands') then also the output will be that way.

@djhoese
Copy link
Member

djhoese commented Jan 27, 2022

What I'm saying is if this resampling already handles multi-band images then how/why is your needed? If this method is only supposed to handle 2D arrays and the caller is supposed to handle the split/expand of a multi-band image (ex. RGB) then this change makes more sense but I'm still wondering why it is getting a 3D array as input if the caller is supposed to give it a 2D array.

@pnuu
Copy link
Member Author

pnuu commented Jan 27, 2022

That's what the generic_image reader returned for an L-mode image, array with a shape (1, y, x). The data could also be from other sources that have for example a time dimension on otherwise 2D data, like what our writer='cf' does.

@djhoese
Copy link
Member

djhoese commented Jan 27, 2022

So what happens when you give an RGB to this resampler?

@pnuu
Copy link
Member Author

pnuu commented Jan 27, 2022

It works exactly like it did before.

@djhoese
Copy link
Member

djhoese commented Jan 27, 2022

I'm not suggesting your change breaks this. I'm suggesting that this fix isn't fixing the whole problem. If an RGB is given to this resampler, does it fail? If so, then the longer term fix (alternative to this current fix) would be to split the arrays, resample, and rejoin. If it doesn't fail, then there is something weird going on with dimensions here that whatever logic is supposed to be handling non-geo dimensions is not handling them correctly.

@pnuu
Copy link
Member Author

pnuu commented Jan 27, 2022

No, RGB doesn't fail, as it has both dims and coords for all the dimensions, and the .ndims of the array doesn't change, as .squeeze() isn't removing the 3-band dimension like it would do for an array based on L mode (1, y, x) data. In the L case there would be an extra 'bands' value in the .coords and .dims that doesn't exist in the output array because it was erroneously .squeeze()'d away. The way it is handled in this PR is consistent in retaining the shape, dimensions and coordinates of the input data.

@djhoese
Copy link
Member

djhoese commented Jan 27, 2022

Feel free to merge or wait for @mraspaud review.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but it would be nice to know why :)

@mraspaud mraspaud merged commit 1928289 into pytroll:main Feb 4, 2022
@pnuu
Copy link
Member Author

pnuu commented Feb 4, 2022

Xarray, correctly, doesn't add 3D coordinate/dimension labels to a 2D dataset. The input being originally a 3D representation of 2D data with dimensions (bands, y, x) where bands = ['L'], and the original code squeezing the single-value 'bands' dimension away from the array, the dimensionalities of the data and dims didn't match anymore.

@pnuu pnuu deleted the bugfix-gradient-single-channel branch February 4, 2022 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants