Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Homogenize API arguments #440

Merged
merged 12 commits into from
Jan 19, 2024
Merged

Conversation

atedstone
Copy link
Member

@atedstone atedstone commented Jan 18, 2024

This PR causes breaking changes to API

Resolves the rest of #432.

  • Remove dst_
  • Replace rst with raster
  • grid_size to size
  • The various forms of sample (see below)
  • Always use points, never pts

Grammar of sampling

This PR proposes to regularise the grammar around the word sample and its derivatives.

  • When the intention is to return a (random) sample, the term sample is used. Previously, subset or subsample were both used to denote this purpose.
  • downsample or upsample defines a change in a Raster's resolution. The more generic term is resample, but the use of a specific direction is preferred. Try not to use resample as this has connotations of towards reproject.
  • The terms subset and subsample refer to an contiguous portion of data (e.g. an x*y window of a Raster, specific bands of a Raster, or row n...row m of a Vector), without a change in resolution. I am regularising to subset.

@atedstone atedstone marked this pull request as ready for review January 19, 2024 10:47
Copy link
Contributor

@rhugonnet rhugonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @atedstone, this is great!! I didn't find anything to comment on in the code changes.

I like the homogenization to sample and differentiation with subset! 😄
It looks like subset is used only in split_bands, we could even rename into something specific to that function like bands to avoid any kind of confusion?

Copy link
Member

@adehecq adehecq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work !
As discussed on slack, I find that removing the dst_ prefix everywhere makes sense from a user/API point of view, but it may be a bit less easy to read the code. We'll see!
I find that size is not clear enough. I would favor grid_size to clarify that we're talking about the dimensions of the array. But we can revert in a future PR if we all agree.

@adehecq
Copy link
Member

adehecq commented Jan 19, 2024

It looks like subset is used only in split_bands, we could even rename into something specific to that function like bands to avoid any kind of confusion?

For consistency, we could use the same terminology as in load which is indexes. Although I'm not a huge fan of that term, I always forget the name...

@adehecq adehecq merged commit 5dcf6ef into GlacioHack:main Jan 19, 2024
13 checks passed
@atedstone
Copy link
Member Author

I also don't find indexes that intuitive. From the xarray side of life, index can be along any dimension. How about the keyword band/bands instead, for both load() and split_bands()?

@adehecq
Copy link
Member

adehecq commented Jan 22, 2024

I also don't find indexes that intuitive. From the xarray side of life, index can be along any dimension. How about the keyword band/bands instead, for both load() and split_bands()?

I would be fine with bands. But this would just be undoing what was done in a previous PR: #346. Any thoughts @rhugonnet ?

@rhugonnet
Copy link
Contributor

+1 for grid_size and for bands!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants