Add dtype validation when rescale=True #209
@gjoseph92 I would like to assign you as a reviewer. I am hoping this mention enables this as per the solution here (I am more of a GitLab person myself).
Thanks @RSchueder! This looks like a good start. I was thinking to make the check conditional on the `scale` and `offset` values we actually read from STAC metadata, though. If you specify `dtype=int, rescale=True`, but all assets have `scale=1, offset=0` (or no scale and offset at all), that shouldn't cause this error. Probably right after we pull out the values from metadata here:
stackstac/stackstac/prepare.py
Line 158 in 7836a36
asset_offset = raster_bands[0].get('offset', 0)
An easy check (`for v in [scale, offset]`) would just be `np.can_cast(v, dtype)`. I think that should catch:
- Wrong dtype kind (int vs float)
- Wrong dtype size (uint8 vs uint64)
- Wrong dtype signing (uint8 vs int8) (this one is a very rare case, but also slightly tricky, since I suppose `offset=-1` doesn't necessarily mean the array needs to be signed, so long as all data values are >0)
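A minimal sketch of the check described above, assuming only NumPy. The helper names `fits_dtype` and `check_rescale_dtype` are hypothetical and not part of stackstac. One assumption goes beyond the comment: NumPy 2.x rejects plain Python scalars in `np.can_cast`, so the sketch first maps each value to a dtype with `np.min_scalar_type`. Identity values (`scale=1`, `offset=0`) are skipped so that they never trigger the error.

```python
import numpy as np


def fits_dtype(value, dtype) -> bool:
    """Return True if `value` can be safely represented in `dtype`."""
    # np.min_scalar_type picks the smallest dtype able to hold the value,
    # e.g. uint8 for 2, int8 for -1, float16 for 0.0001.
    return bool(np.can_cast(np.min_scalar_type(value), np.dtype(dtype)))


def check_rescale_dtype(scale, offset, dtype) -> None:
    """Hypothetical helper: raise if rescaling cannot be done in `dtype`."""
    for name, v, identity in [("scale", scale, 1), ("offset", offset, 0)]:
        # An identity scale/offset (or missing metadata defaulting to it)
        # leaves the data unchanged, so it should not cause an error.
        if v == identity:
            continue
        if not fits_dtype(v, dtype):
            raise ValueError(
                f"rescale=True, but {name}={v!r} cannot be safely cast "
                f"to {np.dtype(dtype)}"
            )
```

With this sketch, `check_rescale_dtype(1, 0, "uint16")` passes silently, while `check_rescale_dtype(0.0001, 0, "uint16")` raises a `ValueError`. It is deliberately strict about signedness: `offset=-1` with `uint8` is rejected even though, as noted above, the data might still fit.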
stackstac/stack.py
Outdated
@@ -276,6 +276,9 @@ def stack(
    The size of ``y`` and ``x`` will be determined by ``resolution`` and ``bounds``, which in many cases are
    automatically computed from the items you pass in.
    """
    if rescale:
        assert "float" in np.dtype(dtype).name, "The requested dtype is incompatible with rescale=True, which requires a floating-point dtype."
- assert "float" in np.dtype(dtype).name, "The requested dtype is incompatible with rescale=True, which requires a floating-point dtype."
+ assert np.dtype(dtype).kind in ("f", "c"), "The requested dtype is incompatible with rescale=True, which requires a floating-point dtype."
nit: using `dtype.kind` is probably the more proper way to write this test. Also, complex floats would be valid too (could come up for SAR data).
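To illustrate the difference between the two tests, here is a small sketch assuming only NumPy. The function name `supports_rescale` is hypothetical. It relies on `np.dtype(...).kind`, which is `"f"` for floating-point and `"c"` for complex dtypes.

```python
import numpy as np


def supports_rescale(dtype) -> bool:
    """Return True for floating-point ("f") and complex ("c") dtypes."""
    return np.dtype(dtype).kind in ("f", "c")


# The substring test on the dtype name accepts float dtypes but
# misses complex ones, whose names start with "complex".
name_check_complex = "float" in np.dtype("complex64").name
kind_check_complex = supports_rescale("complex64")
```

Here `name_check_complex` is `False` while `kind_check_complex` is `True`, which is the SAR case mentioned above. Integer dtypes such as `int` or `"uint8"` are rejected by both tests.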
Thanks! I was searching for a good way to do this and could not find one, this is handy.
@gjoseph92 I am not sure we can ever have a If this is true, isn't actually checking the value a bit redundant since we know it will be a float?
Okay, so even if the
Sorry for the delay @RSchueder. I'd still like to move the check around here stackstac/stackstac/prepare.py Line 158 in 7836a36
That way, you get the error immediately, instead of having to wait until
No worries @gjoseph92! Gotcha, I didn't want to modify the signature of
Thanks @RSchueder! I appreciate thinking about not modifying the signature. Since that's an internal function, it didn't matter much in this case.
I just pushed up a slight clarification to the error message you wrote. Looks good!
This MR adds validation of the inputs to `stackstac.stack` to ensure that the output `dtype` is compatible when `rescale=True`. Rescaling requires that the output dtype is floating-point.