WarpedVRT.read(masked=True) and WarpedVRT.read_masks don't use overviews #1373
Comments
Thanks for pointing me at marblecutter's mask reader. That's what I'd like to accomplish in Rasterio, but for nodata more generally (nodata values, alpha bands, sidecars, and internal bitmasks). As far as I can tell, reading a mask from Rasterio's WarpedVRT when the source has a sidecar or internal bitmask is going to be super slow (requiring materialization of much of the source) until we can get a proper VRT mask band added to the result of |
Yup. I would love to use internal masks some day, but this is a reasonable workaround in the meantime. |
@mojodna #1378 forbids boundless reads from WarpedVRTs, because the boundless read implementation now uses an ordinary VRT (as an XML string) and there's no way to reference a WarpedVRT from the XML. I think that you should be able to make the call at https://github.com/mojodna/marblecutter/blob/a4f66cbf7124f25cf15bec8bc0e37ef195119de6/marblecutter/__init__.py#L277-L279 obsolete by setting the parent WarpedVRT's extents to this window. Before I removed WarpedVRT from the boundless read implementation I found that I needed to buffer its extents by a few pixels in order to trigger GDAL's overview use. That's why the width and height of the WarpedVRT in the old boundless read impl were incremented by 1. |
Sorry, I don't follow. Are you suggesting that I set the extent for: https://github.com/mojodna/marblecutter/blob/a4f66cbf7124f25cf15bec8bc0e37ef195119de6/marblecutter/__init__.py#L264-L272 If so, how would I do that? Through the combination of
Does rasterio include a simple way to buffer? (It's been a while since I've touched this bit of code, so I'm less confident in my ability to buffer it myself.) |
@mojodna I haven't tested the code below, but given From the Use these to construct a WarpedVRT and then read the full extent of it into the
Please disregard what I wrote about the pixel buffer; it's not relevant. But if you did want to pad the VRT for other reasons, you could translate its transform matrix west and north by a pixel's distance, add 2 pixels to the width and height, and then call
The key to triggering use of the overviews is this: the resolution of your output, defined by
I have an example at https://github.com/mapbox/rasterio/blob/master/tests/test_warpedvrt.py#L329 of external overviews being used when decreasing the resolution of read data by a factor of two. |
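A minimal sketch of the padding step described above (shift the transform west and north by one pixel, add 2 pixels to the width and height), written in plain Python so it runs without rasterio. The helper name `pad_extent` is hypothetical, and the coefficient tuple is assumed to follow the `Affine(a, b, c, d, e, f)` ordering used elsewhere in this thread; the sample numbers are taken from the reproduction script further down.

```python
def pad_extent(coeffs, width, height, pad=1):
    """Grow a raster grid by `pad` pixels on every side.

    `coeffs` is (a, b, c, d, e, f), in the same order as Affine(a, b, c, d, e, f):
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    a, b, c, d, e, f = coeffs
    # The new origin is where pixel (-pad, -pad) of the old grid sits.
    new_c = c - pad * a - pad * b
    new_f = f - pad * d - pad * e
    return (a, b, new_c, d, e, new_f), width + 2 * pad, height + 2 * pad


# Hypothetical usage with the zoom 21 grid from this thread.
padded_coeffs, padded_width, padded_height = pad_extent(
    (0.04, 0, 3876179.03, 0, -0.04, 3751873.31), 480, 480
)
print(padded_coeffs, padded_width, padded_height)
```

The padded coefficients could then presumably be passed as `Affine(*padded_coeffs)` together with the new width and height when constructing the WarpedVRT; that last step is not exercised here.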
FYI, here is how I managed to fetch the overviews and do a boundless-like tile read (following ☝️ Sean's guidance): https://gist.github.com/vincentsarago/bf2949dc0e3488def3caab71f3ced416 |
I'm still having problems with this. When overviews are triggered on a source with an internal mask, the materialized mask appears to be incorrect:
# rio insp https://mojodna-temp.s3.amazonaws.com/internal-mask.tif
import rasterio
from affine import Affine
from rasterio.vrt import WarpedVRT

# `rio insp` provides the opened dataset as `src`; open it explicitly when running this as a script
src = rasterio.open("https://mojodna-temp.s3.amazonaws.com/internal-mask.tif")

# 21/1251419/852238
transform1 = Affine(0.04, 0, 3876179.03, 0, -0.04, 3751873.31)
bounds1 = (3876179.0321125016, 3751854.20560666, 3876198.141369573, 3751873.314863731)

# 20/625709/426119
transform2 = Affine(0.04, 0, 3876159.92, 0, -0.04, 3751873.31)
bounds2 = (3876159.9228554303, 3751835.0963495895, 3876198.141369573, 3751873.314863731)

# Zoom 21 tile: band 4 is the alpha band added by add_alpha=True
with WarpedVRT(
    src,
    src_nodata=None,
    crs="EPSG:3857",
    width=480,
    height=480,
    transform=transform1,
    add_alpha=True,
) as vrt:
    alpha = vrt.read(4, out_shape=(512, 512), window=vrt.window(*bounds1))
    print(alpha)
    assert alpha[0][-1] == 0

# Zoom 20 tile: same output shape, twice the extent, so the read is decimated
with WarpedVRT(
    src,
    src_nodata=None,
    crs="EPSG:3857",
    width=960,
    height=960,
    transform=transform2,
    add_alpha=True,
) as vrt:
    alpha = vrt.read(4, out_shape=(512, 512), window=vrt.window(*bounds2))
    print(alpha)
    assert alpha[0][-1] == 0
|
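For readers wondering where the `bounds1` and `bounds2` values come from: they appear to be the EPSG:3857 bounds of the XYZ tiles named in the comments (21/1251419/852238 and 20/625709/426119). The sketch below recomputes them in plain Python, assuming the usual XYZ scheme with the origin at the top-left of the Web Mercator square; `tile_bounds` is a hypothetical helper, not part of rasterio.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in metres.
WEB_MERCATOR_RADIUS = 6378137.0
# Half the width of the Web Mercator square.
ORIGIN_SHIFT = math.pi * WEB_MERCATOR_RADIUS


def tile_bounds(z, x, y):
    """Return (left, bottom, right, top) of XYZ tile z/x/y in EPSG:3857 metres."""
    size = 2 * ORIGIN_SHIFT / 2 ** z  # edge length of one tile at zoom z
    left = -ORIGIN_SHIFT + x * size
    top = ORIGIN_SHIFT - y * size  # y counts down from the top edge
    return (left, top - size, left + size, top)


bounds_z21 = tile_bounds(21, 1251419, 852238)
bounds_z20 = tile_bounds(20, 625709, 426119)
print(bounds_z21)
print(bounds_z20)
```

The zoom 20 tile is the parent of the zoom 21 tile, which is why the two bounds share their top and right edges.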
@mojodna I'm pretty sure it's a conflict because your file has internal nodata value
I ran your file through rio-cogeo and the result looks 👌 |
I'll see if I can track down the original to figure out where that came from. |
Oh, and I'm explicitly stating that |
|
@vincentsarago can you share the generated COG? I'd like to see how it's handling masking. |
@mojodna https://s3-us-west-2.amazonaws.com/remotepixel-pub/cog/internal-mask_cogeo.tif created it just by doing |
Hmm. According to According to rasterio says: https://github.com/mapbox/rasterio/blob/6e1b93d7089c82301dba73e0337a52ba8a44ba6c/rasterio/_base.pyx#L458-L463 But GDAL is apparently doing something else under the hood under these circumstances. |
@mojodna is it correct to say that at zoom level 21, a proper alpha channel is sourced from the internal bitmask, but at zoom 20, the alpha channel is not sourced from the internal bitmask? What is the native resolution of your source image? Does it have overviews? @rouault does this sound like a known (or unknown) GDAL bug? |
I've tried
and the result looks good. |
Correct. I'm treating zoom 21 (w/ 512x512px output) as "native resolution". Native resolution is ~3.3cm (UTM). Image overviews and mask overviews, both.
@vincentsarago |
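A rough way to see why only the zoom 20 read involves overviews is to compare how many VRT pixels each window covers with the 512 pixels requested. This is a simplified sketch, not GDAL's actual overview selection logic; it assumes the VRT's 0.04 m pixel is close to the source's native resolution, as described above, and `decimation_factor` is a hypothetical helper.

```python
def decimation_factor(bounds, pixel_size, out_width):
    """Ratio of VRT pixels covered by `bounds` to pixels requested in the output.

    Values above 1 mean the read is downsampled; values below 1 mean upsampling.
    """
    left, _bottom, right, _top = bounds
    window_width_px = (right - left) / pixel_size
    return window_width_px / out_width


# Bounds copied from the reproduction script in this thread.
bounds1 = (3876179.0321125016, 3751854.20560666, 3876198.141369573, 3751873.314863731)
bounds2 = (3876159.9228554303, 3751835.0963495895, 3876198.141369573, 3751873.314863731)

factor_z21 = decimation_factor(bounds1, 0.04, 512)
factor_z20 = decimation_factor(bounds2, 0.04, 512)
print(f"zoom 21: {factor_z21:.2f}, zoom 20: {factor_z20:.2f}")
```

The zoom 21 factor comes out slightly below 1 (a mild upsample), while the zoom 20 factor is a little under 2, which is consistent with the earlier remark that overviews come into play when the read resolution is reduced by roughly a factor of two.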
@rouault try |
OK, I could indeed reproduce the issue with |
@rouault thanks for the diagnosis! 🙏 |
@rouault awesome, thanks. I'm still puzzling through how that might happen; the procedure is here: https://github.com/mojodna/marblecutter-tools/blob/46fed51b65f0f4cb2876cdb01cb214c0575ca64b/bin/transcode.sh#L146-L199 The step that loses the correct mask values is the 2nd |
|
In marblecutter, I've been using external masks with overviews and treating them as normal sources in order to take advantage of the overviews:
https://github.com/mojodna/marblecutter/blob/a4f66cbf7124f25cf15bec8bc0e37ef195119de6/marblecutter/__init__.py#L263-L327
I was thinking that #1361 would eliminate the need for this, but it turns out that the main reason that I'd done that in the first place was that overviews aren't used by read_masks (and when masked=True). (This can be observed both by watching a clock (masked reads are noticeably slower) and by enabling CPL_CURL_VERBOSE when reading remote sources.)