Support multi-band COGs #62
Comments
You could probably parse both |
@gjoseph92 implementation-wise, do you have a sense for which of these would be preferable?
If I had to guess, 1 sounds a bit easier to shoehorn into the current setup, but I'm not confident in that. |
Thanks for opening this @TomAugspurger. It's true that 1 might be easier, but I think we should go with some form of 2 for a few reasons.

I had always intended Readers to support multi-band; I just cut that corner to get the first release out.

Depending on the file format, it may not be possible to read the individual bands within an asset separately. For example, by default GeoTIFFs are pixel-interleaved. Therefore, we really don't want to have a separate "virtual" asset per band, because then fetching any of R, G, or B would require reading RGB, and trying to fetch R, G, and B would triple-read the data (GDAL block-caching aside; we don't want to rely on that too much) and open triple datasets, which will then be further copied per thread!

So basically, we need dask's chunking along the bands to match up with how the data is actually physically chunked. In the case of Sentinel-2, if we asked stackstac for

Of course, non-GeoTIFF formats (or band-interleaved GeoTIFFs) might support efficient sub-selection by band, but since stackstac primarily cares about STAC best practices right now, and GeoTIFF is STAC best practice, I think it'd be reasonable to start by assuming assets don't support efficient band sub-selection. So we'll say we always have one chunk per asset, and the length of that chunk equals the number of bands in that asset. Then, I think we'd do something like:
|
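The chunking rule stated in the comment above (one chunk per asset along the band axis, with chunk length equal to that asset's band count) can be illustrated with a small pure-Python sketch. This is not stackstac code and not the plan proposed in that comment; the helper names `band_chunks` and `locate_band` are hypothetical, and the band counts are assumed to be known already.

```python
def band_chunks(bands_per_asset):
    """Chunk sizes along the band axis: one chunk per asset.

    `bands_per_asset` is a sequence such as [3, 1] for an RGB asset
    followed by a single-band asset (an assumed input, see lead-in).
    """
    return tuple(bands_per_asset)


def locate_band(bands_per_asset, band_index):
    """Map a global band index to (asset position, 0-based band in that asset)."""
    if band_index < 0:
        raise IndexError("band_index must be non-negative")
    offset = 0
    for asset_pos, n_bands in enumerate(bands_per_asset):
        if band_index < offset + n_bands:
            return asset_pos, band_index - offset
        offset += n_bands
    raise IndexError(f"band_index {band_index} is out of range")


# An RGB asset (3 bands) followed by a single-band asset.
chunks = band_chunks([3, 1])
where = locate_band([3, 1], 3)  # first band of the second asset
```

Because every band of a pixel-interleaved asset lives in the same chunk, asking for R, G, and B maps to a single read of that asset rather than three.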
Are there any new developments concerning this? I wonder if there is a workaround that allows us to work with multiband data (apart from creating single-band copies). Or can we simply input single-band VRTs pointing to the multiband data, without losing too much performance? |
Unless @TomAugspurger has been working on it, there are no updates. Updating the dask and rasterio side of things to handle multi-band assets is comparatively straightforward. The main thing I haven't wanted to deal with is parsing the various places in STAC where the number of bands per asset can be defined. There isn't a workaround for this that I know of. If you want to take a stab at it yourself, contributions are always welcome! |
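To give a sense of the parsing problem mentioned above, here is a minimal sketch that looks for a band count on a single STAC asset dictionary. It checks the asset-level `raster:bands` and `eo:bands` lists from the STAC raster and EO extensions. Which fields stackstac would actually consult, and in what order, is an assumption here; the function name `count_asset_bands` is hypothetical, and band information declared elsewhere (for example on the item or collection) is not handled.

```python
def count_asset_bands(asset, default=1):
    """Return the band count declared on a STAC asset dict, else `default`.

    Only asset-level `raster:bands` and `eo:bands` are checked; the
    precedence between them is an assumption made for this sketch.
    """
    for key in ("raster:bands", "eo:bands"):
        bands = asset.get(key)
        if isinstance(bands, list) and bands:
            return len(bands)
    return default


# Hypothetical asset with three bands declared through the EO extension.
rgb_asset = {
    "href": "https://example.com/rgb.tif",
    "eo:bands": [{"name": "red"}, {"name": "green"}, {"name": "blue"}],
}
n_bands = count_asset_bands(rgb_asset)
```

Falling back to a default of one band mirrors the single-band behavior described in this thread, but a real implementation might prefer to open the file when no metadata is present.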
@gjoseph92 thank you for your answer. I tried it out: working with single-band VRTs pointing to multiband COGs worked well for me, allowing me to use stackstac in the intended way, i.e. with single-band raster inputs. It also seems to have no impact on processing speed (judging from a small, limited comparative experiment). |
@scotjohn mind sharing an example of how you set up the VRTs? Did you rewrite the STAC and create your own asset files, which were actually VRTs pointing to the original assets? |
I haven’t done any work on this. I’d also be curious to see an example
using VRTs.
|
Yes, it is as you described. I work with 4-band PlanetFusion data that are available as multiband COGs. For each COG I create four VRTs, i.e. one per band, which become the assets. I have written the STAC from scratch. So I am not sure how this fits into the problem of existing STACs and public data catalogs. |
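To generate VRTs from COGs:

from pathlib import Path

from osgeo import gdal


def create_bandwise_vrt(in_file: Path, out_dir: Path, n_bands: int):
    # Write one single-band VRT per band of the multi-band COG.
    for i in range(1, n_bands + 1):
        out_vrt = Path(out_dir, in_file.stem + f"_B{i:02}.vrt")
        in_files = [str(in_file)]
        # bandList selects which source band the VRT exposes (1-based).
        vrt_options = gdal.BuildVRTOptions(
            bandList=[i]
        )
        gdal.BuildVRT(str(out_vrt), in_files, options=vrt_options) |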
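As a follow-up to the workaround above, where each per-band VRT becomes its own asset, the sketch below builds an asset mapping that follows the same `<stem>_B01.vrt` naming scheme. It does not call GDAL and does not write a full STAC item. The function name `bandwise_vrt_assets`, the asset keys, and the `roles` value are assumptions for illustration; the hand-written STAC described in the thread may look different.

```python
from pathlib import Path


def bandwise_vrt_assets(in_file: Path, out_dir: Path, n_bands: int) -> dict:
    """Build one asset entry per band, pointing at the per-band VRT files.

    File names follow the scheme used by create_bandwise_vrt above.
    Asset keys ("B01", "B02", ...) are a hypothetical choice.
    """
    assets = {}
    for i in range(1, n_bands + 1):
        key = f"B{i:02}"
        vrt_path = Path(out_dir, f"{in_file.stem}_{key}.vrt")
        assets[key] = {"href": str(vrt_path), "roles": ["data"]}
    return assets


# A 4-band COG yields four single-band assets.
assets = bandwise_vrt_assets(Path("scene.tif"), Path("vrts"), 4)
```

Each entry then looks like an ordinary single-band asset, which is the input stackstac currently expects.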
Hi @gjoseph92 , |
@julianblue my current job doesn't allocate time for me to work on stackstac, so there's no timeline for this happening. I'm open to reviewing PRs though. |
This is mentioned in the README, but I thought I'd open an issue that I can link to.
Currently, multi-band COGs are not supported by stackstac.
That COG has 4 bands, so the shape should be (1, 4, 12240, 11040).
I started a branch, but haven't had a chance to finish it off. I'll post here if / when I pick it up again.
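For the shape quoted above, the arithmetic can be written out as a small sketch. It assumes the (time, band, y, x) dimension order that stackstac uses for its output and that the band axis is the concatenation of every asset's bands; `expected_shape` is a hypothetical helper, not part of stackstac.

```python
def expected_shape(n_items, bands_per_asset, height, width):
    """Shape in (time, band, y, x) order when all bands of each asset are kept."""
    return (n_items, sum(bands_per_asset), height, width)


# One item with a single 4-band asset of 12240 x 11040 pixels.
shape = expected_shape(1, [4], 12240, 11040)
```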