Option to write many rasters for chunked DataArray? #433

TomAugspurger · 2021-11-12T22:02:24Z

Currently, .rio.to_raster will generate a single raster, even for chunked DataArrays. In the case of very large Dask Arrays, it might be more useful to instead write many rasters, perhaps one per chunk. This would align better with, e.g., dask.DataFrame.to_csv, which writes one CSV file per partition.

This adds some complexity to how the actual filename is determined, but we can rely on some conventions established in dask / elsewhere to come up with something sensible.
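One way to picture the naming question is a minimal sketch, assuming a convention similar to the `*` placeholder that dask's `to_csv` replaces with the partition number. The helper `chunk_plan` and the `chunk-*.tif` template are hypothetical names for illustration, not part of rioxarray or dask. It takes chunk sizes in the same layout as `DataArray.chunks` and yields, for each chunk, its block index, the slices that select it, and a filename.

```python
from itertools import product


def chunk_plan(dims, chunks, template="chunk-*.tif"):
    """Yield (block_index, slices, filename) for every chunk.

    dims: dimension names, e.g. ("y", "x")
    chunks: per-dimension chunk sizes, same layout as DataArray.chunks
    template: hypothetical naming pattern; "*" is replaced by the block index
    """
    # Turn chunk sizes into (start, stop) offsets along each dimension.
    bounds = []
    for sizes in chunks:
        edges, start = [], 0
        for size in sizes:
            edges.append((start, start + size))
            start += size
        bounds.append(edges)

    # Visit every combination of block positions across the dimensions.
    for block in product(*(range(len(b)) for b in bounds)):
        slices = {
            dim: slice(*bounds[axis][i])
            for axis, (dim, i) in enumerate(zip(dims, block))
        }
        name = template.replace("*", "-".join(str(i) for i in block))
        yield block, slices, name


# A 4 x 4 array split into chunks of (2, 2) rows and (3, 1) columns.
plan = list(chunk_plan(("y", "x"), ((2, 2), (3, 1))))
```

Each entry could then presumably be written on its own, for example with `da.isel(slices).rio.to_raster(name)`, so that no two tasks touch the same file.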

https://discourse.pangeo.io/t/generating-cogs-and-stac-items-from-dataarrays/1913 has some more background information.

This is somewhat related to #432, since it would provide an alternative that doesn't need locks.


snowman2 · 2021-11-12T22:16:06Z

What about this: https://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_zarr.html

GDAL 3.4 added support for Zarr.

Or do you specifically need GeoTIFF?

TomAugspurger · 2021-11-12T22:29:40Z

In this case, specifically COGs for interoperability with that toolchain.

snowman2 · 2021-11-13T00:46:36Z

The xcog implementation looks pretty neat 👍.

My initial thoughts:

- Would be fun to call the multi-file COG output format czar or czarr 😄
- Having it as its own repo, like Zarr (https://github.com/zarr-developers/zarr-python), might be helpful for potential adoption of the format in other projects such as GDAL.
- Maybe stackstac would be interested in writing a dask xarray to a STAC dataset on disk?
- If this gets added: Writeable backends via entrypoints pydata/xarray#5954

Then xcog could be a backend:
```
xds.save_dataset(directory, backend="xcog")
```

TomAugspurger added the "proposal" (Idea for a new feature) label on Nov 12, 2021

snowman2 mentioned this issue on Dec 2, 2021: speeding up windowed writes #439 (Open)

scottyhq mentioned this issue on Apr 13, 2022: Writing GDAL ZARR _CRS attribute not possible pydata/xarray#6448 (Closed)

snowman2 added the "dask" (Dask related issue) label on Sep 20, 2022
