Notes on CMIP6 Zarr work

How do we decide on the chunk size?

We tested object sizes in our object store (Caringo), and the results suggested that 100 MB to 1 GB was the optimum range. We have therefore set 250 MB as our target value. Depending on the array shape, our chunks should come in at around 250 MB.
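As a rough illustration of how a 250 MB target could translate into a chunk shape, here is a minimal sketch. It assumes the array is chunked along its first (time) axis only, that the remaining axes are kept whole, and that values are 4-byte floats. The function name `time_chunk_length` and these choices are hypothetical, not necessarily what the package does.

```python
import math

# Target from the notes above; decimal megabytes are an assumption.
TARGET_CHUNK_BYTES = 250 * 1000**2


def time_chunk_length(shape, itemsize=4, target_bytes=TARGET_CHUNK_BYTES):
    """Return how many time steps to put in one chunk.

    Assumes axis 0 is time and every other axis is stored whole,
    so the chunk size grows linearly with the number of time steps.
    """
    n_time = shape[0]
    # Bytes needed to store a single time step of the array.
    bytes_per_step = itemsize * math.prod(shape[1:])
    # Nearest whole number of steps to the target, at least one step.
    steps = max(1, round(target_bytes / bytes_per_step))
    # Never ask for more steps than the array actually has.
    return min(n_time, steps)


if __name__ == "__main__":
    # Hypothetical daily variable: 60225 time steps on a 144 x 192 grid.
    shape = (60225, 144, 192)
    steps = time_chunk_length(shape)
    chunk_mb = steps * 4 * 144 * 192 / 1000**2
    print(f"chunks of {steps} time steps, about {chunk_mb:.0f} MB each")
```

For a short array the whole time axis fits in a single chunk that is smaller than the target, which is why the actual chunk size depends on the array shape.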

Which data frequencies have we covered?

So far, we have the following in our store:

AERday, Amon, CFday, day, Eday, LImon, Lmon, Oday, Omon, Primday

Why develop a package rather than a notebook application?

I see this as a batch-processing task: I want to be able to say "run everything" and let the task manage itself. That is very hard to achieve in practice, so in reality I need lots of ways to catch failures.

These include:

Overall, in my view, this kind of large-scale processing does not fit well with notebooks. I would rather produce a notebook to interact with the Zarr store, on the user side:

https://github.com/cedadev/cmip6-object-store/blob/master/notebooks/cmip6-zarr-jasmin.ipynb