This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

to_zarr() overwrites by default #274

Closed
nicrie opened this issue Nov 8, 2023 · 3 comments · Fixed by #275
Labels
IO Representation of particular file formats as trees

Comments

nicrie commented Nov 8, 2023

Currently, the to_zarr() method's mode parameter defaults to "w", which is (imo) not ideal.

Picture this: an optimistic user (who's totally not me) thinks, "Hey, to_zarr(./path/to/everything/I/hold/dear/) will just add my zarr data to my collection of digital life achievements." Instead, it ruthlessly purges all within the specified directory, leaving nothing but the zarr dataset in its wake.

Suggestion: Switch the default to mode="w-". Yes, one could still choose to unleash chaos upon their files, but it should be a conscious choice - not an "oops" moment. ;)
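To make the difference between the two modes concrete, here is a minimal stdlib-only sketch. It is not datatree's, xarray's, or zarr's actual code: `prepare_store_dir` and `demo` are hypothetical names, and the sketch only assumes the behaviour described in this issue, namely that "w" clears whatever is at the target path while "w-" refuses to write if the path already exists.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_store_dir(path, mode="w-"):
    """Hypothetical helper: get `path` ready for a directory-backed store."""
    path = Path(path)
    if mode not in ("w", "w-"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if path.exists():
        if mode == "w-":
            # Exclusive create: refuse to touch anything that is already there.
            raise FileExistsError(f"{path} exists; pass mode='w' to overwrite it")
        # mode == "w": remove whatever is at `path`, recursively for directories.
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    path.mkdir(parents=True)
    return path


def demo():
    """Run both modes against a directory that already holds a file."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "dear"
        target.mkdir()
        keepsake = target / "thesis.txt"
        keepsake.write_text("years of work")

        try:
            prepare_store_dir(target, mode="w-")
            refused = False
        except FileExistsError:
            refused = True
        survived_exclusive = keepsake.exists()

        prepare_store_dir(target, mode="w")
        survived_overwrite = keepsake.exists()
        return refused, survived_exclusive, survived_overwrite
```

With "w-" as the default, the accidental call fails loudly and the existing file stays put; deleting it requires typing mode="w" explicitly.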

Contributor

slevang commented Nov 8, 2023

Will second this!

Context: I just implemented a method for serializing models in xeofs, which involved a lot of complex nested structures, for which datatree and zarr were perfect tools. Thank you for the great package!

However, I was very surprised to realize that datatree breaks with xarray on the default zarr write mode. I don't see any obvious reason for it, and it may not even be intentional. to_zarr(mode="w") is potentially much more destructive than to_netcdf(mode="w"), because the former can rm -r an entire directory, whereas the latter can only remove a single file. Hence xarray's safer default of to_zarr(mode="w-").
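The gap in blast radius between a single-file target and a directory target can be sketched as follows. This is a stdlib-only illustration, not xarray or datatree code; `files_at_risk` and `compare_targets` are hypothetical names, and the files are plain-text stand-ins rather than real netCDF or image data.

```python
import tempfile
from pathlib import Path


def files_at_risk(target):
    """Hypothetical helper: files that overwriting `target` would delete."""
    target = Path(target)
    if target.is_file():
        # A single-file format can only clobber that one file.
        return [target]
    if target.is_dir():
        # A directory-backed store that is wiped takes everything below it.
        return sorted(p for p in target.rglob("*") if p.is_file())
    return []


def compare_targets():
    """Count the files at risk for a file target and for its parent directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "notes").mkdir()
        (root / "notes" / "todo.txt").write_text("unrelated notes")
        (root / "model.nc").write_text("stand-in for a netCDF file")
        (root / "photo.jpg").write_text("stand-in for an image")
        single_file = files_at_risk(root / "model.nc")
        whole_dir = files_at_risk(root)
        return len(single_file), len(whole_dir)
```

Pointing an overwriting write at the file puts one file at risk; pointing it at the directory puts every file underneath it at risk, including ones that have nothing to do with the dataset.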

@TomNicholas
Member

> Instead, it ruthlessly purges all within the specified directory, leaving nothing but the zarr dataset in its wake.

I'm sorry that happened, @nicrie!

> it may not even be intentional

I don't remember making an active choice about this myself, and the proposed change in #275 seems fine to me. @jhamman is there any subtlety here I'm missing?

TomNicholas added the IO Representation of particular file formats as trees label Nov 9, 2023
Author

nicrie commented Nov 9, 2023

No problem, @TomNicholas, nothing much happened! Thanks for the cool package, by the way! :)
