-
-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thinning down zarr-python
#1274
Comments
Yeah we have had similar discussions before |
Thanks for linking the previous discussions, @jakirkham. Some interesting points there. Also, we have PyData Global 2022 Sprints coming up next week so we're thinking to kick this off over there. |
The main thing to consider is currently we are testing these Stores pretty regularly (particularly with the general code here and dependency updates). Would be good to handle this somehow when refactoring (cron jobs on CI, running with all Store extensions installed here, etc.) |
Updates to the description:
To #1274 (comment), @jakirkham, I'd say with all the work on easing releases, we should really push for faster iterations, and then have dependabot test each of the refactor stores (daily?) |
If the interface between the Zarr core and the store layer is clearly and cleanly defined, it should in theory be sufficient to test each store thoroughly. If there are edge cases that only arise when testing a store against the full Zarr test suite, that might indicate leaky abstractions. (Partial chunk decompression comes to mind as a sketchy area.) In any case, we should probably be setting up more integration testing of the entire stack, including testing against real cloud storage. I would be happy to work on that. |
I think this is far more important than trimming the zarr-python codebase! Would it be simpler to move the not-so-used storage classes into a contrib/ directory, as a mark that they are not maintained by the core team? I definitely think we need to keep in all the stores that may be instantiated directly in the main open_* functions (DirStore, FSStore, ConsolidatedStore, ...?). |
Yeah we had a similar discussion about refactor contents out within Zarr-Python as different modules and adding code owners ( #764 ) Honestly agree this may be an easier first step to address the issue identified. Once complete this change may be useful for further library refactoring. It may also be easier for a new contributor to try than setting up a new repo On a different note perhaps we should also think about an entry point mechanism like what was done in Numcodecs ( zarr-developers/numcodecs#300 ). This may make it easier for developers wanting to add their own Store without needing changes in Zarr-Python |
I'd be happy seeing the n5 functionality pulled out into its own package. |
More info here: https://github.com/zarr-developers/cookiecutter-zarr-store |
see https://github.com/zarr-developers/n5py (private atm) for a start |
I've started taking steps to deprecate many of the more exotic stores. See #1756. |
Hi, I bumped into this when getting a deprecation warning. How can we use |
@JoOkuma - in v3, the nested directory store layout will be supported through the |
This came up during one of the conversations with @joshmoore. The idea is to extract functionalities not in V2 Spec and their respective tests from zarr-python. E.g. Zarr Specification V2 has the following stores:
But in
zarr-python
, we have additional stores like SQLiteStore, MongoDBStore etc. If we could remove these additional stores from thezarr-python
codebase and host them in a separate repo under zarr-developers, it'll help us:If a user/developer wishes to use the removed functionality, they can shoot
pip install zarr-sqlitestore
or similar.FYI: A similar repo is in the works for Numcodecs, which is taken care of by @joshmoore and @jakirkham.
Please let me know your thoughts on this. Thanks!
CC: @zarr-developers/python-core-devs
The text was updated successfully, but these errors were encountered: