v0.5.0 Array overhaul #13
Merged
- Allow integer indexing to eliminate dimensions (see the sketch after this list).
- The `MetaData` class now needs to take a `shape` argument to unambiguously determine the number of physical/channel dimensions.
- After discussions with Caroline, `lazy_op` seems more descriptive, especially since we allow almost arbitrary lazy operations such as adding 1, thresholding, etc., as well as slicing. Renamed everything related to adapters.
- Always use "/" as a delimiter. Always trim offset/voxel_size/units to drop values that line up with channel dimensions.
- Not needed.
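
For example, a minimal sketch of the integer-indexing behaviour mentioned in the first item (the store path and shapes are hypothetical):

```python
from funlib.persistence import open_ds

# Open an array with one channel dimension and three physical
# dimensions, e.g. shape (3, 300, 300, 300) (path is hypothetical).
array = open_ds("path/to/data.zarr/array")

# Integer indexing eliminates the indexed dimension, as in numpy:
# the result corresponds to shape (300, 300, 300).
first_channel = array[0]
```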
The main changes:
`funlib.persistence.Array` now supports lazy operations. You can open an array and then slice the first 5 time steps (assuming your data has time in the first channel dimension). You can also apply functions such as `thresholded_array = open_ds("path/to/data.zarr/array").adapt(lambda x: x > 0.5)`, which will lazily apply the function and will appropriately update `thresholded_array.dtype`, so `assert thresholded_array.dtype == bool` should pass. You can write to the array if you only use slicing operations, but once you apply a function to your data it will no longer be writable. Arrays are now backed by `dask`, so our support extends to, but is also limited by, the lazy slicing and processing that `dask` supports.
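
For illustration, a minimal sketch of the lazy thresholding described above (the store path is hypothetical):

```python
from funlib.persistence import open_ds

# Open an existing zarr array (path is hypothetical).
array = open_ds("path/to/data.zarr/array")

# Lazily threshold the data; the dtype of the view is updated
# immediately, but the computation only happens on read.
thresholded_array = array.adapt(lambda x: x > 0.5)
assert thresholded_array.dtype == bool

# Purely sliced views stay writable; once a function like the
# threshold above is applied, the result is read-only.
```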
`open_ds` and `prepare_ds` take a single store argument. This is passed directly to `zarr.open`, so we both expand our support to anything zarr supports (zipped stores, cloud stores, etc.) and limit ourselves (no more hdf5, etc.). Note that this limitation only applies to the convenience functions `open_ds` and `prepare_ds`, which come with expectations on the data format and metadata format. `Array` will still work with any array-like object that can be converted to a `dask.Array` with `dask.from_array`. If your data does not match our priors, we recommend writing custom `open_ds` and `prepare_ds` alternatives.
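
As a rough sketch, anything `zarr.open` accepts should work as the store argument (the paths below are made up, and cloud URLs additionally require the matching fsspec backend):

```python
from funlib.persistence import open_ds

# A plain directory store on disk.
local_array = open_ds("path/to/data.zarr/array")

# A zipped store; zarr resolves this via its ZipStore support.
zipped_array = open_ds("path/to/data.zip")

# A cloud store, assuming e.g. s3fs is installed.
cloud_array = open_ds("s3://my-bucket/data.zarr/array")
```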
We no longer use `total_roi` and `num_channels` when using `prepare_ds` or directly calling `Array`. We now just pass `offset` (in units defined by the "units" attribute) and `shape` (in voxels). This means we now support any number of channel dimensions, i.e. you can do `prepare_ds(..., offset=(100, 200, 300), shape=(3, 3, 300, 300, 300))` to get 2 channel dimensions and 3 physical dimensions, which previously wouldn't have been straightforward.
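
A hedged sketch of creating such an array (keyword names other than `offset` and `shape`, such as `voxel_size`, `units`, and `dtype`, are assumptions about the new `prepare_ds` signature):

```python
import numpy as np
from funlib.persistence import prepare_ds

# 2 channel dimensions (3, 3) followed by 3 physical dimensions
# (300, 300, 300); offset, voxel_size, and units describe only the
# physical dimensions.
array = prepare_ds(
    "path/to/data.zarr/array",
    shape=(3, 3, 300, 300, 300),
    offset=(100, 200, 300),
    voxel_size=(1, 1, 1),
    units=["nm", "nm", "nm"],
    dtype=np.float32,
)
```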
Metadata handling for `axis_names`, `units`, `voxel_size`, and `offset` has been reworked. I have separated out a metadata class and a metadata parsing class that can be modified to cover a fairly large variety of simple metadata schemes, and added some reasonable defaults, so this metadata will always be present, or errors will be thrown if the metadata is contradictory. If your metadata requires special parsing (e.g. you store your metadata on the multiscale group instead of directly on the array you are opening), it is easy to pass in metadata fields to skip the automatic parsing, so you can write your own thin wrapper for your specific data.
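
For example, a hedged sketch of skipping the automatic parsing by passing the metadata fields explicitly (assuming `open_ds` accepts them as keyword arguments; the values are made up):

```python
from funlib.persistence import open_ds

# Supply the metadata directly instead of parsing it from the
# attributes stored with the array (useful when the metadata lives
# elsewhere, e.g. on a multiscale group).
array = open_ds(
    "path/to/data.zarr/array",
    offset=(0, 0, 0),
    voxel_size=(4, 4, 40),
    axis_names=["z", "y", "x"],
    units=["nm", "nm", "nm"],
)
```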
,"funlib_persistence.toml"
,Path.home() / ".config/funlib_persistence/funlib_persistence.toml"
,"/etc/funlib_persistence/funlib_persistence.toml"
for configs. The attributes that can be provided arevoxel_size_attr
,axis_names_attr
,units_attr
, andoffset_attr
. Whatever attributes you provide will be used for both reading and writing metadata. You can also override the default metadata in each python script viafunlib.persistence.arrays.metadata.set_default_metadata_format(...)
.
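
As a sketch of the per-script override (assuming a `MetaDataFormat` class in the same module carries the attribute names; the class name and argument form are assumptions):

```python
from funlib.persistence.arrays.metadata import (
    MetaDataFormat,  # assumed container for the *_attr names
    set_default_metadata_format,
)

# Read and write the voxel size under "resolution" instead of the
# default attribute name; other attributes keep their defaults.
set_default_metadata_format(MetaDataFormat(voxel_size_attr="resolution"))
```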