Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support of alternative array classes: ndarray-like #305

Merged
merged 12 commits into from
Mar 15, 2022

Conversation

madsbk
Copy link
Contributor

@madsbk madsbk commented Feb 9, 2022

This PR implements support of alternative array types (other than NumPy arrays) by introducing the concept ndarray-like.

The idea is to differentiate between the need of a numpy.ndarray and an array that behaves like a NumPy array, such as a CuPy array. Calling the old compat functions will still return regular NumPy arrays but two new compat functions now accept any NumPy array-like arrays:

# Ensure `buf` is a NumPy array
def ensure_ndarray(buf) -> np.ndarray: ...
def ensure_contiguous_ndarray(buf, max_buffer_size=None) -> np.array: ...

# Ensure `buf` is a NumPy array-like
def ensure_ndarray_like(buf) -> NDArrayLike:
def ensure_contiguous_ndarray_like(buf, max_buffer_size=None) -> NDArrayLike:

NumPy array like

I suggest that we use Python3.8's typing.Protocol to define what we consider ndarray-like:

@runtime_checkable
class DType(Protocol):
    itemsize: int
    name: str
    kind: str

@runtime_checkable
class FlagsObj(Protocol):
    c_contiguous: bool
    f_contiguous: bool
    owndata: bool

@runtime_checkable
class NDArrayLike(Protocol):
    dtype: DType
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]
    ndim: int
    size: int
    itemsize: int
    nbytes: int
    flags: FlagsObj
    def __len__(self) -> int: ...
    def __getitem__(self, key) -> Any: ...
    def __setitem__(self, key, value): ...
    def tobytes(self, order: Optional[str] = ...) -> bytes: ...
    def reshape(self, *shape: int, order: str = ...) -> "NDArrayLike": ...
    def view(self, dtype: DType = ...) -> "NDArrayLike": ...

This requires Python v3.8. If this is a problem, we can use the typing_extensions module as a fallback:

if sys.version_info >= (3, 8):
    from typing import Protocol, runtime_checkable
else:
    from typing_extensions import Protocol, runtime_checkable

Motivation

This is the first step towards support of CuPy arrays in Zarr.

TODO:

  • Closes WIP: Allow CuPy #212 and Coerce other data to bytes in ensure_text #213
  • Unit tests and/or doctests in docstrings
  • tox -e py39 passes locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • tox -e docs passes locally
  • GitHub Actions CI passes
  • Test coverage to 100% (Coveralls passes)

cc. @jakirkham

@madsbk madsbk changed the title Support of alternative array classes: _ndarray-like_ Support of alternative array classes: ndarray-like Feb 10, 2022
@madsbk madsbk marked this pull request as ready for review February 10, 2022 11:15
@jakirkham
Copy link
Member

@joshmoore do you have thoughts here? 🙂

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond one minor question on my part (and @jakirkham's about installing typing_extensions, it looks good. The plan here would be to try to consume it from zarr-developers/zarr-python#934?

numcodecs/compat.py Show resolved Hide resolved
@rabernat
Copy link
Contributor

I don't have any thing to add to the review, but just wanted to chime in to say I think this is a great feature. Supporting different array-like things is really important for future interoperability of zarr / numcodecs.

👏

@madsbk
Copy link
Contributor Author

madsbk commented Feb 23, 2022

Anything else we need in this PR? Else, I think it is ready to be merged.

@madsbk
Copy link
Contributor Author

madsbk commented Feb 25, 2022

@joshmoore, @jakirkham anything else we need to do here? I don't think the CI error relates to this PR.

@jakirkham
Copy link
Member

Regarding CI. Fixing in PR ( #311 )

@jakirkham
Copy link
Member

Thanks for working on this Mads! 😄 Also appreciate Ryan and Josh taking the time to look.

Would you be able to fix up the merge conflicts, Mads? 🙂

Generally LGTM. Though would be great if someone other than me merged (assuming people have no further questions)

@madsbk
Copy link
Contributor Author

madsbk commented Mar 2, 2022

Would you be able to fix up the merge conflicts, Mads? slightly_smiling_face

Sure, it is merged :)

@jakirkham
Copy link
Member

Thanks Mads! 😀

@joshmoore or @rabernat if you are comfortable with the changes here, please feel free to merge (if not please let us know what is still needed)? Thanks!

madsbk added a commit to madsbk/zarr-python that referenced this pull request Mar 4, 2022
madsbk added a commit to madsbk/zarr-python that referenced this pull request Mar 4, 2022
@joshmoore
Copy link
Member

Nothing else from my side. Thoughts on getting this out as either 0.10 or is there a preference for another pre-release?

(@MSanKeys963 would likely also ask if a blog post is in order :) If so, text blurbs welcome)

@MSanKeys963
Copy link
Member

(@MSanKeys963 would likely also ask if a blog post is in order :) If so, text blurbs welcome)

Happy to write a blog post on this as this focuses on interoperability.

@madsbk would you like to help me with the blog post? If yes, I can post a few questions and you can answer them.

@madsbk
Copy link
Contributor Author

madsbk commented Mar 8, 2022

(@MSanKeys963 would likely also ask if a blog post is in order :) If so, text blurbs welcome)

Happy to write a blog post on this as this focuses on interoperability.

@madsbk would you like to help me with the blog post? If yes, I can post a few questions and you can answer them.

Sure, that sounds great!

@joshmoore
Copy link
Member

👍 all around. Any thoughts on the release style?

@madsbk
Copy link
Contributor Author

madsbk commented Mar 11, 2022

+1 all around. Any thoughts on the release style?

I think including it in the v0.10 release is good but can we merge it now? It will help me finishing zarr-developers/zarr-python#934.

@jakirkham
Copy link
Member

jakirkham commented Mar 11, 2022

Ok with either release style. No strong preferences. This is more internally facing (though would be used by Zarr too) and the same APIs Zarr uses are present. So would expect things to still work

Edit: Also sorry for being somewhat quiet here, was out sick most of the week.

@joshmoore joshmoore merged commit bedb8b0 into zarr-developers:master Mar 15, 2022
@jakirkham
Copy link
Member

Thanks all! 😄

@madsbk
Copy link
Contributor Author

madsbk commented Mar 15, 2022

Thanks!

@madsbk madsbk deleted the ndarray_like branch March 15, 2022 15:17
madsbk added a commit to madsbk/zarr-python that referenced this pull request Mar 15, 2022
@joshmoore
Copy link
Member

Building now as v0.10.0.a3

@jakirkham
Copy link
Member

Thanks Josh! 😄

@joshmoore joshmoore mentioned this pull request Apr 13, 2022
7 tasks
joshmoore added a commit to zarr-developers/zarr-python that referenced this pull request Sep 8, 2022
* Implement CuPyCPUCompressor and the meta_array argument

This is base of
<https://github.com/jakirkham/zarr-python/tree/use_array_like>

Co-authored-by: John Kirkham <jakirkham@gmail.com>

* Adding meta_array to open_group()

* CuPyCPUCompressor: clean up and doc

* clean up

* flake8

* mypy

* Use KVStore when checking for in-memory data

Checking against MutableMapping categories all BaseStores as in-memory
stores.

* group: the meta_array argument is now used for new arrays

* flake8

* Use empty_like instead of empty

Co-authored-by: jakirkham <jakirkham@gmail.com>

* More use of NumPy's *_like API

Co-authored-by: jakirkham <jakirkham@gmail.com>

* Assume that array-like objects that doesn't have a `writeable` flag is writable.

* _meta_array: use shape=()

Co-authored-by: jakirkham <jakirkham@gmail.com>

* use ensure_ndarray_like() and ensure_contiguous_ndarray_like()

* CI: use zarr-developers/numcodecs#305

* Removed unused code

Co-authored-by: Tobias Kölling <tobi@die70.de>

* CI: changed minimal NumPy version to v1.20

* CI: use numpy>=1.21.* for `mypy` check

* Revert "CI: use zarr-developers/numcodecs#305"

This reverts commit 9976f06.

* fix merge mistake

* CI: remove manual numpy install

* pickle: use kwargs

* moved CuPyCPUCompressor to the test suite

* doc-meta_array: changed to versionadded:: 2.13

* test_cupy: assert meta_array

* test_cupy: test when CuPy isn't available

* renamed: test_cupy.py -> test_meta_array.py

* removed ensure_cls()

* Added "# pragma: no cover" to the CuPyCPUCompressor test class

Co-authored-by: John Kirkham <jakirkham@gmail.com>
Co-authored-by: Josh Moore <j.a.moore@dundee.ac.uk>
Co-authored-by: Tobias Kölling <tobi@die70.de>
Co-authored-by: Gregory Lee <grlee77@gmail.com>
enthusiastdev121 added a commit to enthusiastdev121/zarr-python that referenced this pull request Aug 19, 2024
* Implement CuPyCPUCompressor and the meta_array argument

This is base of
<https://github.com/jakirkham/zarr-python/tree/use_array_like>

Co-authored-by: John Kirkham <jakirkham@gmail.com>

* Adding meta_array to open_group()

* CuPyCPUCompressor: clean up and doc

* clean up

* flake8

* mypy

* Use KVStore when checking for in-memory data

Checking against MutableMapping categories all BaseStores as in-memory
stores.

* group: the meta_array argument is now used for new arrays

* flake8

* Use empty_like instead of empty

Co-authored-by: jakirkham <jakirkham@gmail.com>

* More use of NumPy's *_like API

Co-authored-by: jakirkham <jakirkham@gmail.com>

* Assume that array-like objects that doesn't have a `writeable` flag is writable.

* _meta_array: use shape=()

Co-authored-by: jakirkham <jakirkham@gmail.com>

* use ensure_ndarray_like() and ensure_contiguous_ndarray_like()

* CI: use zarr-developers/numcodecs#305

* Removed unused code

Co-authored-by: Tobias Kölling <tobi@die70.de>

* CI: changed minimal NumPy version to v1.20

* CI: use numpy>=1.21.* for `mypy` check

* Revert "CI: use zarr-developers/numcodecs#305"

This reverts commit 9976f067a42c99b37e2c1a57ef7377b0b34f9318.

* fix merge mistake

* CI: remove manual numpy install

* pickle: use kwargs

* moved CuPyCPUCompressor to the test suite

* doc-meta_array: changed to versionadded:: 2.13

* test_cupy: assert meta_array

* test_cupy: test when CuPy isn't available

* renamed: test_cupy.py -> test_meta_array.py

* removed ensure_cls()

* Added "# pragma: no cover" to the CuPyCPUCompressor test class

Co-authored-by: John Kirkham <jakirkham@gmail.com>
Co-authored-by: Josh Moore <j.a.moore@dundee.ac.uk>
Co-authored-by: Tobias Kölling <tobi@die70.de>
Co-authored-by: Gregory Lee <grlee77@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants