Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entrypoints #300

Merged
merged 8 commits into from
Jan 26, 2022
Merged

Entrypoints #300

merged 8 commits into from
Jan 26, 2022

Conversation

martindurant
Copy link
Member

@martindurant martindurant commented Dec 28, 2021

Allows for packages to list codecs in entrypoints, so that the package in question doesn't need to be imported in before the codec can be usable.

Notes:

  • the long list of try import/register in numcodecs/__init__ could be avoided by this.
  • we'll need a way to list all the available codecs
  • How.should this be documented?

Fixes #290

TODO:

  • Unit tests and/or doctests in docstrings
  • tox -e py39 passes locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • tox -e docs passes locally
  • GitHub Actions CI passes
  • Test coverage to 100% (Coveralls passes)

@martindurant
Copy link
Member Author

martindurant commented Dec 29, 2021

A reasonable way to test here would be to include a minimal package and .egg into the repo, and add it to sys.path during a test, checking to see if the basic codec included is automatically retrievable. Is that overkill, should I just skip testing code that is only meant for external packages? It does work for the only codec I have (https://github.com/fsspec/kerchunk/blob/main/kerchunk/grib2.py#L162 with so-far unmerged commit fsspec/kerchunk@a704c90 ).

(done)

@joshmoore
Copy link
Member

Few quick thoughts:

  • This is great.
  • The implementation looks very sane to me.
  • 👍 for the sys.path-based unit test.
  • Having fsspec/kerchunk@a704c90 is great though I wonder:
    • if there's anyway we can have an "integration" test (perhaps by extracting one or more production codecs from here?)
    • and if we should get feedback from other producers (cc: @cgohlke)
  • Finally, is there any issue with order of priority? I assume in the pre-entrypoints world, the fact that the user asserted numcodecs.register_codec at the right point helped to make the right thing happen. Now it would be more an order of the sys.path ordering, no? A partial solution may be to prioritize the numcodec register (Define Registry #278)

@martindurant
Copy link
Member Author

Sorry, @cgohlke , I thought you were already on this thread, but that was, of course, for the linked issue.

if there's anyway we can have an "integration" test

I would rather do it with imagecodecs than kerchunk, the latter being more experimental and (for grib) needing cfgrib in the deps. Once imagecodecs has entrypoints, we can test against them.

is there any issue with order of priority

As things stand, entrypoints are being used as a backup, and anything registered using the existing path will be used first. If two entrypoints share the same name, the entrypoints module will choose one, indeed probably based on where they are in sys.path .

In Intake, we have warnings and clobber=bool for registering plugins, and also warn if entrypoints clash - but I don't think it's actually ended up being useful to do that. At most, I would suggest adding logging to the process, so a user can figure out what happened.

@joshmoore
Copy link
Member

I would rather do it with imagecodecs than kerchunk,

👍

I would suggest adding logging to the process, so a user can figure out what happened.

👍

@martindurant
Copy link
Member Author

@joshmoore , added a couple of basic logger statements. Integration testing can wait on imagecodecs.

@cgohlke
Copy link

cgohlke commented Jan 10, 2022

The imagecodecs package is still experimental/unstable. For now I published a separate package on PyPI with numcodecs entry points at imagecodecs-numcodecs:

import inspect
import pprint

from setuptools import setup

import imagecodecs
import imagecodecs.numcodecs

entry_points = [
    f'{cls.codec_id} = imagecodecs.numcodecs:{name}'
    for name, cls in inspect.getmembers(imagecodecs.numcodecs)
    if hasattr(cls, 'codec_id') and name != 'Codec'
]

pprint.pprint(entry_points)

setup(
    name='imagecodecs-numcodecs',
    version=imagecodecs.__version__,
    description='Numcodecs entry points for the imagecodecs package',
    # ...
    install_requires=['numcodecs', f'imagecodecs=={imagecodecs.__version__}'],
    entry_points={'numcodecs.codecs': entry_points},
)

Entry points:

[numcodecs.codecs]
imagecodecs_aec = imagecodecs.numcodecs:Aec
imagecodecs_avif = imagecodecs.numcodecs:Avif
imagecodecs_bitorder = imagecodecs.numcodecs:Bitorder
imagecodecs_bitshuffle = imagecodecs.numcodecs:Bitshuffle
imagecodecs_blosc = imagecodecs.numcodecs:Blosc
imagecodecs_blosc2 = imagecodecs.numcodecs:Blosc2
imagecodecs_brotli = imagecodecs.numcodecs:Brotli
imagecodecs_bz2 = imagecodecs.numcodecs:Bz2
imagecodecs_cms = imagecodecs.numcodecs:Cms
imagecodecs_deflate = imagecodecs.numcodecs:Deflate
imagecodecs_delta = imagecodecs.numcodecs:Delta
imagecodecs_float24 = imagecodecs.numcodecs:Float24
imagecodecs_floatpred = imagecodecs.numcodecs:FloatPred
imagecodecs_gif = imagecodecs.numcodecs:Gif
imagecodecs_jpeg = imagecodecs.numcodecs:Jpeg
imagecodecs_jpeg2k = imagecodecs.numcodecs:Jpeg2k
imagecodecs_jpegls = imagecodecs.numcodecs:JpegLs
imagecodecs_jpegxl = imagecodecs.numcodecs:JpegXl
imagecodecs_jpegxr = imagecodecs.numcodecs:JpegXr
imagecodecs_lerc = imagecodecs.numcodecs:Lerc
imagecodecs_ljpeg = imagecodecs.numcodecs:Ljpeg
imagecodecs_lz4 = imagecodecs.numcodecs:Lz4
imagecodecs_lz4f = imagecodecs.numcodecs:Lz4f
imagecodecs_lzf = imagecodecs.numcodecs:Lzf
imagecodecs_lzma = imagecodecs.numcodecs:Lzma
imagecodecs_lzw = imagecodecs.numcodecs:Lzw
imagecodecs_packbits = imagecodecs.numcodecs:PackBits
imagecodecs_pglz = imagecodecs.numcodecs:Pglz
imagecodecs_png = imagecodecs.numcodecs:Png
imagecodecs_rcomp = imagecodecs.numcodecs:Rcomp
imagecodecs_snappy = imagecodecs.numcodecs:Snappy
imagecodecs_spng = imagecodecs.numcodecs:Spng
imagecodecs_tiff = imagecodecs.numcodecs:Tiff
imagecodecs_webp = imagecodecs.numcodecs:Webp
imagecodecs_xor = imagecodecs.numcodecs:Xor
imagecodecs_zfp = imagecodecs.numcodecs:Zfp
imagecodecs_zlib = imagecodecs.numcodecs:Zlib
imagecodecs_zlibng = imagecodecs.numcodecs:Zlibng
imagecodecs_zopfli = imagecodecs.numcodecs:Zopfli
imagecodecs_zstd = imagecodecs.numcodecs:Zstd

Some codecs are read-only or may be non-functional depending on the binary distribution (raising delayed ImportError on use).

The codec names are prefixed with imagecodecs_. It might be necessary to version the codecs in case their API changes...

@martindurant
Copy link
Member Author

Interesting idea! So we can just include "imagecodecs-numcodecs" in requirements, and get both the codecs themselves and the entrypoints for auto import. I'm happy with that.
Any further changes we should do here, then?

@martindurant
Copy link
Member Author

re:

It might be necessary to version the codecs in case their API changes...

There is no provision for this anywhere in numcodecs. I suppose that if you were writing something that depended on a specific version of a codec, such as an intake catalog in a repo of its own, you can specify version requirements there.

@martindurant
Copy link
Member Author

ping to all

@joshmoore
Copy link
Member

Sorry, was off collecting other use cases. As long as there is nothing we need to consider about the more complex use cases like versions upfront, then I'm still for moving forward, though perhaps as a 0.10.0.a1.

@cgohlke: is there anything you would need from your side?

@jakirkham
Copy link
Member

cc @thomcom ( as this may be of interest for hooking in GPU compressors zarr-developers/community#19 (comment) )

@joshmoore
Copy link
Member

Considering that there are no pressing objections, and getting 0.10.0.a1 out for others to consume.

@jakirkham
Copy link
Member

Thanks Martin for the PR and everyone for the review! 😄

To one of the questions Martin raised in the OP about docs, filed issue ( #302 ) to discuss & track

The other question that occurred to me is what happens if an extension registers using an existing codec ID. Do we do anything about that currently? No worries if not. Just wanted to check. We can move this out to an issue if more discussion is needed 🙂

@martindurant martindurant deleted the entrypoints branch January 26, 2022 14:14
@martindurant
Copy link
Member Author

what happens if an extension registers using an existing codec ID

Programmatic registration takes precedence, entrypoints lookup is the fallback. So this should cause no surprises.

@jakirkham
Copy link
Member

Ok so 3rd party modules can't override existing extensions (at least not without a user doing something explicit)?

@martindurant
Copy link
Member Author

Ok so 3rd party modules can't override existing extensions (at least not without a user doing something explicit)?

Correct: here, we check the registry first, which will include any entries from register_codec, which is how numcodecs itself puts codecs into the registry, without checking whether this clobbers a codec name.

@jakirkham
Copy link
Member

Great, thanks Martin! 😄

joshmoore added a commit to joshmoore/numcodecs that referenced this pull request Jul 5, 2022
@joshmoore joshmoore mentioned this pull request Jul 5, 2022
7 tasks
joshmoore added a commit that referenced this pull request Jul 26, 2022
* Add entrypoints to setup.py

see:
 - conda-forge/numcodecs-feedstock#86
 - #300

* Check syspath and ignore error

* Temporarily disable codec test on Windows

* Try adding entry test to manifest

* Add `entrypoints` to `requirements.txt`

* Add `entrypoints` to `requirements_dev.txt`

* Remove `entrypoints` from `requirements_test.txt`

* Add `entrypoints` to `requirements_rtfd.txt`

* Note `numcodecs` is not `zip_safe`

* xfail the test completely

Co-authored-by: jakirkham <jakirkham@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support entrypoints?
4 participants