Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python/ci] typeguard==4.2.1, make requirements_dev.txt canonical #2314

Merged
merged 3 commits into from
Mar 26, 2024

Conversation

ryan-williams
Copy link
Member

@ryan-williams ryan-williams commented Mar 25, 2024

Issue and/or context: #2312

Changes:

  • use requirements_dev.txt in CI, setup.py: c47fa2a
  • use typeguard>=4.2, fixup some type checks: 74c530b

Notes for Reviewer:
requirements_dev.txt seemed to be unused. I:

  • populated requirements_dev.txt with the actual [dev] reqs from setup.py (plus a few that were only listed in CIs: pytest-cov, sparse)
  • made setup.py use the new reqs file
  • made CIs install -e apis/python[dev] (they were previously installing sans [dev], and pre-installing a few select dev reqs)

Previously the typeguard pin existed in 4 places, for no good reason (afaict).

Let me know if there were actually two distinct sets of "dev" reqs here, which I am now conflating.

Copy link

codecov bot commented Mar 25, 2024

Codecov Report

Merging #2314 (5e750c1) into main (9d7d407) will increase coverage by 0.06%.
Report is 1 commits behind head on main.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2314      +/-   ##
==========================================
+ Coverage   78.63%   78.69%   +0.06%     
==========================================
  Files         140      140              
  Lines       10762    10760       -2     
  Branches      217      217              
==========================================
+ Hits         8463     8468       +5     
+ Misses       2199     2193       -6     
+ Partials      100       99       -1     
Flag Coverage Δ
libtiledbsoma 67.40% <ø> (+0.27%) ⬆️
python 90.58% <100.00%> (-0.01%) ⬇️
r 74.69% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
python_api 90.58% <100.00%> (-0.01%) ⬇️
libtiledbsoma 48.34% <ø> (ø)

@ryan-williams ryan-williams marked this pull request as ready for review March 25, 2024 17:54
Copy link
Member

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@ryan-williams
Copy link
Member Author

@thetorpedodog, you had good feedback on #1960, wanted to give you an opportunity to comment here, in case anything jumps out.

timestamp=timestamp,
threadpool=threadpool,
timestamp=cast(Optional[OpenTimestamp], timestamp),
threadpool=cast(Optional[ThreadPoolExecutor], threadpool),
Copy link
Member Author

@ryan-williams ryan-williams Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprised I had to resort to casting here. I thought mypy would understand that the ifs on L265-L270 above should remove SENTINEL from the timestamp and threadpool Unions, but mypy was erroring.

Previously, the # type: ignore[assignment]s on L224-L225 were working around this issue. Localizing the issue here seems cleaner, but open to ideas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't infer that _SENTINEL is the only instance of a Type that we will accept; i.e., it thinks that there might be some other type, like class MyBogusSentinel(_soma_tiledb_context.SENTINEL): pass. This would mean that a user could pass ctx.replace(timestamp=MyBogusSentinel), and doing so is valid within the constraints of the type system.

To do this without requiring changes to the type-checker would require the if whatever is _SENTINEL lines to be changed to if isinstance(whatever, type) (since type is the type of the _SENTINEL class-value).

I think there are a few reasonable solutions here:

  1. Change the if-statement to check whether the value in question is a type (since _SENTINEL is a class rather than an instance of the class):

    def some_func(
        very_optional_param: Union[None, int, Type[_SENTINEL]] = _SENTINEL,
    ) -> ...:
      if isinstance(very_optional_param, type):
        very_optional_param = int_from_somewhere()
      # All types are gone from possible values of very_optional_param
      # leaving it as an Optional[int]

    (You can’t do issubclass(very_optional_param, _SENTINEL) because the first parameter to issubclass must be a class itself.)

  2. Use an instance of a sentinel type as a default value, rather than the type itself. This might look like:

    # this is an empty attrs class so it is immutable
    @attrs.define()
    class _Unset:
      pass
    
    def some_func(
        very_optional_param: Union[None, int, _Unset] = _Unset(),
    ) -> ...:
      if isinstance(very_optional_param, _Unset):
        very_optional_param = int_from_somewhere()
      # now very_optional_param is definitely an Optional[int]
      # since all possible _Sentinels are gone
      ...
  3. A simpler solution for this particular case: Make the default value a simple literal that doesn’t belong in any of the other types.

    # My thinking behind defining these literals and constants:
    # We can document that people shouldn't be relying on
    # the exact value "__unset__" if they use the API.
    _Unset = Literal["__unset__"]
    _UNSET: _Unset = "__unset__"
    
    def some_func(
        very_optional_param: Union[None, int, _Unset] = _UNSET,
    ) -> ...:
      if very_optional_param == _UNSET:
        very_optional_param = int_from_somewhere()
      # "__unset__" is also now gone from the possible values
      # of very_optional_param.

I actually like the latter, since it means we just use types and instances that are already there, doesn’t have the weirdness of Type[_SENTINEL], and we can always upgrade it to the immutable-class-instance–based system by changing internal code without affecting the appearance of the signature.

It would be neat if Literal accepted just a raw object:

_SENTINEL = object()
_LiteralSentinel = Literal[_SENTINEL]

to mean that you could only pass the literal object my_module._SENTINEL the function, but it doesn’t (for sensible reasons). But if it did, we could avoid all this!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @thetorpedodog, great explanation!

I've updated this to use _Unset/_UNSET, per your example. Typeguard and mypy seem happy with it, and it seems cleaner than my previous cast-based workaround.

Copy link
Contributor

@thetorpedodog thetorpedodog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you how you want to handle the unset sentinel thing; I have three possible ideas below and you can choose if you like one.

timestamp=timestamp,
threadpool=threadpool,
timestamp=cast(Optional[OpenTimestamp], timestamp),
threadpool=cast(Optional[ThreadPoolExecutor], threadpool),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't infer that _SENTINEL is the only instance of a Type that we will accept; i.e., it thinks that there might be some other type, like class MyBogusSentinel(_soma_tiledb_context.SENTINEL): pass. This would mean that a user could pass ctx.replace(timestamp=MyBogusSentinel), and doing so is valid within the constraints of the type system.

To do this without requiring changes to the type-checker would require the if whatever is _SENTINEL lines to be changed to if isinstance(whatever, type) (since type is the type of the _SENTINEL class-value).

I think there are a few reasonable solutions here:

  1. Change the if-statement to check whether the value in question is a type (since _SENTINEL is a class rather than an instance of the class):

    def some_func(
        very_optional_param: Union[None, int, Type[_SENTINEL]] = _SENTINEL,
    ) -> ...:
      if isinstance(very_optional_param, type):
        very_optional_param = int_from_somewhere()
      # All types are gone from possible values of very_optional_param
      # leaving it as an Optional[int]

    (You can’t do issubclass(very_optional_param, _SENTINEL) because the first parameter to issubclass must be a class itself.)

  2. Use an instance of a sentinel type as a default value, rather than the type itself. This might look like:

    # this is an empty attrs class so it is immutable
    @attrs.define()
    class _Unset:
      pass
    
    def some_func(
        very_optional_param: Union[None, int, _Unset] = _Unset(),
    ) -> ...:
      if isinstance(very_optional_param, _Unset):
        very_optional_param = int_from_somewhere()
      # now very_optional_param is definitely an Optional[int]
      # since all possible _Sentinels are gone
      ...
  3. A simpler solution for this particular case: Make the default value a simple literal that doesn’t belong in any of the other types.

    # My thinking behind defining these literals and constants:
    # We can document that people shouldn't be relying on
    # the exact value "__unset__" if they use the API.
    _Unset = Literal["__unset__"]
    _UNSET: _Unset = "__unset__"
    
    def some_func(
        very_optional_param: Union[None, int, _Unset] = _UNSET,
    ) -> ...:
      if very_optional_param == _UNSET:
        very_optional_param = int_from_somewhere()
      # "__unset__" is also now gone from the possible values
      # of very_optional_param.

I actually like the latter, since it means we just use types and instances that are already there, doesn’t have the weirdness of Type[_SENTINEL], and we can always upgrade it to the immutable-class-instance–based system by changing internal code without affecting the appearance of the signature.

It would be neat if Literal accepted just a raw object:

_SENTINEL = object()
_LiteralSentinel = Literal[_SENTINEL]

to mean that you could only pass the literal object my_module._SENTINEL the function, but it doesn’t (for sensible reasons). But if it did, we could avoid all this!

@johnkerl johnkerl changed the title typeguard==4.2.1, make requirements_dev.txt canonical [python/ci] typeguard==4.2.1, make requirements_dev.txt canonical Mar 25, 2024
@johnkerl johnkerl merged commit 36d5b6e into main Mar 26, 2024
31 checks passed
@johnkerl johnkerl deleted the rw/tg4.2 branch March 26, 2024 20:56
Copy link

The backport to release-1.8 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release-1.8 release-1.8
# Navigate to the new working tree
cd .worktrees/backport-release-1.8
# Create a new branch
git switch --create backport-2314-to-release-1.8
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 36d5b6e7e9657ba9d41e9f71c0b19d0468dc1f79
# Push it to GitHub
git push --set-upstream origin backport-2314-to-release-1.8
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release-1.8

Then, create a pull request where the base branch is release-1.8 and the compare/head branch is backport-2314-to-release-1.8.

github-actions bot pushed a commit that referenced this pull request Mar 26, 2024
…#2314)

* use requirements_dev.txt in CI, setup.py

* use typeguard>=4.2, fixup some type checks

agronholm/typeguard#442 was fixed in typeguard 4.2.0

* cr feedback: `_UNSET`/`_Unset` sentinel value/type
johnkerl pushed a commit that referenced this pull request Mar 26, 2024
…#2314) (#2332)

* use requirements_dev.txt in CI, setup.py

* use typeguard>=4.2, fixup some type checks

agronholm/typeguard#442 was fixed in typeguard 4.2.0

* cr feedback: `_UNSET`/`_Unset` sentinel value/type

Co-authored-by: Ryan Williams <ryan.williams@tiledb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants