Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type annotations to sortedcontainers #107

Closed
wants to merge 14 commits into from

Conversation

althonos
Copy link

@althonos althonos commented Nov 29, 2018

Hi @grantjenks,

this PR adds type annotations to the sortedcontainers package. They should be consistent with their list, set and dict annotations from the typeshed annotations. All collections are invariant generics.

Here are the current quirks:

  • __new__ annotations are not supported, so mypy does not understand that creating a SortedList with a key argument returns a SortedKeyList instance (this is a mypy bug, see overloading __new__ python/mypy#3307)
  • SortedItemsView raises an error because of supposedly incompatible base classes (this is a mypy bug, see Definition of "__iter__" in "ItemsView" is incompatible with definition "Sequence" python/mypy#5973) this has been fixed in mypy
  • Until typing exposes Protocols, it is not really possible to set the return type of key functions. This means that the acceptable type for a key is Callable[[T], Any]. In particular, this also means that bisect_key_left and bisect_key_right accept Any. The other way around would be to make SortedList a generic over both T and K where K is the key type, but that would possibly be a hindrance to end users.
  • SortedKeyList.__contains__(value) should accept anything (as any Sequence), but since the value is passed to self._key, the value should be of type T for the code never to fail. The choices are: either only accept T, typecheck the value in the code, or return False on any TypeError raised by the key. None of them are really satisfactory IMO.
  • SortedDict.setdefault has the same signature than dict.setdefault from the standard library, although in theory the signature should have an overload if not default is given.

Copy link
Owner

@grantjenks grantjenks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good.

I’m a Python typing newbie. Three questions: How do I run it myself? How do I test it? Can I integrate the testing with tox?

@althonos
Copy link
Author

althonos commented Nov 29, 2018

Well, I'm not a typing master myself, but basically, you'll need a typechecker set-up, for instance mypy or pyre (I tested with both of these).

Then you can run the typechecker first of all on the codebase itself (e.g. mypy -p sortedcontainers), and it should not raise any error (except the ones listed above that are to be fixed by the mypy team).

AFAIK, there's no way to test the typing is good (that's like testing a program is without bugs), but maybe you could try to have test cases with erroneous typing to see it is detected (such as appending an int to a SortedList[str]) although I'm not sure how to do it.

Copy link

@bryanforbes bryanforbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm using this project and have written stubs for other projects (asyncpg, sqlalchemy, gino, etc.) I figured I could help out here. I added a bunch of comments for sorteddict that also apply for the other files. Let me know if something I commented on doesn't make sense.

sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
@althonos
Copy link
Author

althonos commented Dec 3, 2018

@bryanforbes: thanks for your review ! I'll add the required changes soon.

Copy link

@bryanforbes bryanforbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updates so far look great, but I found a few more issues :D

sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedset.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedset.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedset.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
Copy link

@bryanforbes bryanforbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the stub files into a project I'm using and with the changes I've outlined in this round of reviews, I get no errors from mypy. I'll approve once the feedback is addressed.

sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
sortedcontainers/sorteddict.pyi Outdated Show resolved Hide resolved
sortedcontainers/sortedlist.pyi Outdated Show resolved Hide resolved
Copy link

@bryanforbes bryanforbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@bryanforbes
Copy link

Thanks for doing this, @althonos. You beat me to it :).

@althonos
Copy link
Author

althonos commented Dec 7, 2018

@bryanforbes : thanks for reviewing my code ! I let so much slip through it's mostly thanks to you this PR became high-standard enough 😉

@bryanforbes
Copy link

@grantjenks This is ready to merge

Copy link
Owner

@grantjenks grantjenks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm learning a lot looking through these changes. The tone of my questions are curiosity/learning, not challenges. There's 7 questions so far :)

def __init__(self, __map: Mapping[_KT, _VT], **kwargs: _VT) -> None: ...
@overload
def __init__(
self, __iterable: Iterable[Tuple[_KT, _VT]], **kwargs: _VT
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Tuple[_KT, _VT] here mean strictly a tuple? The code will work for any two-element sequence.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a two-value tuple with the generic type _KT as its first element and _VT as its second element. I recommend reading the section on generics in the mypy docs for a more in-depth understanding of the generic types.

def __iter__(self) -> Iterator[_KT]: ...
def __reversed__(self) -> Iterator[_KT]: ...
def __setitem__(self, key: _KT, value: _VT) -> None: ...
def _setitem(self, key: _KT, value: _VT) -> None: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why document protected/private methods? These aren't part of the API.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I didn't call this out in my review is that sometimes it's useful to have these methods in the stubs so they can be used in subclasses. If these are not intended to be used in subclasses, they should probably be removed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call a private method from inside a public method the type can then be derived there from the type hints. For instance, PyCharm warns on the last line of this code, because it could derive the return type of the public method pub from the private method __private:

class Klass:
    def __private(self) -> int:
        return 0

    def pub(self):
        return self.__private()

    def pub2(self, a: str):
        pass

O = Klass()
O.pub2(O.pub())

To make sure that the type hints actually are correct within sortedcontainers, it's probably a good idea to commit fully to static type checking.

def __reversed__(self) -> Iterator[_KT]: ...
def __setitem__(self, key: _KT, value: _VT) -> None: ...
def _setitem(self, key: _KT, value: _VT) -> None: ...
def copy(self: _SD) -> _SD: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the annotation for self here and not for the other methods?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy returns the same type as the instance, so we need to mark this in such a way that the type returned from subclasses will be properly derived. Using a generic self is how we do that. If a method doesn't need to return that information, then annotating self isn't needed.

def __copy__(self: _SD) -> _SD: ...
@classmethod
@overload
def fromkeys(cls, seq: Iterable[_T]) -> SortedDict[_T, None]: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be Iterable[_KT]? I think the elements must be hashable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be _KT because it would derive _KT from the class's declaration. We should probably declare another TypeVar above that is bound to Hashable.

def pop(self, key: _KT) -> _VT: ...
@overload
def pop(self, key: _KT, default: _T = ...) -> Union[_VT, _T]: ...
def popitem(self, index: int = ...) -> Tuple[_KT, _VT]: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra spaces in int = ... look a bit odd to me. Is there a style guide for these type files?

Also, is int the best choice? Is there no Integral or size_t equivalent?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typeshed style guide is what most people use.

Yes, int is the right choice. It's what list.popitem() uses in typeshed's typing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, pep8 also suggests this usage of spaces:

When combining an argument annotation with a default value, however, do use spaces around the = sign

def pop(self, key: _KT, default: _T = ...) -> Union[_VT, _T]: ...
def popitem(self, index: int = ...) -> Tuple[_KT, _VT]: ...
def peekitem(self, index: int = ...) -> Tuple[_KT, _VT]: ...
def setdefault(self, key: _KT, default: Optional[_VT] = ...) -> _VT: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why specify default here and not use an overload as with fromkeys?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With fromkeys(), the return value is dependent upon whether value is specified or not. I went and checked the implementation of setdefault() and I can see that this typing is wrong and it should actually be:

    def setdefault(self, key: _KT, default: Optional[_VT] = ...) -> Optional[_VT]: ...

@althonos

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, couldn't this be overloaded to:

@overload
def setdefault(self, key: _KT) -> Optional[_VT]: ...
@overload
def setdefault(self, key: _KT, default: _VT = ...) -> _VT: ...

?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... the stdlib has the following:

    def setdefault(self, k: _KT, default: Optional[_VT] = ...) -> _VT: ...

However, I can do the following:

d = SortedDict()
v = d.setdefault('a') # v is None
v = d.setdefault('a', 1) # v is still None

I'm going to ask in the mypy gitter room, but returning _VT seems wrong based on the code above. I'll let you know what I find out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bryanforbes : my bad, this would be a thing for dict.get, but not for dict.setdefault, you're right.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@althonos From my conversation in the Gitter room, we should leave this the same as what the stdlib has and I'm going to file an issue to discuss it on typeshed. The Optional[_VT] return type might produce a lot of false-positives.

def __getitem__(self, index: slice) -> List[_KT_co]: ...
def __delitem__(self, index: Union[int, slice]) -> None: ...

class SortedItemsView( # type: ignore
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does type: ignore mean?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, there's an issue with the definitions for ItemsView.__iter__() and Sequence.__iter__() in typeshed. The type: ignore comment tells mypy to ignore the error between those two signatures (until the linked issue gets fixed).

@bryanforbes
Copy link

Great questions, @grantjenks, and I didn't find them as challenges. Hopefully my answers were helpful and not condescending.

@grantjenks
Copy link
Owner

I looked around a bit and searched typing.py but could not find the typing hints for OrderedDict and the like. Are there pyi files for the built-in types? How do I compare what’s here with the standard library?

@althonos
Copy link
Author

@grantjenks: they can be found in the python/typeshed repository: for instance, dict annotations can be found here, and OrderedDict annotations here

@oremanj
Copy link

oremanj commented Jan 15, 2019

Any updates on this? I'm using sortedcontainers in a typed project and it would be great to have the type hints available.

@grantjenks
Copy link
Owner

I'm making progress on reviewing the changes but it's a large addition and a new kind of feature for me.

@viveshok
Copy link

viveshok commented Mar 7, 2019

thanks to everyone involved with this work (starting with sortedcontainers itself), very appreciated

@pisiiki
Copy link

pisiiki commented Oct 4, 2019

I would love to see this merged too. Great work.

@grantjenks
Copy link
Owner

I apologize for the delay. I have recently learned more about mypy and type annotations. I may be able to get something rolled out by the end of the month. Until then, I'm open to the type annotations being added to typeshed: https://github.com/python/typeshed

@grantjenks
Copy link
Owner

Sorry for the great delay in this pull request. I'm keeping it in my Inbox now and looking for time to review the changes and merge.

@nathanielobrown
Copy link
Contributor

Just tried using this and got the following error:

In [1]: from sortedcontainers import SortedList                                                        

In [2]: s: SortedList[int] = SortedList()                                                              
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-7fb24f1c5eef> in <module>
----> 1 s: SortedList[int] = SortedList()

TypeError: 'ABCMeta' object is not subscriptable

Installed with pip install git+git://github.com/althonos/python-sortedcontainers.git@master.

Am I doing something wrong or is this an issue?

@althonos
Copy link
Author

@nathanielobrown : kinda expected, since we are only adding stub files, and not changing anything in the code, including the ABCMeta inheritance.

It probably works with

s: "SortedList[int]" = SortedList() 

@samestep
Copy link

I don't know the answer to 1, but for 2, anecdotally: in PyTorch we're still using the old List[int] style because we need to support Python 3.6.

@antonagestam
Copy link

@grantjenks Perhaps mypyc could be a plausible alternative? https://github.com/mypyc/mypyc

Also: take your time. Open source burnout is real, and it's your free time.

@samestep
Copy link

Also: take your time. Open source burnout is real, and it's your free time.

Ditto on this; please don't take my earlier comments as a demand, just curious about the state of things

@althonos
Copy link
Author

Please don't feel any pressure from my side, I know reviewing large PRs takes time and dedication. Now trying to answer the questions:

I’d like to cythonize sorted containers. Is Cython able to understand pyi files? What’s the integration story there?

Basically your .pyi files are going to be independent from your Cython sources, and Cython cannot generate them (yet?). So you need to maintain the signatures yourself. However the .pyi are often more accurate: you can't really describe an Optional[int] directly in the Cython signatures, it's either int or object. I have a Cython project with type stubs here: althonos/pyrodigal.

With generics like “list[int]” that are available in more recent versions, can those be used here too? Or is it subject to the same compatibility issues Python typically has?

Indeed as @samestep mentioned, for Python 3.6 compatibility you should still use List[int], though it's reaching end-of-life in six months. But nothing stops you from keeping the code compatible with Python 3.6 but making the stubs Python 3.7+ I think.

@samestep
Copy link

But nothing stops you from keeping the code compatible with Python 3.6 but making the stubs Python 3.7+ I think.

@althonos I don't believe this is true; quoting PEP 484:

Annotations must be valid expressions that evaluate without raising exceptions at the time the function is defined (but see below for forward references).

As another example, if I remember correctly, typing something as SortedList[str] from this library raises a runtime error while just SortedList is fine (as is 'SortedList[str]' I believe, although I don't know think that works with mypy). So if you want to support Python 3.6, I don't believe the new list[int] syntax can be used.

(I'm not at my computer right now, but once I am, I'll double-check this and edit my comment accordingly.)

@althonos
Copy link
Author

althonos commented Jun 15, 2021

@samestep : You have activated my trap card! To your PEP 484 I raise you PEP 585!

Joke aside, from Python 3.9 onward, x: SortedList[str] = SortedList() totally works in the Python code because SortedList inherits from MutableSequence, which is now a typing variant, so the metaclass allows indexing at the type level. In previous versions, it's not the case (because the metaclass is still type).

To fix that in the Python code of sortedcontainers consumer there are two choices: either put quotes everywhere (works with Python 3.6+), use the future for PEP 563: from __future__ import annotations to disable evaluation of annotations.

From the perspective of mypy, whatever the version, everything works.

EDIT:

Oh and about the section you quote, it's only relevant for the annotations within Python sources. That PEP explicitly states the following afterwards:

Stub files are files containing type hints that are only for use by the type checker, not at runtime.

So the version of your stub only matters for version of mypy that gets invoked on the stubs.

@samestep
Copy link

Oh and about the section you quote, it's only relevant for the annotations within Python sources. That PEP explicitly states the following afterwards:

Stub files are files containing type hints that are only for use by the type checker, not at runtime.

So the version of your stub only matters for version of mypy that gets invoked on the stubs.

Ah, you are correct; I was very confused when reading the first part of your comment because I tried this in Python 3.6 just now and it didn't work:

>>> l: list[int] = [1, 2, 3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'type' object is not subscriptable
>>> from __future__ import annotations
  File "<stdin>", line 1
SyntaxError: future feature annotations is not defined

But I had forgotten that this PR only adds the type annotations in .pyi files, so from what you've said, it sounds like it should work fine with Python 3.6 even if those stubs use post-3.6 typing features?

@dpinol
Copy link

dpinol commented Jan 14, 2022

Hi, any chance to advance on this?

@grantjenks
Copy link
Owner

Sorry, no. It still hasn't made it to the top of my list. If those interested want to publish a sortedcontainers-types package with the pyi files, then I'm okay with that.

self,
start: Optional[int] = ...,
stop: Optional[int] = ...,
reverse=bool,
Copy link

@DMRobertson DMRobertson Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a typo?

Suggested change
reverse=bool,
reverse: bool = ...,

Edit: and similarly in sorteddict.pyi and sortedset.pyi

Comment on lines +150 to +163
def irange(
self,
minimum: Optional[_T] = ...,
maximum: Optional[_T] = ...,
inclusive: Tuple[bool, bool] = ...,
reverse: bool = ...,
): ...
def irange_key(
self,
min_key: Optional[Any] = ...,
max_key: Optional[Any] = ...,
inclusive: Tuple[bool, bool] = ...,
reserve: bool = ...,
): ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these both return Iterator[_T]? That aside, I think the irange stub is redunant here---it should inherit from the stub for SortedList.irange?

@ml31415
Copy link

ml31415 commented Sep 26, 2023

Guys, this PR is soon reaching its fifth birthday ... that should have been enough time for a review. Just get it merged!

@grantjenks
Copy link
Owner

@ml31415 Sorry, no. It still hasn't made it to the top of my list. If those interested want to publish a sortedcontainers-types package with the pyi files, then I'm okay with that.

@ml31415
Copy link

ml31415 commented Sep 26, 2023

Look @grantjenks this is an open source package, no one needs your goodwill for forking this package. People here try to improve this package in order to reduce the burden of maintenance for all users and maintainers alike, i.e. not create another fork. Everyone tries to avoid forking!

But in order for this collaboration model to work, well written PRs also need to be accepted at some point. And right now this would need to be done by either you, or someone you'd be ready to trust to help you with the maintenance of this project, maybe a former collaborator? Or one of the friendly people here, that wrote PRs and already showed interest in improving this package? It's never a good thing if the future of a package depends only on a single person.

h4l added a commit to h4l/sortedcontainers-stubs that referenced this pull request Oct 9, 2023
grantjenks/python-sortedcontainers#107 (review)

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
h4l added a commit to h4l/sortedcontainers-stubs that referenced this pull request Oct 9, 2023
h4l added a commit to h4l/sortedcontainers-stubs that referenced this pull request Oct 9, 2023
In the original PR there were different opinions on this, but now that
the types are separate from the code, it won't help the development of
sortedcontainers itself to type the internal methods, so I think it's
best to not publicise them to library users. See:
grantjenks/python-sortedcontainers#107 (comment)
@h4l
Copy link

h4l commented Oct 9, 2023

It looks to me like a good way forward here is to publish a stand-alone type stubs package, and let Grant continue to maintain the implementation package in the way he's most comfortable with. So I've created a separate repo with just the type stubs from this PR, tested the stubs, made some tweaks and published a package on pypi.

If you pip install sortedcontainers-stubs you should get full types for sortedcontainers. Please file an issue on the stubs repo if you find any problems.

@grantjenks
Copy link
Owner

Thanks @h4l !

@grantjenks
Copy link
Owner

Closing in favor of https://pypi.org/project/sortedcontainers-stubs/

If you'd like to add something to the README to mention these are available, PR welcome.

@grantjenks grantjenks closed this Feb 28, 2024
h4l added a commit to h4l/python-sortedcontainers that referenced this pull request Feb 29, 2024
The README now mentions and links to the `sortedcontainers-stubs` package, as discussed in grantjenks#107.
h4l added a commit to h4l/python-sortedcontainers that referenced this pull request Feb 29, 2024
The README now mentions and links to the `sortedcontainers-stubs`
package, as discussed in grantjenks#107.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.