bpo-45250: fix docs regarding `__iter__` and iterators being inconsistently required by CPython #29170
Misc/NEWS.d/next/Documentation/2021-10-22-12-09-18.bpo-45250.Iit5-Y.rst
I'm strongly opposed to changing the meaning of a term that has had a clear definition for decades (and is implemented as

Addendum: As a typeshed maintainer I see the problem of ill-defined terms and protocols in Python every day ("file-like objects" are my personal bane). Making existing clear definitions less clear is not the right way forward.
The issue is the definition as written down is wrong/inaccurate from the perspective of Python itself. So it isn't that this redefines a term as much as actually makes the glossary reflect the real world as to how Python itself uses and considers what iterators are (i.e. this properly reflects what And I don't quite understand how the definition is ambiguous? An iterator defines What's your proposal otherwise? To create a brand new term of what e.g. a Do note that Guido and other folks agree with this plan in https://mail.python.org/archives/list/python-dev@python.org/thread/3W7TDX5KNVQVGT5CUHBK33M7VNTP25DZ/#3W7TDX5KNVQVGT5CUHBK33M7VNTP25DZ, so this isn't entirely without discussion and some agreement. We can ask the SC to make a final call on this if you really want to (I will abstain from voting on it if it comes to that).
(I'll use the terms "partial iterator" for objects just defining I'm not looking at this from the perspective of how this is implemented in Python, but from a user's perspective. I don't think it's correct to say Python considers an iterator to have only Practically, the status quo (some functions, methods, and statements don't require I see two solutions to this problem:
I think this is where our views are differing. It's an unfortunate side-effect/bug, from my perspective, that because In other words I don't view Maybe this is suggesting it's time to have a lower-level
The relevant python-dev thread doesn't seem to have reached consensus and died out.
I was planning to ask the SC to make a call.
I would absolutely make the argument that it isn't an iterator because it doesn't define It seems to me that you're proposing a change to the language here, not just to the documentation. As things stand today For instance, this recipe is in the

    # imports the recipe relies on
    import collections
    from itertools import islice

    def sliding_window(iterable, n):
        # sliding_window('ABCDEFG', 4) -> ABCD BCDE CDEF DEFG
        it = iter(iterable)
        window = collections.deque(islice(it, n), maxlen=n)
        if len(window) == n:
            yield tuple(window)
        for x in it:
            window.append(x)
            yield tuple(window)

As things are defined today, this function will work for all iterables, because an iterable is defined to return an iterator from If we change the language definition to say that As things stand today, if an iterator has
There are more recipes in the itertools documentation that highlight how common a pattern it is to rely on iterators having an

    from itertools import zip_longest  # import the recipe relies on

    def grouper(iterable, n, fillvalue=None):
        "Collect data into non-overlapping fixed-length chunks or blocks"
        # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
        args = [iter(iterable)] * n
        return zip_longest(*args, fillvalue=fillvalue)

While I'd agree that recipes show usage of the language and do not define it, I can't help but think that those recipes show that solving iteration problems with patterns that rely on iterators being iterable (as-in having an This makes this change a breaking change and unfortunately one that makes code that uses such patterns have a hidden bug, as most iterators will have an
Hmm -- the name chosen for that parameter is "iterable", not "iterator". So it seems to me that it's expecting that all iterables will return an iterator when passed to

I will say that it took me years to clearly "get" the distinction myself, but I think there are two things here:

An iterator returns an item when passed to

An iterable returns an iterator when passed to the

In addition, it is a very common, very useful, and highly recommended convention that all iterators also be iterables, i.e. an

Personally, I think it's both correct and useful to define things this way -- that is "iterator" is about the iteration, and "iterable" is about being able to produce an iterator. It is very, very convenient that in most places in Python, one can pass either an iterator or iterable in places where an iterator is required -- this saves us from having to call

But they are still distinct protocols, we should make that clear in the docs.

All that being said: an enormous amount of code (including most of itertools) expects iterables rather than iterators, so the docs should make it very clear that an iterator that does not support
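To make the two protocols described above concrete, here is a minimal sketch. `Countdown` and `CountdownIterator` are hypothetical names invented for illustration, not something from this thread or the standard library. The iterable hands out a fresh iterator on every `iter()` call, and the iterator follows the convention of returning itself from `__iter__`.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical iterable: produces a fresh iterator on every iter() call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Hypothetical iterator that also follows the 'iterators are iterable' convention."""

    def __init__(self, current):
        self.current = current

    def __next__(self):
        # The iterator protocol: return the next item or raise StopIteration.
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    def __iter__(self):
        # The convention under discussion: an iterator returns itself.
        return self


countdown = Countdown(3)
it = iter(countdown)
first = next(it)         # consumes one item from this particular iterator
remaining = list(it)     # list() calls iter(it), which returns `it` itself
again = list(countdown)  # the iterable can be traversed again from the start
```

Because `CountdownIterator` defines both methods, it can be passed anywhere an iterable is expected, which is the convenience the comment above describes.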
I'm sorry, I should have been clearer. Even if you pass an iterable to the

    # here you make an iterator from whatever iterable you pass in
    args = [iter(iterable)] * n
    # that iterator is then passed to `zip_longest`, which will in turn
    # call `iter` on the iterator this function has already created.
    return zip_longest(*args, fillvalue=fillvalue)

Even if we agree that iterators are not always iterable and therefore not suitable for this recipe, if the actual iterable you pass to this recipe has a non-iterable iterator, this recipe will still fail. This gist demonstrates what I mean.
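The failure mode described above can be reproduced with a short sketch. This is not the gist referenced in the comment; `Letters` and `NextOnlyIterator` are hypothetical classes made up for illustration, and the behaviour shown is what current CPython does.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


class NextOnlyIterator:
    """Hypothetical iterator that defines __next__ but not __iter__."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


class Letters:
    """Hypothetical iterable whose iterator is not itself iterable."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        return NextOnlyIterator(self.text)


# A string's iterator is iterable, so the recipe works as documented.
ok = list(grouper("ABCDEFG", 3, "x"))

# iter(Letters(...)) succeeds, but zip_longest() then calls iter() on the
# returned iterator, which has no __iter__, so a TypeError is raised.
try:
    grouper(Letters("ABCDEFG"), 3, "x")
except TypeError:
    failed = True
else:
    failed = False
```

The argument passed in is a perfectly valid target for `iter()`, yet the recipe still breaks, because the iterator it produces cannot be handed to another consumer that calls `iter()` on it.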
Yes, but that's not all it's assuming. It's also assuming that the iterator returned by As things are defined today, that's a valid assumption, because iterators are required to be iterable. If we change the documentation to say that iterators are not required to be iterable, this function will go from working for all iterables to working for only some iterables (those whose iterators are iterable). And we won't have any word to describe the subset of iterables for which the function works.
The docs today clearly specify that all iterators must be iterable. This isn't clarifying the requirements for iterators, it's changing them.
…from the language reference
It looks like even the interpreter core assumes that all iterators are iterable. Unpacking like So an iterable whose iterators are not iterable can be used in a

    >>> class Iterator:
    ...     def __next__(self):
    ...         if hasattr(self, "consumed"):
    ...             raise StopIteration
    ...         self.consumed = True
    ...         return 42
    ...
    >>> class Iterable:
    ...     def __iter__(self):
    ...         return Iterator()
    ...
    >>> list(Iterable())
    [42]
    >>> first, *rest = Iterable()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'Iterator' object is not iterable

Which fits with my mental model that this
The SC discussed this and the decision was to update the definitions to say it's a CPython implementation detail that it is inconsistent in its checking whether what an iterable returns is a proper iterator or not (i.e. keep the

I have not decided if I'm going to update this PR or close it and open a new one with those changes.
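As a rough illustration of the inconsistency the SC decision refers to, the sketch below compares what CPython's `iter()` and `for` accept with what the `collections.abc.Iterator` check reports. `Source` and `NextOnly` are hypothetical names used only for this example, and the results reflect current CPython behaviour rather than a language guarantee.

```python
from collections.abc import Iterator


class NextOnly:
    """Hypothetical object that defines __next__ but not __iter__."""

    def __init__(self):
        self.done = False

    def __next__(self):
        if self.done:
            raise StopIteration
        self.done = True
        return 42


class Source:
    """Hypothetical iterable that returns a NextOnly object from __iter__."""

    def __iter__(self):
        return NextOnly()


# CPython's iter() only checks that the returned object can be advanced
# with next(), so this call succeeds.
partial = iter(Source())

# The abstract base class check requires both __iter__ and __next__.
is_abc_iterator = isinstance(partial, Iterator)
has_dunder_iter = hasattr(partial, "__iter__")

# A for loop (here inside a comprehension) accepts the object as well.
values = [value for value in Source()]
```

Code that needs to work with any consumer, not just `for` and `iter()`, is therefore safer when its iterators also define `__iter__` returning `self`.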
…tors to define `__iter__`
Thanks @brettcannon for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
Sorry @brettcannon, I had trouble checking out the
…nconsistently required by CPython (pythonGH-29170) It is now considered a historical accident that e.g. `for` loops and the `iter()` built-in function do not require the iterators they work with to define `__iter__`, only `__next__`. (cherry picked from commit be36e06) Co-authored-by: Brett Cannon <brett@python.org>
GH-29650 is a backport of this pull request to the 3.10 branch. |
…nconsistently required by CPython (GH-29170) (GH-29650) It is now considered a historical accident that e.g. `for` loops and the `iter()` built-in function do not require the iterators they work with to define `__iter__`, only `__next__`. (cherry picked from commit be36e06) Co-authored-by: Brett Cannon <brett@python.org>
…tently required by CPython (pythonGH-29170) It is now considered a historical accident that e.g. `for` loops and the `iter()` built-in function do not require the iterators they work with to define `__iter__`, only `__next__`.
It is now considered a historical accident that e.g. `for` loops and the `iter()` built-in function do not require the iterators they work with to define `__iter__`, only `__next__`.

https://bugs.python.org/issue45250