-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-89189: More compact range iterator #27986
Conversation
serhiy-storchaka
commented
Aug 27, 2021
•
edited by AlexWaygood
Loading
edited by AlexWaygood
- Issue: More compact range iterator #89189
See also #28176. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has merge conflicts now.
When you're done making the requested changes, leave the comment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just a few nits.
return Py_BuildValue("N(N)O", _PyEval_GetBuiltin(&_Py_ID(iter)), | ||
range, Py_None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we care, but we might -- am I right that this writes pickles that can't be read by 3.11 or before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, absolutely no. I would never propose such change.
Internals will be different, but the unpickled iterator wil produce the same values.
Objects/rangeobject.c
Outdated
PyObject *new_len = PyNumber_Subtract(r->len, state); | ||
if (new_len == NULL) | ||
return NULL; | ||
Py_SETREF(r->len, new_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little scary -- if the new length is set but computing the new start fails, and the original iterator is still accessible, it will be truncated at the end rather than at the start. Let's update both fields after all errors have been checked.
Ditto in next().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Although __setstate__
is only supposed to be called for just created object when unpickle or make a deep copy, so there are no references to that iterator yet.
But there were other issue here. state
could be a borrowed reference to r->len
. It is used in two places: with PyNumber_Subtract()
and with PyNumber_Multiply()
. The second time it is used after setting the new r->state
, when it can refer to a freed memory.
next()
does not have such issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that it never happens in normal code. You have to call __setstate__()
manually with the out-of-range value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may or may not want to change one thing, but if you do, no need to ask for a new review.
PyObject *tmp = r->start; | ||
r->start = new_start; | ||
Py_SETREF(r->len, new_len); | ||
Py_DECREF(tmp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a scenario where we can't just use
Py_SETREF(r->len, new_len);
Py_SETREF(r->start, new_start);
(which would be a little easier to follow).
Both new_len
and new_start
are definitely new references we own, so even if state
was a borrowed reference to r->len
everything would be all right.
I'll leave it up to you though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Py_SETREF can only help when you assign a single object attribute, or if the attributes are independent. Two sequential Py_SETREF
s can be a sign of a bug.
We do not check the type of state
. It can be an arbitrary Python object with methods __lt__
, __gt__
, __mul__
and __rsub__
. Therefore we do not control the types of new_len
and new_start
. Therefore we do not control the types of r->len
and r->start
. They can have __del__
methods which release the GIL after assignment in Py_SETREF
. When the GIL is released, the iterator object can be used in other thread when it is in inconsistent state -- new r->len
and old r->start
.
It never occurs in normal code (state
is always an exact int), but if you try hard, you perhaps can reproduce this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, that's a scenario I hadn't considered. Maybe the next time I am asked in some kind of Q&A or interview "do you have any regrets" I should mention __del__
methods.
* main: (112 commits) pythongh-99894: Ensure the local names don't collide with the test file in traceback suggestion error checking (python#99895) pythongh-99612: Fix PyUnicode_DecodeUTF8Stateful() for ASCII-only data (pythonGH-99613) Doc: Add summary line to isolation_level & autocommit sqlite3.connect params (python#99917) pythonGH-98906 ```re``` module: ```search() vs. match()``` section should mention ```fullmatch()``` (pythonGH-98916) pythongh-89189: More compact range iterator (pythonGH-27986) bpo-47220: Document the optional callback parameter of weakref.WeakMethod (pythonGH-25491) pythonGH-99905: Fix output of misses in summarize_stats.py execution counts (pythonGH-99906) pythongh-99845: PEP 670: Convert PyObject macros to functions (python#99850) pythongh-99845: Use size_t type in __sizeof__() methods (python#99846) pythonGH-99877) Fix typo in exception message in `multiprocessing.pool` (python#99900) pythongh-87092: move all localsplus preparation into separate function called from assembler stage (pythonGH-99869) pythongh-99891: Fix infinite recursion in the tokenizer when showing warnings (pythonGH-99893) pythongh-99824: Document that sqlite3.connect implicitly open a transaction if autocommit=False (python#99825) pythonGH-81057: remove static state from suggestions.c (python#99411) Improve zip64 limit error message (python#95892) pythongh-98253: Break potential reference cycles in external code worsened by typing.py lru_cache (python#98591) pythongh-99127: Allow some features of syslog to the main interpreter only (pythongh-99128) pythongh-82836: fix private network check (python#97733) Docs: improve accuracy of socketserver reference (python#24767) ...