Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. #20803

Merged
merged 10 commits into from
Jul 17, 2020

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Jun 11, 2020

Merges the explicit state in PyFrameObject.f_excuting and PyGenObject.gi_running, and the implicit state in PyFrameObject.f_stacktop == NULL and PyFrameObject.f_last == -1 into a single enum field in PyFrameObject

Since we no longer need to test PyFrameObject.f_stacktop == NULL , The f_stacktop pointer can be replaced with a f_stackdepth integer, which makes for simpler code when iterating over the stack and avoids the potential hazard of NULL pointers.

There are three benefits to these changes:

  1. The code is more robust and, IMO, easier to understand, as all state is now explicit.
  2. It carries additional information about the state of the frame. Information about whether a frame exiting by return or by raise is available.
  3. A modest reduction in size of frame and generator objects.

https://bugs.python.org/issue40941

@markshannon markshannon requested a review from 1st1 as a code owner June 11, 2020 13:27
@markshannon markshannon changed the title bpo-40941: Unify implicit and explicit the state in frame and generator into a single value. bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. Jun 11, 2020
@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 11, 2020
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 867427e269564d3b16b7c5580343556cbfdcad7c 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 11, 2020
@serhiy-storchaka serhiy-storchaka self-requested a review July 9, 2020 10:20
@pablogsal pablogsal self-requested a review July 9, 2020 12:06
FRAME_RETURNED = 1,
FRAME_RAISED = 2,
FRAME_CLEARED = 3
} PyFrameState;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this default to the size of an int, rather than a char? I'm not yet sure whether the previous explicit sizing on some variables was actually useful, or just lost to alignment, but it seems like a signed char should still be more than enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed it to a signed char

/* Forbid jumps upon a 'return' trace event (except after executing a
* YIELD_VALUE or YIELD_FROM opcode, f_stacktop is not NULL in that case)
* YIELD_VALUE or YIELD_FROM opcode)
* and upon an 'exception' trace event.
* Jumps from 'call' trace events have already been forbidden above for new
* frames, so this check does not change anything for 'call' events. */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "above" refer to the code just deleted, and no longer visible to future readers? Can this comment be reworded to something with fewer separate branching conditions, such as "Jumps can only be made via a trace function triggered by a line trace event on a currently executing or suspended frame" I am assuming that this is not actually a new restriction that needs to be documented, but that is an assumption; I won't be shocked if I missed something outside the diff.

"can only jump from a 'line' trace event");
return -1;
return -1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a "call" trace also a "line" trace? If not, it would make sense to combine the 4 invalid states into a single error message, and it might make sense to include the invalid state as an exception attribute, or even as part of the string. if that is much easier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No a "call" trace is not a "line" trace.

case FRAME_EXECUTING:
case FRAME_SUSPENDED:
/* You can only do this from within a trace function, not via
* _getframe or similar hackery. */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this restriction? Is it just because "better safe than sorry, and we couldn't add the restriction later"? Is it an intentional limitation to be kind to PyPy and optimization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It a pre-existing restriction. There is no change to behavior here.

}
f->f_stackdepth = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost want to change this to take advantage of f_stackdepth already being an integer count of the number of times to loop, but maybe that is too clever for debugging, and should be left to the compiler.

while (f->f_stackdepth--) Py_XDRECREF(f->f_valuestack[f->f_stackdepth]) ;

seems not worse, but if I missed something by not testing, that argues for not making similar changes. :{

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should leave optimizing the loop to the compiler, although using idiomatic forms usually helps the compiler.
The two forms are equivalent.
The standard for loop from 0 to the limit is idiomatic C and easier to read.

err = gen_close_iter(yf);
gen->gi_running = 0;
gen->gi_frame->f_state = state;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the previous state was not FRAME_SUSPENDED, what would that mean, and should the generator really return to that state? (applies several places) I had sort of thought that the only backwards transitions were FRAME_EXECUTING to FRAME_SUSPENDED, and FRAME_CLEARED to FRAME_CREATED when the memory was reallocated to a different frame.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that the state must be FRAME_SUSPENDED, but if were some other state were possible this code would still be correct. So I'm inclined to leave it as is.

static PyGetSetDef gen_getsetlist[] = {
{"__name__", (getter)gen_get_name, (setter)gen_set_name,
PyDoc_STR("name of the generator")},
{"__qualname__", (getter)gen_get_qualname, (setter)gen_set_qualname,
PyDoc_STR("qualified name of the generator")},
{"gi_yieldfrom", (getter)gen_getyieldfrom, NULL,
PyDoc_STR("object being iterated by yield from, or None")},
{"gi_running", (getter)gen_getrunning, NULL, NULL},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this reordering break any binary guarantees for any of the types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This is opaque to C extensions.

Having f_stackdepth <= 0 ensures that invalid
values are not visible to the cycle GC.
We choose -1 rather than 0 to assist debugging.
*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GC doesn't seem to have been modified; does it really pay attention to stackdepth, as opposed to the actual pointer that was there previously?

Python/ceval.c Outdated
if (tstate->c_tracefunc != NULL) {
/* Temporarily set frame state to RAISED for benefit of frame.setlineno */
assert(f->f_state == FRAME_EXECUTING);
f->f_state = FRAME_RAISED;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I missed another backwards state transition. Maybe it would be good to explain the state machine where the states names are defined.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a hack, to keep frame.setlineno happy.
I've added FRAME_UNWINDING to mark that an exception is being unwound in the frame, but that the frame has not completed yet.

@@ -4,6 +4,16 @@
# error "this header file must not be included directly"
#endif

/* These values are chosen so that all tests involve comparing to zero. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"all" tests?

I think this means ">0 implies finished; <=0 implies not-yet-finished", but I'm inferring that from the enum values and not the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, making the specification explicit would be nice.

PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
PyObject *f_localsplus[1]; /* locals+stack, dynamically sized */
};

static inline int _PyFrameIsRunnable(struct _frame *f) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PyFrame_IsRunnable?

Or maybe _PyFrame_IS_RUNNABLE?

Copy link
Contributor

@aeros aeros Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to _PyFrame_IsRunnable. I think a similar format should be used with the other functions as well, that naming style fits better with existing members.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PyFrame_IsRunnable seem more in keeping with the rest of the code base. So I've gone with that.

@aeros aeros requested a review from njsmith July 10, 2020 02:25
if (f && f->f_stacktop) {
if (f) {
Copy link
Contributor

@aeros aeros Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to replace the && f->f_stacktop with && f->f_stackdepth >=0 or at least an assert(f->f_stackdepth >= 0) on line 329?

Admittedly, I'm a bit out of my depth when it come to the low-level generator details, but at a glance it seems to me like we would not want to start the yield from if the generator's frame has a stack depth of -1 (or lower, somehow?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, the stack depth can never be negative. There's an assertion on line 344 that it is positive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I saw that assertion on line 344. If it's realistically not going to be negative, we probably only need the assertion prior to modifying the stack depth (particularly for decrements).

Comment on lines 448 to 460
/* Pop subiterator from stack */
ret = *(--gen->gi_frame->f_stacktop);
gen->gi_frame->f_stackdepth--;
Copy link
Contributor

@aeros aeros Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have an assert(gen->gi_frame->f_stackdepth >= 0) sanity check here after decrementing the stack depth? I don't know that it's necessary here, but since it gets removed with -O anyways it won't incur a real cost.

Copy link
Contributor

@aeros aeros Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: assert(gen->gi_frame->f_stackdepth > 0); was added before decrement in a22412c.

return f->f_state == FRAME_EXECUTING;
}

static inline int _PyFrameHasCompleted(struct _frame *f) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just something to be vary about here...there was recently https://bugs.python.org/issue39542#msg372962 where a static inline function failed to inline on mac, albeit it was through a more complicated call chain.

Not sure if you have the capability to test this but it'd be nice if someone can benchmark this on mac since this is such a critical code path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should trust the big three compilers to do their jobs without constantly verifying it.
Clang will inline this very simple function, probably even if we don't tell it to.

@@ -4,6 +4,16 @@
# error "this header file must not be included directly"
#endif

/* These values are chosen so that all tests involve comparing to zero. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, making the specification explicit would be nice.

gen_getrunning(PyGenObject *gen, void *Py_UNUSED(ignored))
{
if (gen->gi_frame == NULL) {
Py_INCREF(Py_False);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Py_RETURN_FALSE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

cr_getrunning(PyCoroObject *coro, void *Py_UNUSED(ignored))
{
if (coro->cr_frame == NULL) {
Py_INCREF(Py_False);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Py_RETURN_FALSE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@markshannon markshannon force-pushed the merge-frame-gen-state branch from 593d935 to 38792f2 Compare July 14, 2020 16:14
@markshannon
Copy link
Member Author

Thanks for the reviews. I think I've addressed all the comments.

ret = _gen_throw((PyGenObject *)yf, close_on_genexit,
typ, val, tb);
tstate->frame = f;
gen->gi_running = 0;
gen->gi_frame->f_state = state;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: for symmetry, I think the new modified line would be better before the tstate->frame = f; line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@markshannon markshannon merged commit cb9879b into python:master Jul 17, 2020
@markshannon markshannon deleted the merge-frame-gen-state branch July 17, 2020 10:44
arun-mani-j pushed a commit to arun-mani-j/cpython that referenced this pull request Jul 21, 2020
…or objects into a single value. (pythonGH-20803)

* Merge gen and frame state variables into one.

* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Aug 4, 2020
…or objects into a single value. (pythonGH-20803)

* Merge gen and frame state variables into one.

* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Aug 20, 2020
…or objects into a single value. (pythonGH-20803)

* Merge gen and frame state variables into one.

* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
…or objects into a single value. (pythonGH-20803)

* Merge gen and frame state variables into one.

* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants