PEP 669: for loop fires LINE multiple times for body, but only one for `for` #104239

nedbat · 2023-05-06T16:34:00Z

I'm experimenting with the new monitoring interface in PEP 669. The LINE event happens multiple times for the line in the body of a for loop (as I expect), but the for statement itself only gets an event on entering, not each time around the loop.

This code:

import sys

def loop3():
    for i in range(3):
        print(i)
    print("done")

the_code = loop3.__code__

def line_callback(code, line_number):
    assert code == the_code
    print(f"LINE: {line_number}")

my_id = sys.monitoring.COVERAGE_ID
sys.monitoring.use_tool_id(my_id, "repro")
sys.monitoring.register_callback(my_id, sys.monitoring.events.LINE, line_callback)
sys.monitoring.set_local_events(my_id, the_code, sys.monitoring.events.LINE)

loop3()

produces:

LINE: 4
LINE: 5
0
LINE: 5
1
LINE: 5
2
LINE: 6
done

I would expect Line 4 to be monitored after each "LINE: 5", including after the last one.

This is with commit 263abd3 of CPython.

The text was updated successfully, but these errors were encountered:

terryjreedy · 2023-05-06T17:27:20Z

Perhaps because iteration jumps into the middle of the bytecode for the header line, and line events only trigger on the first bytecode of the line. To me, "LINE: An instruction is about to be executed that has a different line number from the preceding instruction." can reasonably be interpreted the way you expected. But I am not sure what 'instruction' means in this context. "bytecode" is specific to CPython and the feature is not meant to be so limited. We need a draft code addition in order to ask such questions.

Anyway, I am glad to see a specific example of monitoring in use.

nedbat · 2023-05-06T20:05:50Z

The disassembly looks like this:

Disassembly of <code object loop3 at 0x1011f7cc0, file "loop3.py", line 3>:
  3           0 RESUME                   0

  4           2 LOAD_GLOBAL              1 (NULL + range)
             12 LOAD_CONST               1 (3)
             14 CALL                     1
             22 GET_ITER
        >>   24 FOR_ITER                13 (to 54)
             28 STORE_FAST               0 (i)

  5          30 LOAD_GLOBAL              3 (NULL + print)
             40 LOAD_FAST                0 (i)
             42 CALL                     1
             50 POP_TOP
             52 JUMP_BACKWARD           15 (to 24)

  4     >>   54 END_FOR

  6          56 LOAD_GLOBAL              3 (NULL + print)
             66 LOAD_CONST               2 ('done')
             68 CALL                     1
             76 POP_TOP
             78 RETURN_CONST             0 (None)

Line 5 ends with a JUMP_BACKWARD to 24, which is in line 4, so shouldn't it call the line callback again?

gaogaotiantian · 2023-05-07T06:32:30Z

Line 5 ends with a JUMP_BACKWARD to 24, which is in line 4, so shouldn't it call the line callback again?

I believe only the first instruction on each line is instrumented as INSTRUMENTED_LINE, which would trigger a LINE event. All the others will be INSTRUMENTED_INSTRUCTION so no LINE event will be triggered.

You can use dis.dis(loop3, adaptive=True) to confirm (of course after instrumenting the code).

I would guess @markshannon expected the tools to handle this with BRANCH event. We can probably produce a LINE event in instrumented branches? Not sure if that's the way to go, might have other side effects(besides the obvious performance hit).

nedbat · 2023-05-07T10:02:31Z

I believe only the first instruction on each line is instrumented as INSTRUMENTED_LINE, which would trigger a LINE event. All the others will be INSTRUMENTED_INSTRUCTION so no LINE event will be triggered.

I don't understand: why do I get three events for line 5?

terryjreedy · 2023-05-07T13:09:42Z

By the hypothesis of me and Tian, because the first bytecode of line 5, the load global of 'print', (and as Tian added, INSTRUMENTED_LINE) is executed 3 times.

nedbat · 2023-05-07T13:14:58Z

By the hypothesis of me and Tian, because the first bytecode of line 5, the load global of 'print', (and as Tian added, INSTRUMENTED_LINE) is executed 3 times.

If that is correct, and the behavior I'm seeing is expected, then the PEP needs to be updated:

LINE: An instruction is about to be executed that has a different line number from the preceding instruction.

markshannon · 2023-05-09T13:20:35Z

LINE: An instruction is about to be executed that has a different line number from the preceding instruction.

"Preceding" here means the instruction immediately before it in the code object, not the instruction previously executed.

The LOAD_GLOBAL bytecode on line 5 emits a "line" event and the loop iterates 3 times; so three "line 5" events.

Because the jump at the end of line 5 jumps into the middle of line 4, there is only one "line 4" event.

nedbat · 2023-05-09T15:44:00Z

I thought that debuggers would use the LINE event to step by lines, but maybe I misunderstood. If they did, stepping wouldn't behave properly, because I'd expect a debugger to visit line 4 a number of times.

markshannon · 2023-05-09T16:31:37Z

Maybe we could trigger LINE events for any instruction that has any predecessor with a different line number.
It would complicate the instrumentation code, as we would need to mark jump targets.

We also need to maintain the behavior of sys.settrace, which limits what we can do a bit, but that code needs to handle extra line events anyway.

@fabioz any thoughts on this?

terryjreedy · 2023-05-10T00:15:39Z

IDLE's bdb-based debugger on 3.12.0a7 stops 4 times on line 4 (3 + 1 for Stopping) and 3 times for line 5. I presume stepping in pdb, also based on bdb, does the same. A new debugger should continue stopping on the header line when stepping. This is even more important for while loops, where one might want to see local values to debug the loop condition.

gaogaotiantian · 2023-05-10T02:36:24Z

I agree that we should stop at the for and while line. We only need to change the branch target instruction to INSTRUCTION_LINE and handle the extra LINE event right?

fabioz · 2023-05-10T08:41:51Z

Maybe we could trigger LINE events for any instruction that has any predecessor with a different line number. It would complicate the instrumentation code, as we would need to mark jump targets.

We also need to maintain the behavior of sys.settrace, which limits what we can do a bit, but that code needs to handle extra line events anyway.

@fabioz any thoughts on this?

I think that whenever the line changes (regardless of being marked as the entry point of the line or not) the line event should be triggered (which I guess is mostly what happens with the current sys.settrace, where I guess the logic worked reasonably).

The best place for that logic is probably inside the new monitoring mechanism, otherwise all clients will probably end up having to do some workarounds which would probably be more costly than what can be done inside of the new monitoring code (I'm not even sure what'd be the workaround for clients at this point, maybe doing additional bookeping using both the LINE as well as the JUMP and BRANCH and then deciding whether it changed -- but it seems more costly doing it from the outside... doing it with INSTRUCTION with that logic would probably be even more costly).

markshannon · 2023-05-10T09:26:47Z

My original thinking was that this would be too slow.
But, as you point out, it is the overall cost (VM plus tool) that really matters, as long as the cost is zero when turned off.

So here's a plan (for LINE events, nothing else changes)

Instrument all instructions where a possible predecessor has a different line number, which includes all exception handlers
Check dynamically for a new line in the INSTRUMENTED_LINE instruction
Simplify legacy tracing support, as it only needs to handle backwards jumps within lines, everything else is handled by the INSTRUMENTED_LINE instruction

This should also fix #103471

nedbat added the type-bug An unexpected behavior, bug, or error label May 6, 2023

nedbat assigned markshannon May 6, 2023

markshannon mentioned this issue May 9, 2023

On current main (after applying PEP 669) the tracing behavior changed and can notify about the same line more than once. #103471

Closed

markshannon mentioned this issue May 11, 2023

GH-103082: Filter LINE events in VM, to simplify tool implementation. #104387

Merged

markshannon closed this as completed in #104387 May 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PEP 669: for loop fires LINE multiple times for body, but only one for `for` #104239

PEP 669: for loop fires LINE multiple times for body, but only one for `for` #104239

nedbat commented May 6, 2023 •

edited

Loading

terryjreedy commented May 6, 2023 •

edited

Loading

nedbat commented May 6, 2023 •

edited by AlexWaygood

Loading

gaogaotiantian commented May 7, 2023

nedbat commented May 7, 2023

terryjreedy commented May 7, 2023

nedbat commented May 7, 2023

markshannon commented May 9, 2023

nedbat commented May 9, 2023

markshannon commented May 9, 2023

terryjreedy commented May 10, 2023

gaogaotiantian commented May 10, 2023

fabioz commented May 10, 2023

markshannon commented May 10, 2023

PEP 669: for loop fires LINE multiple times for body, but only one for for #104239

PEP 669: for loop fires LINE multiple times for body, but only one for for #104239

Comments

nedbat commented May 6, 2023 • edited Loading

terryjreedy commented May 6, 2023 • edited Loading

nedbat commented May 6, 2023 • edited by AlexWaygood Loading

gaogaotiantian commented May 7, 2023

nedbat commented May 7, 2023

terryjreedy commented May 7, 2023

nedbat commented May 7, 2023

markshannon commented May 9, 2023

nedbat commented May 9, 2023

markshannon commented May 9, 2023

terryjreedy commented May 10, 2023

gaogaotiantian commented May 10, 2023

fabioz commented May 10, 2023

markshannon commented May 10, 2023

PEP 669: for loop fires LINE multiple times for body, but only one for `for` #104239

PEP 669: for loop fires LINE multiple times for body, but only one for `for` #104239

nedbat commented May 6, 2023 •

edited

Loading

terryjreedy commented May 6, 2023 •

edited

Loading

nedbat commented May 6, 2023 •

edited by AlexWaygood

Loading