Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comprehensions iterating over locals() may break under some trace funcs in 3.12 #105256

Closed
carljm opened this issue Jun 2, 2023 · 4 comments
Closed
Assignees
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes

Comments

@carljm
Copy link
Member

carljm commented Jun 2, 2023

It is already the case in all recent Python versions that the lazy-update behavior of the dictionary returned from locals() causes weird frame-introspection-sensitive semantics around lazy iteration of locals(). E.g. this code will work fine:

def f(a, b):
    ret = {}
    for k, v in locals().items():
        ret[k] = v
    return ret

f(1, 2)

But run this under a tracing function that simply accesses frame.f_locals:

import sys

def tracefunc(frame, *a, **kw):
    frame.f_locals
    return tracefunc

sys.settrace(tracefunc)

f(1, 2)

And suddenly you instead get RuntimeError: dictionary changed size during iteration.

The reason is because accessing frame.f_locals triggers a lazy update of the cached locals dictionary on the frame from the fast locals array. And this is the same dictionary returned by locals() (it doesn't create a copy), and so the tracing function has the side effect of adding the keys k and v to the locals dictionary, while it is still being lazily iterated over. Without the access of frame.f_locals, nothing triggers this locals-dict lazy update after the initial call to locals(), so the iteration is fine.

This is a pre-existing problem with the semantics of locals() generally, and there are proposals to fix it, e.g. PEP 667.

What is new in Python 3.12 is that PEP 709 means comprehensions can newly be exposed to this same issue. Consider this version of the above function:

def f(a, b):
    return {k: v for k, v in locals.items()}

Under PEP 709, k and v are now part of the locals of f (albeit isolated and only existing during the execution of the comprehension), which means this version of f is now subject to the same issue as the previous for-loop version, if run under a tracing func that accesses f_locals.

Linked PRs

@carljm
Copy link
Member Author

carljm commented Jun 2, 2023

I recently discovered code of this form in numpy. The fix is to eagerly iterate locals() and extract the data you need, and avoid running any additional code during iteration of locals():

def f(a, b):
    return {k: v for k, v in list(locals.items())}

Given the inherently unstable nature of the locals() dictionary under current locals() semantics, I think this approach is generally advisable when iterating over locals(). I already submitted this fix to numpy and it was merged: numpy/numpy#23855

@carljm
Copy link
Member Author

carljm commented Jun 2, 2023

My current plan to address this is to a) document it in the What's New for Python 3.12, and b) run a search over popular PyPI projects to look for comprehensions that iterate over locals() and submit the same fix to them.

I think longer-term the right fix for this issue is something like PEP 667, so that locals() has more predictable behavior.

@carljm carljm self-assigned this Jun 2, 2023
@gaogaotiantian
Copy link
Member

Just to add that locals() has been a painful issue for pdb for a long time. pdb had to create a cache for f_locals to make most of the code execution work. However, any read attempt to the locals() would clear the code execution result (the variables are temporarily stored in locals() dict).

We have fixed a bug where ll will clear the variable changes in #101673, we have an open issue where up and down will clear the variable changes #102864 which is difficult to fix because the code is in bdb. We have to work around the locals() issue - but the user can always execute locals() to check all the local variables and clear the variable changes.

This problem has been there since at least 3.8(that's what I checked, I assume it would be even earlier). I strongly agree that we should do something about locals() to make it work more reliably.

@carljm
Copy link
Member Author

carljm commented Jul 3, 2023

I used GitHub code search to search for the regex for\s*\w+\s*in\s+locals\(\)[\]}][^@], and found only two occurrences of a comprehension iterating over locals(): https://github.com/sanjar-notes/python3_notes/blob/7f0da432903f1c937816769996c47af520215ed0/home/4_resource_itineraries/1_Python_Masterclass_Tim_Buchalka/2._Learning_Python/A._Basic_-_no_OOP/7._Functions_and_Modules/4._Nested_class_and_functions/spam.py#L7 and https://github.com/Xiwei-Wang/cs225sp20_env/blob/d38c48b72580ba7fa172f0cc7e34b3157c13a515/demo.py#L50. Neither is in a reusable library; both look like toy scripts that are unlikely to be run under tracing. So I'm going ahead with an update to What's New for this.

carljm added a commit to carljm/cpython that referenced this issue Jul 3, 2023
carljm added a commit to carljm/cpython that referenced this issue Jul 3, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 6, 2023
…honGH-106378)

(cherry picked from commit 13aefd1)

Co-authored-by: Carl Meyer <carl@oddbird.net>
carljm added a commit that referenced this issue Jul 6, 2023
…-106378) (#106471)

gh-105256: What's New note for comprehension over locals() (GH-106378)
(cherry picked from commit 13aefd1)

Co-authored-by: Carl Meyer <carl@oddbird.net>
@carljm carljm closed this as completed Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes
Projects
None yet
Development

No branches or pull requests

3 participants