-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-117139: A StackRef Debugging Mode #121134
base: main
Are you sure you want to change the base?
Conversation
I don't think you need the Something like this: THE_MAP: dict[uintptr_t, object] = {}
THE_COUNTER = 0
class StackRef:
bits: uintptr_t
def __init__(self, obj):
THE_COUNTER += 1
self.bits = THE_COUNTER
THE_MAP[self.bits] = obj
def close(self):
del THE_MAP[self.bits]
def dup(ref: StackRef) -> StackRef:
obj = THE_MAP[ref.bits]
return StackRef(obj)
def steal_obj(ref: StackRef) -> object:
return THE_MAP.pop(ref.bits)
def steal_ref(ref: StackRef) -> StackRef:
return StackRef.from_obj_steal(self.steal_obj())
@classmethod
def from_obj_steal(obj: object) -> StackRef:
res = StackRef(obj)
Py_DECREF(object)
return res
@classmethod
def from_obj_new(obj: object) -> StackRef:
return StackRef(obj) The hashtable in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much extra effort would it be do this for the default build?
Can you try this on the full set of tests, just to see if it would be acceptable on CI?
@@ -551,6 +564,7 @@ jobs: | |||
- check_generated_files | |||
- build_macos | |||
- build_macos_free_threading | |||
- build_macos_free_threading_with_stackref_debug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do this for normal builds as well?
The linux machines are cheapest, so we should probably use those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't because normal builds require all the inline functions to be macros #121263. And these are too complicated to fit into a single macro.
--prefix=/opt/python-dev \ | ||
--with-openssl="$(brew --prefix openssl@3.0)" | ||
- name: Build CPython | ||
run: make -j8 | ||
- name: Display build info | ||
run: make pythoninfo | ||
- name: Tests | ||
run: make test | ||
# Stackref debug is 3.5x slower than normal CPython, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using the set of tests for profiling? It is about 40 tests instead of the usual ~400, so would give decent coverage without being too slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry how do I run those specific tests? What command with python -m test
?
I did. It fails
mainly because they count allocations/test out of memory situations, and the hashtable adds random allocations which breaks their careful calculations. On my computer on free threaded builds with the stackref debugging mode, it takes 5 minutes to run the whole test suite. macOS is the fastest/"cheapest" runner we have right now because Cirrus Labs is sponsoring bare-metal runners IIRC. Without stackref debug mode, it takes 2 min 7 s. So roughly a 2.5x slowdown. Factoring that in, the fastest Cirruslabs runners takes 6 min to complete on this PR. So it should take 15 minutes to run the whole test suite. We would need to skip those tests above too, as those are known failures. |
That's too bad. I wonder what it would take to fix those tests to be less fragile (tightly coupled to fine details)? I've opened gh-121978 so we can consider that separately. |
This enforces HPy-like semantics in CPython's eval loop. It's checked at runtime with a new handle table. This catches errors like double decref, introducing dead references to the stack, etc.
See https://github.com/faster-cpython/ideas/blob/2beefd4da2bf956c837e8fb7097b7a43c53f5735/3.14/stackref_semantics.md
for more info.
I set up a single runner that just runs
python -m test test_asyncio
as this mode is 2.5x slower than normal CPython. That is usually enough to catch almost all bugs. Besides, this mode is also not fully compatible with all our tests that introspect CPython.This PR depends on #121127 being merged first.