-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-117376: Make Py_DECREF
a macro in ceval.c in free-threaded build
#122975
Conversation
… build `Py_DECREF` and `PyStackRef_CLOSE` are now implemented as macros in the free-threaded build in ceval.c. There are two motivations; * MSVC has problems inlining functions in ceval.c in the PGO build. * We will want to mark escaping calls in order to spill the stack pointer in ceval.c and we will want to do this around `_Py_Dealloc` (or `_Py_MergeZeroLocalRefcount` or `_Py_DecRefShared`), not around the entire `Py_DECREF` or `PyStackRef_CLOSE` call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Can this PR forceinline EDIT: This may not be significant before restoring gh-121338 or applying |
I'm going to run this on the benchmarking infrastructure to confirm it helps. (This will take about 2 hours to get results, and I'll report back).
I agree we should fix this, once we confirm it still matters after #121338 was applied. I think that can be applied as a separate issue from this one, though, and doesn't have to hold this one up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some comment on using PEP 7, although it's not mandatory for old code to follow it but since you're anyway editing things... Line continuations being aligned could be ignored, but having macro parameters names in upper case is usually helpful when you read the macro definition.
Include/internal/pycore_stackref.h
Outdated
# define PyStackRef_CLOSE(stackref) \ | ||
do { \ | ||
_PyStackRef _close_tmp = (stackref); \ | ||
if (!PyStackRef_IsDeferred(_close_tmp)) { \ | ||
Py_DECREF(PyStackRef_AsPyObjectBorrow(_close_tmp)); \ | ||
} \ | ||
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 7 requires to:
- use uppercase letters for macro parameters,
- align line continuations
\
The results are here. tl;dr: This definitely makes things moderately faster (around 1%) on the pyperformance suite on free-threaded builds. So +1 from me. Note there are no mac results because our machine is currently down, and no Windows results because PGO is broken for free-threaded builds on Windows. |
The `DEALLOC` argument isn't currently used. _Py_DECREF_SPECIALIZED is the same as `Py_DECREF` in the free-threaded build. This might change in the future, but keep things simpler for now.
Py_DECREF
andPyStackRef_CLOSE
are now implemented as macros in the free-threaded build in ceval.c. There are two motivations;MSVC has problems inlining functions in ceval.c in the PGO build.
We will want to mark escaping calls in order to spill the stack pointer in ceval.c and we will want to do this around
_Py_Dealloc
(or_Py_MergeZeroLocalRefcount
or_Py_DecRefShared
), not around the entirePy_DECREF
orPyStackRef_CLOSE
call.