-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stash pybind11 data structures in interpreter state dictionary #4293
Conversation
Looks like support for older Python versions is problematic here. The CPython API used in nanobind starts being available in 3.8.. |
I'd be happy to drop 3.6 the second it's a problem, but 3.7 is not EoL yet. But could we just enable it only on 3.8+? ABI compatibility between Python versions doesn't matter. Pretty sure we plan to do an ABI bump, maybe as soon as the next minor (major?) release. |
After turning this over in my mind for a while: Background:
Suggested migration plan, minimizing disruptions:
|
A bit of bad news: It seems that It causes two related tests in the C++/Catch-based test suite to fail.
This is a bit more than I wanted to bite off :-/. I had I opened this PR hoping to quickly port a change from nanobind (which doesn't support either of these more advanced use cases.) |
So, it turns out I just did not look carefully enough. The failing C++ tests were checking something rather trivial -- that the internals capsule can be found in In response to Ralf's comments above, I think that this PR will have no negative consequences on other projects, if it is merged along with the next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In response to Ralf's comments above, I think that this PR will have no negative consequences on other projects, if it is merged along with the next internals version bump.
Yes, I agree, it's a choice, connected to "do we still support Python 3.7 or not?" I was on the conservative side in my previous comment, but today maybe not :-) IOW either way is fine with me. The only thing I wouldn't want to do is bump the internals
version just because of this PR, but it sounds like we agree on that, too.
include/pybind11/detail/internals.h
Outdated
|
||
dict state_dict; | ||
#if PY_VERSION_HEX < 0x03080000 || defined(PYPY_VERSION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rwgk Should we ping PyPy to see if we can this fixed in the next release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. Might be a bit early right now. After this PR is merged might be a better timing for asking. Which brings me to another thought:
It would actually be great to merge this PR asap, but with PY_VERSION_HEX < 0x030C0000
, so that we get Python 3.12 on the right track even before day 1, never will have to worry about related backwards compatibility questions for anything >= 3.12, and pre-release testing with 3.12 will never run into issues around using builtins
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does pybind11 actually compile with Python 3.12? My guess would be some internal details will need to change just like with 3.11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: it compiles with cpython/main
, though there are apparently some new failures in the test suite.
RROR tests/test_custom_type_setup.py::test_self_cycle - pytest.PytestUnraisableExceptionWarning: Exception ignor...
ERROR tests/test_custom_type_setup.py::test_indirect_cycle - pytest.PytestUnraisableExceptionWarning: Exception i...
FAILED tests/test_call_policies.py::test_alive_gc - pytest.PytestUnraisableExceptionWarning: Exception ignored in...
FAILED tests/test_methods_and_attributes.py::test_dynamic_attributes - pytest.PytestUnraisableExceptionWarning: E...
FAILED tests/test_methods_and_attributes.py::test_cyclic_gc - pytest.PytestUnraisableExceptionWarning: Exception ...
FAILED tests/test_multiple_inheritance.py::test_mi_dynamic_attributes - pytest.PytestUnraisableExceptionWarning: ...
FAILED tests/test_pickling.py::test_roundtrip_with_dict[PickleableWithDict] - pytest.PytestUnraisableExceptionWar...
FAILED tests/test_pickling.py::test_roundtrip_with_dict[PickleableWithDictNew] - pytest.PytestUnraisableException...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say that wide-scale deployment of 3.12 is likely so far off that an ABI bump in pybind11 is perhaps the closer thing to aim for with regards to this PR.
That's a quite strange error for the PyPy 3.7 CI run. Did anybody see something like this before? Not sure how it could be related to this PR (which in principle does not change the behavior of PyPy builds). |
To follow up on @rwgk's comment: the implementation of this PR leaves Python 3.7 with the previous approach to stash the capsule in the builtins dict. In this way, it is no longer necessary to drop the older Python version. Using different implementations in Python 3.7 vs 3.8+ should also be fine -- they don't need to share a compatible ABI because modules linked with each flavor won't talk to each other. |
I've never seen that error before. Very very strange. |
I clicked the rerun button for Win PyPy 3.7 after downloading the current log archive. Two goals:
|
@@ -401,9 +401,30 @@ inline void translate_local_exception(std::exception_ptr p) { | |||
} | |||
#endif | |||
|
|||
inline object get_internals_state_dict() { | |||
object state_dict; | |||
#if PY_VERSION_HEX < 0x03080000 || defined(PYPY_VERSION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out it is extremely easy to invite the Python 3.12 core developers to test with pybind11, with a 1 or 2 line change here, depending on how you count. Diff below.
I tested that diff with a local installation of Python 3.12.0a1. Log also below.
diff --git a/include/pybind11/detail/internals.h b/include/pybind11/detail/internals.h
index 03c10506..c8ac22e7 100644
--- a/include/pybind11/detail/internals.h
+++ b/include/pybind11/detail/internals.h
@@ -403,7 +403,8 @@ inline void translate_local_exception(std::exception_ptr p) {
inline object get_internals_state_dict() {
object state_dict;
-#if PY_VERSION_HEX < 0x03080000 || defined(PYPY_VERSION)
+#if (PYBIND11_INTERNALS_VERSION <= 4 && PY_VERSION_HEX < 0x030C0000) \
+ || PY_VERSION_HEX < 0x03080000 || defined(PYPY_VERSION)
state_dict = reinterpret_borrow<object>(PyEval_GetBuiltins());
#else
# if PY_VERSION_HEX < 0x03090000
+ /usr/local/google/home/rwgk/usr_local_like/Python-3.12.0a1/bin/python3 /usr/local/google/home/rwgk/clone/pybind11_scons/run_tests.py ../pybind11
Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests/test_embed":
===============================================================================
All tests passed (1554 assertions in 12 test cases)
Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests":
=========================================================== test session starts ============================================================
platform linux -- Python 3.12.0a1, pytest-7.2.0, pluggy-1.0.0
C++ Info: Debian Clang 14.0.6 C++17 __pybind11_internals_v4_clang_libstdcpp_cxxabi1002__
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, configfile: pytest.ini
collected 698 items
test_async.py .. [ 0%]
test_buffers.py ......... [ 1%]
test_builtin_casters.py .................... [ 4%]
test_call_policies.py ........ [ 5%]
test_callbacks.py ............ [ 7%]
test_chrono.py ........................................... [ 13%]
test_class.py ............................... [ 17%]
test_const_name.py ...................... [ 21%]
test_constants_and_functions.py ..... [ 21%]
test_copy_move.py ........ [ 22%]
test_custom_type_casters.py ... [ 23%]
test_custom_type_setup.py .. [ 23%]
test_docstring_options.py . [ 23%]
test_eigen_matrix.py .....................ss....... [ 28%]
test_eigen_tensor.py ............................................................................................................. [ 43%]
test_enum.py ......... [ 44%]
test_eval.py .... [ 45%]
test_exceptions.py ...................... [ 48%]
test_factory_constructors.py ............... [ 50%]
test_gil_scoped.py ..... [ 51%]
test_iostream.py ...................... [ 54%]
test_kwargs_and_defaults.py ........ [ 55%]
test_local_bindings.py .......... [ 57%]
test_methods_and_attributes.py ...................... [ 60%]
test_modules.py ....... [ 61%]
test_multiple_inheritance.py .................. [ 64%]
test_numpy_array.py ......................................................... [ 72%]
test_numpy_dtypes.py ............... [ 74%]
test_numpy_vectorize.py ........ [ 75%]
test_opaque_types.py ... [ 75%]
test_operator_overloading.py ..... [ 76%]
test_pickling.py ........ [ 77%]
test_pytypes.py .................................................................................. [ 89%]
test_sequences_and_iterators.py .............. [ 91%]
test_smart_ptr.py ............. [ 93%]
test_stl.py .........s............. [ 96%]
test_stl_binders.py ......... [ 97%]
test_tagbased_polymorphic.py . [ 98%]
test_thread.py .. [ 98%]
test_union.py . [ 98%]
test_virtual_functions.py .......... [100%]
========================================================= short test summary info ==========================================================
SKIPPED [1] test_eigen_matrix.py:718: could not import 'scipy': No module named 'scipy'
SKIPPED [1] test_eigen_matrix.py:728: could not import 'scipy': No module named 'scipy'
SKIPPED [1] test_stl.py:143: no <experimental/optional>
===================================================== 695 passed, 3 skipped in 10.04s ======================================================
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also triggered CI testing under #4307, after git rebase master.
No, that error did NOT reproduce, but another flake slipped in, that I believe I've seen several times before, although it is not the most common one. Trying a 3rd time.
It turns out it works differently:
The new log archive only has the new log, not any of the old ones, although the directory structure is still there, but with lots of empty subdirectories. Certain files for jobs other than the one that was rerun have new timestamps. Not easy to work with, unfortunately. I just see: the 3rd attempt was successful! |
Modifications are: * Backward compatibility (no ABI break), as originally under PR pybind#4307. * Naming: `get_python_state_dict()`, `has_pybind11_internals_capsule()` * Report error retrieving `internals**` from capsule instead of clearing it. Locally tested with ASAN, MSAN, TSAN, UBSAN (Google-internal toolchain).
// name. We clear the error status below in that case | ||
internals_pp = static_cast<internals **>(PyCapsule_GetPointer(o.ptr(), id_cstr)); | ||
if (!internals_pp) { | ||
PyErr_Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR (with the #4307 tweak) is now baked into PR #4329:
Copying the commit message here (I think the link will stop working in case I have to rebase):
Modified version of PR #4293 by @wjakob
Modifications are:
* Backward compatibility (no ABI break), as originally under PR #4307.
* Naming: `get_python_state_dict()`, `has_pybind11_internals_capsule()`
* Report error retrieving `internals**` from capsule instead of clearing it.
Locally tested with ASAN, MSAN, TSAN, UBSAN (Google-internal toolchain).
My commit changes the code here, to report the error rather than suppressing it.
What is the rationale for suppressing it?
object o = state_dict[id]; | ||
// May fail if 'capsule_obj' is not a capsule, or if it has a different | ||
// name. We clear the error status below in that case | ||
internals_pp = static_cast<internals **>(PyCapsule_GetPointer(o.ptr(), id_cstr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the raw CAPI here? Any particular reason we are changing it from pytype.h API it was before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my current version:
void *raw_ptr = PyCapsule_GetPointer(state_dict[id].ptr(), id_cstr);
if (raw_ptr == nullptr) {
raise_from(
PyExc_SystemError,
"pybind11::detail::get_internals(): Retrieve internals** from capsule FAILED");
}
internals_pp = static_cast<internals **>(raw_ptr);
The nice thing is that PyCapsule_GetPointer()
does everything "just right" in one simple line. Additionally, in this particular situation I'd definitely want to avoid the throw PYBIND11_OBJECT_CHECK_FAILED()
that comes with the capsule
constructor.
I don't have a good idea for using capsule
in an elegant way here TBH.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking around more (work under PR #4329) I noticed this code in embed.h (finalize_interpreter()
):
handle builtins(PyEval_GetBuiltins());
...
if (builtins.contains(id) && isinstance<capsule>(builtins[id])) {
internals_ptr_ptr = capsule(builtins[id]);
}
Two things learned:
- That needs to be changed, too, to inspect the correct dict.
- The implementation is a bit on the high-level overkill side, scraping by an
if (...) throw
, but it is elegant!
Closed in favor of #4570 |
Following the discussion in wjakob/nanobind#96, pybind11 should similarly stop stashing its internal data structures in the global
builtins
dictionary.It would be nice to include this patch with an ABI bump for an upcoming release.
Suggested changelog entry: