gh-94673: More Per-Interpreter Fields for Builtin Static Types #103912

@ericsnowcurrently ericsnowcurrently commented Apr 27, 2023

This involves moving tp_dict, tp_bases, and tp_mro to PyInterpreterState, in the same way we did for tp_subclasses. Those three fields are effectively const for builtin static types (unlike tp_subclasses). In theory we only need to make their values immortal, along with their contents. However, that isn't such a simple proposition. (See gh-103823.) In the meantime the simplest solution is to move the fields into the interpreter.

One alternative is to statically allocate the values, but that's its own can of worms.
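
To make the description above concrete, here is a toy model of "move the field into the interpreter". It is only a sketch: `ToyType`, `ToyInterpreter`, `lookup_bases`, and the table size are hypothetical names invented for illustration and are not CPython's actual structures or API. The point is that the field on a static builtin type stays null, and the real value is found through per-interpreter state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a type object (NOT CPython's PyTypeObject).
struct ToyType {
    std::string name;
    bool is_static_builtin;                       // models a "static builtin" flag
    std::size_t static_index;                     // slot in the per-interpreter table
    const std::vector<const ToyType*>* bases;     // models tp_bases; null for static builtins
};

// Toy stand-in for per-interpreter state (NOT PyInterpreterState).
struct ToyInterpreter {
    static constexpr std::size_t kMaxStaticBuiltins = 8;  // assumed size
    // One bases list per static builtin type, owned by this interpreter.
    std::array<std::vector<const ToyType*>, kMaxStaticBuiltins> builtin_bases{};
};

// Accessor: static builtin types are looked up through the interpreter,
// every other type still uses the field stored on the type itself.
inline const std::vector<const ToyType*>* lookup_bases(const ToyInterpreter& interp,
                                                       const ToyType& type) {
    if (type.is_static_builtin) {
        return &interp.builtin_bases[type.static_index];
    }
    return type.bases;
}
```

In this model, code that reads `type.bases` directly sees null for a static builtin type, while code that goes through `lookup_bases` gets a valid (possibly empty) list that is distinct for each interpreter.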

@ericsnowcurrently ericsnowcurrently force-pushed the per-interpreter-static-types-fields branch from 7ccc2e4 to 9937406 Compare May 2, 2023 02:56
@ericsnowcurrently ericsnowcurrently force-pushed the per-interpreter-static-types-fields branch from 9937406 to 071ef3f Compare May 2, 2023 03:19
@ericsnowcurrently ericsnowcurrently added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 2, 2023
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ericsnowcurrently for commit 071ef3f 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 2, 2023
@ericsnowcurrently ericsnowcurrently added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label May 2, 2023
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ericsnowcurrently for commit cd1dd10 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label May 2, 2023
@ericsnowcurrently ericsnowcurrently force-pushed the per-interpreter-static-types-fields branch from 3c4000c to 2771f4e Compare May 3, 2023 02:56
@rwgk

rwgk commented May 20, 2023

git bisect got me here.

I'm testing Python 3.12 with pybind11 (master @ https://github.com/pybind/pybind11/tree/d72ffb448c58b4ffb08b5ad629bc788646e2d59e).

# first bad commit: [de64e7561680fdc5358001e9488091e75d4174a3] gh-94673: More Per-Interpreter Fields for Builtin Static Types (gh-103912)

I double-checked that this is true:

The gdb backtrace is below. #0 & #1 are pointing here:

Does this ring any bells? Is there something obvious that we need to do differently in pybind11?

Steps to reproduce are involved, roughly:

  • gcc (Debian 12.2.0-14) 12.2.0
  • install cpython from scratch
  • pip install setuptools
  • pytest installation from git main branch: python3 setup.py install
  • building pybind11 unit tests (succeeds)
  • pytest then crashes with a segfault at startup

I can send more details or try to reduce as needed. Please let me know.

#0  0x00007ffff592a646 in Py_TYPE (ob=0x0) at /usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/object.h:204
#1  0x00007ffff5930245 in pybind11::detail::iterator_policies::sequence_fast_readonly::sequence_fast_readonly (this=0x7fffffffa8d8, 
    obj=..., n=0) at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1168
#2  0x00007ffff5940c9b in pybind11::detail::generic_iterator<pybind11::detail::iterator_policies::sequence_fast_readonly>::generic_iterator
    (this=0x7fffffffa8d8, seq=..., index=0)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1095
#3  0x00007ffff5931401 in pybind11::tuple::begin (this=0x7fffffffa970)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1983
#4  0x00007ffff59339ba in pybind11::detail::all_type_info_populate (t=0x555555aa1f00 <PyBaseObject_Type>, 
    bases=std::vector of length 0, capacity 0)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:108
#5  0x00007ffff5933e2c in pybind11::detail::all_type_info (type=0x555555aa1f00 <PyBaseObject_Type>)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:173
#6  0x00007ffff5933e56 in pybind11::detail::get_type_info (type=0x555555aa1f00 <PyBaseObject_Type>)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:185
#7  0x00007ffff593eda5 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffb038, value=0x555556021f90)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1368
#8  0x00007ffff593ede2 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffb038, value=0x5555560cd900)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1372
#9  0x00007ffff593ede2 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffb038, value=0x5555560ceb20)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1372
#10 0x00007ffff593ea53 in pybind11::detail::generic_type::initialize (this=0x7fffffffb038, rec=...)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1346
#11 0x00007ffff5cebccc in pybind11::class_<test_submodule_multiple_inheritance(pybind11::module_&)::Base12, test_submodule_multiple_inheritance(pybind11::module_&)::Base1, test_submodule_multiple_inheritance(pybind11::module_&)::Base2>::class_<>(pybind11::handle, const char *) (
    this=0x7fffffffb038, scope=..., name=0x7ffff6167b1f "Base12")
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1553
#12 0x00007ffff5ce98f9 in test_submodule_multiple_inheritance (m=...)
    at /usr/local/google/home/rwgk/forked/pybind11/tests/test_multiple_inheritance.cpp:113
#13 0x00007ffff592aaf6 in operator() (__closure=0x555555fb0da0, parent=...)
    at /usr/local/google/home/rwgk/forked/pybind11/tests/pybind11_tests.cpp:40
#14 0x00007ffff592ce8b in std::__invoke_impl<void, test_initializer::test_initializer(char const*, Initializer)::<lambda(pybind11::module_&)>&, pybind11::module_&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/12/bits/invoke.h:61
#15 0x00007ffff592cd56 in std::__invoke_r<void, test_initializer::test_initializer(char const*, Initializer)::<lambda(pybind11::module_&)>&, pybind11::module_&>(struct {...} &) (__fn=...) at /usr/include/c++/12/bits/invoke.h:111
#16 0x00007ffff592cc25 in std::_Function_handler<void(pybind11::module_&), test_initializer::test_initializer(char const*, Initializer)::<lambda(pybind11::module_&)> >::_M_invoke(const std::_Any_data &, pybind11::module_ &) (__functor=..., __args#0=...)
    at /usr/include/c++/12/bits/std_function.h:290
#17 0x00007ffff59468a5 in std::function<void (pybind11::module_&)>::operator()(pybind11::module_&) const (this=0x555555fb0da0, 
    __args#0=...) at /usr/include/c++/12/bits/std_function.h:591
#18 0x00007ffff592b433 in pybind11_init_pybind11_tests (m=...) at /usr/local/google/home/rwgk/forked/pybind11/tests/pybind11_tests.cpp:121
#19 0x00007ffff592adfb in PyInit_pybind11_tests () at /usr/local/google/home/rwgk/forked/pybind11/tests/pybind11_tests.cpp:78
#20 0x0000555555802cba in _PyImport_LoadDynamicModuleWithSpec (spec=spec@entry=0x7ffff66e22a0, fp=fp@entry=0x0) at ./Python/importdl.c:169
#21 0x00005555557fdcb2 in _imp_create_dynamic_impl (module=<optimized out>, file=0x0, spec=0x7ffff66e22a0) at Python/import.c:3721
#22 _imp_create_dynamic (module=<optimized out>, args=<optimized out>, nargs=<optimized out>) at Python/clinic/import.c.h:506
#23 0x00005555557104e3 in cfunction_vectorcall_FASTCALL (func=0x7ffff7c72b10, args=0x7ffff66e1768, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ./Include/cpython/methodobject.h:50
#24 0x0000555555650290 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7ffff7fb9678, throwflag=<optimized out>)
    at Python/bytecodes.c:3125
#25 0x00005555556b5cc7 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=2, args=0x7fffffffba70, callable=0x7ffff7c47f60, 
    tstate=0x555555bf2208 <_PyRuntime+460904>) at ./Include/internal/pycore_call.h:92
#26 object_vacall (tstate=tstate@entry=0x555555bf2208 <_PyRuntime+460904>, base=base@entry=0x0, callable=0x7ffff7c47f60, 
    vargs=vargs@entry=0x7fffffffbaf8) at Objects/call.c:818
#27 0x00005555556b73c0 in PyObject_CallMethodObjArgs (obj=0x0, name=<optimized out>) at Objects/call.c:879
#28 0x0000555555800dce in import_find_and_load (abs_name=0x7ffff66fdbb0, tstate=0x555555bf2208 <_PyRuntime+460904>) at Python/import.c:2715
#29 PyImport_ImportModuleLevelObject (name=name@entry=0x7ffff66fdbb0, globals=<optimized out>, locals=<optimized out>, 
    fromlist=fromlist@entry=0x555555a9e460 <_Py_NoneStruct>, level=0) at Python/import.c:2798
#30 0x000055555565a49b in import_name (level=0x555555b82620 <_PyRuntime+3200>, fromlist=0x555555a9e460 <_Py_NoneStruct>, 
    name=0x7ffff66fdbb0, frame=0x7ffff7fb9318, tstate=<optimized out>) at Python/ceval.c:2350
#31 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7ffff7fb9318, throwflag=<optimized out>) at Python/bytecodes.c:2000
#32 0x00005555557cb467 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff7fb9318, tstate=0x555555bf2208 <_PyRuntime+460904>)
    at ./Include/internal/pycore_ceval.h:88
#33 _PyEval_Vector (args=0x0, argcount=0, kwnames=0x0, locals=0x7ffff66fc340, func=0x7ffff66e9620, 
    tstate=0x555555bf2208 <_PyRuntime+460904>) at Python/ceval.c:1575
#34 PyEval_EvalCode (co=co@entry=0x555555fee3c0, globals=globals@entry=0x7ffff66fc340, locals=locals@entry=0x7ffff66fc340)
    at Python/ceval.c:566
#35 0x00005555557c7640 in builtin_exec_impl (module=<optimized out>, closure=<optimized out>, locals=0x7ffff66fc340, 
    globals=0x7ffff66fc340, source=0x555555fee3c0) at Python/bltinmodule.c:1079
#36 builtin_exec (module=<optimized out>, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>)
    at Python/clinic/bltinmodule.c.h:586
#37 0x000055555571025f in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0x7ffff7c716c0, args=0x7ffff7fb92f8, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ./Include/cpython/methodobject.h:50
#38 0x00005555556b6150 in _PyObject_VectorcallTstate (kwnames=0x555555b93e18 <_PyRuntime+74872>, nargsf=<optimized out>, 
    args=0x555555a9e460 <_Py_NoneStruct>, callable=0x7ffff7c716c0, tstate=0x555555bf2208 <_PyRuntime+460904>)
    at ./Include/internal/pycore_call.h:92
#39 PyObject_Vectorcall (callable=callable@entry=0x7ffff7c716c0, args=args@entry=0x7ffff7fb92f8, nargsf=<optimized out>, 
    kwnames=kwnames@entry=0x0) at Objects/call.c:291
#40 0x000055555565477b in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7ffff7fb9248, throwflag=<optimized out>)
    at Python/bytecodes.c:2577
#41 0x00005555556b9c41 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=2, args=0x7fffffffc0e0, callable=0x7ffff6bd6e80, 
    tstate=0x555555bf2208 <_PyRuntime+460904>) at ./Include/internal/pycore_call.h:92
#42 method_vectorcall (method=<optimized out>, args=0x7ffff66c2608, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:89
#43 0x0000555555650290 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7ffff7fb89c0, throwflag=<optimized out>)
    at Python/bytecodes.c:3125
#44 0x00005555556b815f in _PyObject_FastCallDictTstate (kwargs=0x7ffff66bdc00, nargsf=<optimized out>, args=0x7fffffffc300, 
    callable=0x7ffff7597240, tstate=0x555555bf2208 <_PyRuntime+460904>) at Objects/call.c:144
#45 _PyObject_Call_Prepend (tstate=tstate@entry=0x555555bf2208 <_PyRuntime+460904>, callable=callable@entry=0x7ffff7597240, 
    obj=obj@entry=0x7ffff6806a80, args=args@entry=0x555555b93db0 <_PyRuntime+74768>, kwargs=kwargs@entry=0x7ffff66f3ac0)
    at Objects/call.c:476
#46 0x0000555555737d4d in slot_tp_call (self=0x7ffff6806a80, args=0x555555b93db0 <_PyRuntime+74768>, kwds=0x7ffff66f3ac0)
    at Objects/typeobject.c:8474
#47 0x00005555556b5874 in _PyObject_MakeTpCall (tstate=0x555555bf2208 <_PyRuntime+460904>, callable=0x7ffff6806a80, args=0x7ffff7fb8850, 
    nargs=<optimized out>, keywords=0x7ffff6b6a800) at Objects/call.c:206
#48 0x00005555556b61af in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, 
    callable=<optimized out>, tstate=<optimized out>) at ./Include/internal/pycore_call.h:90
#49 _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>, 
    tstate=<optimized out>) at ./Include/internal/pycore_call.h:77
#50 PyObject_Vectorcall (callable=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>)
    at Objects/call.c:291
#51 0x0000555555e9b056 in ?? ()
#52 0x00007ffff66b3980 in ?? ()
#53 0x00007fff00000003 in ?? ()
#54 0x0000555555a122e0 in ?? ()
#55 0x00007ffff7fb8850 in ?? ()
#56 0x0000000000000000 in ?? ()

@markshannon
Member

You are dereferencing a NULL pointer, in pybind11 code.
Why is this a CPython bug? (I'm not saying it isn't, but the backtrace shows mostly pybind11 code)

Is there some field that is expected to be non-NULL and is now NULL?

@rwgk
Copy link

rwgk commented May 22, 2023

You are dereferencing a NULL pointer, in pybind11 code. Why is this a CPython bug? (I'm not saying it isn't, but the backtrace shows mostly pybind11 code)

Is there some field that is expected to be non-NULL and is now NULL?

I didn't mean to suggest it's a CPython bug. All we know at the moment is that the pybind11 unit tests do not load anymore after this PR was merged. I didn't want to dive in (possibly spending a significant amount of time) without asking here first, in case it's something obvious to you.

I'll look closer to get a better understanding.

@rwgk

rwgk commented May 23, 2023

Attached is a much smaller reproducer and shorter gdb backtrace, JIC this rings any bells.

The crash is in the context of pybind11 multiple inheritance code, which hasn't changed in any significant way for years, although there was some back and forth around the time 3.11 was released. I still have to dig up references.

  • With 872cbc6 ("last good") the exit code is 0.
  • With de64e75 ("first bad") the exit code is -11 (segfault).

#include <pybind11/embed.h>
#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_EMBEDDED_MODULE(mi_debugging, m) {
    struct Base1 {};
    struct Base2 {};
    struct Base12 : Base1, Base2 {};

    py::class_<Base1> b1(m, "Base1");
    py::class_<Base2> b2(m, "Base2");
    py::class_<Base12, Base1, Base2>(m, "Base12");
}

int main() {
    py::initialize_interpreter();

    py::module_::import("mi_debugging");

    py::finalize_interpreter();

    return 0;
}

#PYINST=segfault_20230520_last_good
PYINST=segfault_20230520_first_bad
g++ -o main_debugging.o -c -std=c++17 -O0 -g -Wall -Wextra -Wconversion -Wcast-qual -Wdeprecated -Wundef -Wnon-virtual-dtor -Wunused-result -Werror -I/usr/local/google/home/rwgk/forked/pybind11/include -I/usr/local/google/home/rwgk/usr_local_like/$PYINST/include/python3.12 main_debugging.cpp
g++ -o main_debugging -L/usr/local/google/home/rwgk/usr_local_like/$PYINST/lib -rdynamic -O0 -g main_debugging.o -lpython3.12 -lpthread -ldl -lutil

#0  0x0000555555687f26 in Py_TYPE (ob=0x0)
    at /usr/local/google/home/rwgk/usr_local_like/segfault_20230520_first_bad/include/python3.12/object.h:204
#1  0x000055555568cff9 in pybind11::detail::iterator_policies::sequence_fast_readonly::sequence_fast_readonly (this=0x7fffffffc6b8, 
    obj=..., n=0) at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1168
#2  0x000055555569bd45 in pybind11::detail::generic_iterator<pybind11::detail::iterator_policies::sequence_fast_readonly>::generic_iterator (this=0x7fffffffc6b8, seq=..., index=0)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1095
#3  0x000055555568e165 in pybind11::tuple::begin (this=0x7fffffffc750)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/../pytypes.h:1983
#4  0x000055555569069e in pybind11::detail::all_type_info_populate (t=0x555555b12560 <PyBaseObject_Type>, 
    bases=std::vector of length 0, capacity 0)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:108
#5  0x0000555555690b10 in pybind11::detail::all_type_info (type=0x555555b12560 <PyBaseObject_Type>)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:173
#6  0x0000555555690b3a in pybind11::detail::get_type_info (type=0x555555b12560 <PyBaseObject_Type>)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/type_caster_base.h:185
#7  0x000055555569aa45 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffcc68, value=0x555555cef4e0)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1368
#8  0x000055555569aa82 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffcc68, value=0x555555ce8790)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1372
#9  0x000055555569aa82 in pybind11::detail::generic_type::mark_parents_nonsimple (this=0x7fffffffcc68, value=0x555555ce9300)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1372
#10 0x000055555569a6f3 in pybind11::detail::generic_type::initialize (this=0x7fffffffcc68, rec=...)
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1346
#11 0x00005555556887fc in pybind11::class_<pybind11_init_mi_debugging(pybind11::module_&)::Base12, pybind11_init_mi_debugging(pybind11::module_&)::Base1, pybind11_init_mi_debugging(pybind11::module_&)::Base2>::class_<>(pybind11::handle, const char *) (
    this=0x7fffffffcc68, scope=..., name=0x55555591bcf2 "Base12")
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1553
#12 0x0000555555688407 in pybind11_init_mi_debugging (m=...) at main_debugging.cpp:13
#13 0x000055555568825e in pybind11_init_wrapper_mi_debugging () at main_debugging.cpp:6
#14 0x000055555568834e in pybind11_init_impl_mi_debugging () at main_debugging.cpp:6
#15 0x00005555557e663c in create_builtin (name=name@entry=0x7ffff7502c30, spec=spec@entry=0x7ffff7a155e0, tstate=<optimized out>)
    at Python/import.c:1355
#16 0x00005555557e6791 in create_builtin (spec=0x7ffff7a155e0, name=0x7ffff7502c30, tstate=0x555555c65548 <_PyRuntime+460904>)
    at ./Include/object.h:204
#17 _imp_create_builtin (module=<optimized out>, spec=0x7ffff7a155e0) at Python/import.c:3364
#18 0x000055555572776b in cfunction_vectorcall_O (func=0x7ffff79aa890, args=0x7ffff79f8598, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ./Include/cpython/methodobject.h:50
#19 0x000055555567c7b0 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7ffff7fb82f8, throwflag=<optimized out>)
    at Python/bytecodes.c:3125
#20 0x00005555556d8547 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=2, args=0x7fffffffcf10, callable=0x7ffff797ff60, 
    tstate=0x555555c65548 <_PyRuntime+460904>) at ./Include/internal/pycore_call.h:92
#21 object_vacall (tstate=tstate@entry=0x555555c65548 <_PyRuntime+460904>, base=base@entry=0x0, callable=0x7ffff797ff60, 
    vargs=vargs@entry=0x7fffffffcf98) at Objects/call.c:818
#22 0x00005555556d9c40 in PyObject_CallMethodObjArgs (obj=0x0, name=<optimized out>) at Objects/call.c:879
#23 0x00005555557e82ce in import_find_and_load (abs_name=0x7ffff7502c30, tstate=0x555555c65548 <_PyRuntime+460904>)
    at Python/import.c:2715
#24 PyImport_ImportModuleLevelObject (name=name@entry=0x7ffff7502c30, globals=globals@entry=0x7ffff7a0d300, 
    locals=locals@entry=0x7ffff7a0d300, fromlist=fromlist@entry=0x7ffff79c0900, level=0) at Python/import.c:2798
#25 0x00005555557acd9e in builtin___import___impl (level=<optimized out>, fromlist=0x7ffff79c0900, locals=0x7ffff7a0d300, 
    globals=0x7ffff7a0d300, name=0x7ffff7502c30, module=<optimized out>) at Python/bltinmodule.c:275
#26 builtin___import__ (module=<optimized out>, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>)
    at Python/clinic/bltinmodule.c.h:107
#27 0x0000555555727bef in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0x7ffff79a92b0, args=0x7fffffffd1d0, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ./Include/cpython/methodobject.h:50
#28 0x00005555556d872d in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7fffffffd1d0, 
    callable=0x7ffff79a92b0, tstate=0x555555c65548 <_PyRuntime+460904>) at ./Include/internal/pycore_call.h:92
#29 _PyObject_CallFunctionVa (tstate=0x555555c65548 <_PyRuntime+460904>, callable=0x7ffff79a92b0, format=<optimized out>, 
    va=<optimized out>, is_size_t=<optimized out>) at Objects/call.c:530
#30 0x00005555556d8f0e in PyObject_CallFunction (callable=callable@entry=0x7ffff79a92b0, format=format@entry=0x555555967350 "OOOOi")
    at Objects/call.c:552
#31 0x00005555557e8ec7 in PyImport_Import (module_name=module_name@entry=0x7ffff7502c30) at Python/import.c:2984
#32 0x00005555557e911b in PyImport_ImportModule (name=<optimized out>) at Python/import.c:2424
#33 0x0000555555699ef6 in pybind11::module_::import (name=0x55555591bcc3 "mi_debugging")
    at /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1212
#34 0x0000555555688495 in main () at main_debugging.cpp:19

@rwgk
Copy link

rwgk commented May 23, 2023

although there was some back and forth around the time 3.11 was released. I still have to dig up references.

Found it: https://github.com/pybind/pybind11/pull/4142/files (NOT merged)

I just applied that change locally, but it doesn't make a difference: still exit 0 with last good, segfault with first bad.

If anybody has conclusive advice regarding Py_TPFLAGS_MANAGED_DICT in pybind11 that would be great to know (keep or not, it was never clear to me).

@rwgk
Copy link

rwgk commented May 23, 2023

The pybind11 patch below fixes the segfault. All pybind11 unit tests pass (excluding those depending on numpy).

From what I saw while debugging, it appears tp_bases of object was an empty tuple in the past but is now NULL. Could that be? Is that intentional?

diff --git a/include/pybind11/detail/type_caster_base.h b/include/pybind11/detail/type_caster_base.h
index 16387506..bfb42063 100644
--- a/include/pybind11/detail/type_caster_base.h
+++ b/include/pybind11/detail/type_caster_base.h
@@ -105,8 +105,10 @@ all_type_info_get_cache(PyTypeObject *type);
 // Populates a just-created cache entry.
 PYBIND11_NOINLINE void all_type_info_populate(PyTypeObject *t, std::vector<type_info *> &bases) {
     std::vector<PyTypeObject *> check;
-    for (handle parent : reinterpret_borrow<tuple>(t->tp_bases)) {
-        check.push_back((PyTypeObject *) parent.ptr());
+    if (t->tp_bases) {
+        for (handle parent : reinterpret_borrow<tuple>(t->tp_bases)) {
+            check.push_back((PyTypeObject *) parent.ptr());
+        }
     }

     auto const &type_dict = get_internals().registered_types_py;
diff --git a/include/pybind11/pybind11.h b/include/pybind11/pybind11.h
index 28ebc222..7466e75f 100644
--- a/include/pybind11/pybind11.h
+++ b/include/pybind11/pybind11.h
@@ -1363,6 +1363,9 @@ protected:

     /// Helper function which tags all parents of a type using mult. inheritance
     void mark_parents_nonsimple(PyTypeObject *value) {
+        if (value->tp_bases == nullptr) {
+            return;
+        }
         auto t = reinterpret_borrow<tuple>(value->tp_bases);
         for (handle h : t) {
             auto *tinfo2 = get_type_info((PyTypeObject *) h.ptr());
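
The guard added in the patch above can be shown in isolation with a small, self-contained model. This is a sketch under stated assumptions: `ToyType`, its `simple` flag, and `mark_ancestors` are hypothetical stand-ins and not pybind11 or CPython code. It only demonstrates the pattern of treating a null bases pointer as "no bases" before walking the inheritance graph recursively.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a type whose bases pointer may be null
// (as tp_bases was for `object` in the crashing build).
struct ToyType {
    const std::vector<ToyType*>* bases;  // may be null
    bool simple;                         // cleared once the type is seen as a parent
};

// Recursively clears `simple` on every ancestor of `type`.
// Returns the number of parent visits performed (shared ancestors count once per path).
inline int mark_ancestors(ToyType* type) {
    if (type->bases == nullptr) {
        return 0;  // nothing to iterate; without this check we would dereference null
    }
    int visits = 0;
    for (ToyType* parent : *type->bases) {
        parent->simple = false;
        ++visits;
        visits += mark_ancestors(parent);
    }
    return visits;
}
```

With a diamond such as `Base12 : Base1, Base2`, where both bases derive from a root whose bases pointer is null, the recursion reaches the root twice and returns cleanly each time instead of crashing.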

@sunmy2019
Member

Is there some field that is expected to be non-NULL and is now NULL?

NULL indicates the correct value is stored elsewhere. Currently, there are only private APIs that can get access to these values.

Namely, we need to make _PyType_GetBases public.

@rwgk

rwgk commented May 27, 2023

Is there some field that is expected to be non-NULL and is now NULL?

NULL indicates the correct value is stored elsewhere. Currently, there are only private APIs that can get access to these values.

Namely, we need to make _PyType_GetBases public.

That's beginning to look like a major API change. Is tp_bases interpreter-specific?

@sunmy2019
Member

Is tp_bases interpreter-specific?

After this PR, yes, for builtin static types.

Look at the PR title.

@rwgk
Copy link

rwgk commented May 27, 2023

  • @wjakob @gpshead, who can probably gauge better than I can what this means for binding tools like pybind11, nanobind, SWIG, cython, CLIF (Google) etc.

On one end of the spectrum:

Will we be OK with something like my patch under #103912 (comment) (because we don't need the bases of "Builtin Static Types")?

On the other end of the spectrum:

Will we have to make deeper changes in the binding tools to handle interpreter-specific fields?

@encukou
Member

encukou commented May 30, 2023

tp_bases docs still say it's a tuple of bases: https://docs.python.org/3.12/c-api/typeobj.html#c.PyTypeObject.tp_bases
This looks like a breaking change.

@encukou
Copy link
Member

encukou commented May 30, 2023

I remember a discussion on changing tp_subclasses, which was explicitly documented “for internal use only” (and its existing docs were incorrect). Was there a similar discussion for the others?
For tp_subclasses the docs were changed and the change is mentioned in What's New, but that's missing here.

ericsnowcurrently added a commit that referenced this pull request May 31, 2023
…tic Builtin Types (gh-105115)

In gh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
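
The idea of "shared objects made immortal" in the commit message above can be sketched with a toy reference count. This is an illustration only: `ToyObject`, the sentinel value, and the helper names are assumptions made for this sketch, and CPython's real refcount encoding for immortal objects differs. What it shows is why an immortal object can safely be shared between interpreters: increments and decrements leave it untouched, so it is never deallocated.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Toy object with a plain reference count (NOT CPython's PyObject).
struct ToyObject {
    std::uint32_t refcount;
};

// Assumed sentinel meaning "immortal"; the real encoding in CPython is different.
constexpr std::uint32_t kImmortalRefcount = std::numeric_limits<std::uint32_t>::max();

inline bool is_immortal(const ToyObject& obj) { return obj.refcount == kImmortalRefcount; }

inline void make_immortal(ToyObject& obj) { obj.refcount = kImmortalRefcount; }

// Immortal objects ignore increments.
inline void incref(ToyObject& obj) {
    if (!is_immortal(obj)) {
        ++obj.refcount;
    }
}

// Returns true when the caller should deallocate the object.
// Immortal objects ignore decrements and are never deallocated.
inline bool decref(ToyObject& obj) {
    if (is_immortal(obj)) {
        return false;
    }
    return --obj.refcount == 0;
}
```

In this model a shared bases tuple marked with `make_immortal` survives any number of unbalanced `decref` calls, whereas an ordinary object is released when its count reaches zero.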
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 31, 2023
…ll Static Builtin Types (pythongh-105115)

In pythongh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
(cherry picked from commit 7be667d)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
ericsnowcurrently pushed a commit that referenced this pull request Jun 1, 2023
…All Static Builtin Types (gh-105115) (gh-105124)

In gh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
(cherry picked from commit 7be667d)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
ericsnowcurrently added a commit that referenced this pull request Jun 1, 2023
gh-105122)

When I added the relevant condition to type_ready_set_bases() in gh-103912, I had missed that the function also sets tp_base and ob_type (if necessary).  That led to problems for third-party static types.

We fix that here, by making those extra operations distinct and by adjusting the condition to be more specific.
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 1, 2023
…Ready() (pythongh-105122)

When I added the relevant condition to type_ready_set_bases() in pythongh-103912, I had missed that the function also sets tp_base and ob_type (if necessary).  That led to problems for third-party static types.

We fix that here, by making those extra operations distinct and by adjusting the condition to be more specific.
(cherry picked from commit 1469393)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
ericsnowcurrently pushed a commit that referenced this pull request Jun 1, 2023
…_Ready() (gh-105122) (gh-105211)

When I added the relevant condition to type_ready_set_bases() in gh-103912, I had missed that the function also sets tp_base and ob_type (if necessary).  That led to problems for third-party static types.

We fix that here, by making those extra operations distinct and by adjusting the condition to be more specific.
(cherry picked from commit 1469393)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>