Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-115649: Copy the filename into main interpreter before intern in import.c #120315

Merged
merged 5 commits into from
Jun 17, 2024

Conversation

aisk
Copy link
Member

@aisk aisk commented Jun 10, 2024

Test code:

import _interpreters

iid = _interpreters.create()
_interpreters.exec(iid, "import _tkinter")
_interpreters.destroy(iid)

Execute result:

$ ./python c.py
Debug memory block at address p=0x7899b89e3520: API '�'
    15987178197214944733 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0xdd *** OUCH
        at p-6: 0xdd *** OUCH
        at p-5: 0xdd *** OUCH
        at p-4: 0xdd *** OUCH
        at p-3: 0xdd *** OUCH
        at p-2: 0x00 *** OUCH
        at p-1: 0x00 *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0xddde5677967c12fd are fish: Job 1, './python c.py' terminated by signal SIGSEGV (Address boundary error)

In current main branch, when a single-phase extension (_tkinter is an example) got imported, it's init function will be execute in the main interpreter, although the import statement is executed in the sub-interpreter.

And there is a workaround code to intern the filename object to the main interpreter's interned string dict:

cpython/Python/import.c

Lines 1970 to 1973 in c3b6dbf

// XXX There's a refleak somewhere with the filename.
// Until we can track it down, we intern it.
PyObject *filename = Py_NewRef(info->filename);
PyUnicode_InternInPlace(&filename);

I think the filename is created in the sub-interpreter in this test code, and it's allocated with the sub-interpreter's obmalloc state, which will be freed when the sub-interpreter is destroyed by statement _interpreters.destroy(iid) in the test code.

When the main interpreter shutdown, it will try to free the interned string dict, this will cause issues and crash the interpreter.

Using gdb to dump the call stack:

(gdb) bt
#0  0x00005555556c9414 in _PyObject_DebugDumpAddress (p=p@entry=0x7ffff731b520) at Objects/obmalloc.c:3003
#1  0x00005555556c9705 in _PyMem_DebugCheckAddress (func=func@entry=0x5555559106a0 <__func__.8> "_PyMem_DebugRawFree", api=114 'r', p=p@entry=0x7ffff731b520) at Objects/obmalloc.c:2921
#2  0x00005555556c97aa in _PyMem_DebugRawFree (ctx=0x555555afecb0 <_PyRuntime+560>, p=0x7ffff731b520) at Objects/obmalloc.c:2749
#3  0x00005555556ddee4 in PyMem_RawFree (ptr=ptr@entry=0x7ffff731b520) at Objects/obmalloc.c:963
#4  0x00005555556ddfe8 in _PyObject_Free (ctx=0x0, p=0x7ffff731b520) at Objects/obmalloc.c:2416
#5  0x00005555556c97eb in _PyMem_DebugRawFree (ctx=ctx@entry=0x555555afed10 <_PyRuntime+656>, p=p@entry=0x7ffff731b530) at Objects/obmalloc.c:2754
#6  0x00005555556c9b57 in _PyMem_DebugFree (ctx=0x555555afed10 <_PyRuntime+656>, ptr=0x7ffff731b530) at Objects/obmalloc.c:2891
#7  0x00005555556de3dc in PyObject_Free (ptr=<optimized out>) at Objects/obmalloc.c:1323
#8  0x00005555557107fb in unicode_dealloc (unicode=0x7ffff731b530) at Objects/unicodeobject.c:1608
#9  0x00005555556c4735 in _Py_Dealloc (op=op@entry=0x7ffff731b530) at Objects/object.c:2854
#10 0x00005555556ac8da in Py_DECREF (filename=filename@entry=0x55555589e242 "./Include/refcount.h", lineno=lineno@entry=459, op=0x7ffff731b530) at ./Include/refcount.h:351
#11 0x00005555556ac8f9 in Py_XDECREF (op=<optimized out>) at ./Include/refcount.h:459
#12 0x00005555556ac9ad in dictkeys_decref (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>, dk=dk@entry=0x555555c51800, use_qsbr=use_qsbr@entry=false) at Objects/dictobject.c:492
#13 0x00005555556b489d in clear_lock_held (op=op@entry=0x7ffff7500050) at Objects/dictobject.c:2766
#14 0x00005555556b4977 in PyDict_Clear (op=op@entry=0x7ffff7500050) at Objects/dictobject.c:2790
#15 0x000055555571313b in clear_interned_dict (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>) at Objects/unicodeobject.c:295
#16 0x000055555573c1a6 in _PyUnicode_ClearInterned (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>) at Objects/unicodeobject.c:15098
#17 0x0000555555805f17 in finalize_interp_types (interp=0x555555b17798 <_PyRuntime+101656>) at Python/pylifecycle.c:1839
#18 0x0000555555805f98 in finalize_interp_clear (tstate=tstate@entry=0x555555b4efe8 <_PyRuntime+329064>) at Python/pylifecycle.c:1884
#19 0x0000555555806102 in Py_FinalizeEx () at Python/pylifecycle.c:2107
#20 0x0000555555832ecc in Py_RunMain () at Modules/main.c:720
#21 0x0000555555832f3e in pymain_main (args=args@entry=0x7fffffffe4e0) at Modules/main.c:748
#22 0x0000555555833004 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:772
#23 0x00005555555d4842 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Please note that the #12 frame #12 0x00005555556ac9ad in dictkeys_decref (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>, dk=dk@entry=0x555555c51800, use_qsbr=use_qsbr@entry=false) at Objects/dictobject.c:492, it's code is:

Py_XDECREF(entries[i].me_value);

We can't inspect the value here because it's memory got destroyed already. But adding these codes to L492:

                if (n == 3408) {  // 3408 is the size of the interned string dict, you may need to adjust it on your environment
                    PyObject_Print(entries[i].me_value, stderr, 0);
                    fprintf(stderr, "\n");
                }

We can see that when the interpreter crashed, it's calling _Py_Dealloc on filename string object:

...
'TkttType'
'typename'
'Tcl_Obj'
'/home/xxxxx/Codes/cpython/build/lib.linux-x86_64-3.14-pydebug/_tkinter.cpython-314d-x86_64-linux-gnu.so'  <- this

Clone the filename and re-create it on the main interpreter can avoid the crash.

@aisk aisk requested a review from kumaraditya303 as a code owner June 10, 2024 12:41
@aisk aisk marked this pull request as draft June 10, 2024 12:41
@aisk aisk changed the title Copy the filename into main interpreter before intern in import.c gh-115649: Copy the filename into main interpreter before intern in import.c Jun 10, 2024
@aisk aisk added needs backport to 3.13 bugs and security fixes skip news labels Jun 10, 2024
@aisk aisk marked this pull request as ready for review June 10, 2024 13:27
@aisk aisk requested a review from ericsnowcurrently June 10, 2024 13:27
@kumaraditya303
Copy link
Contributor

Can you add a test for it?

@aisk
Copy link
Member Author

aisk commented Jun 16, 2024

Can you add a test for it?

Thanks for the review! I added the test which will crash the interpreter before this pull request.

Python/import.c Outdated Show resolved Hide resolved
Python/import.c Outdated Show resolved Hide resolved
@kumaraditya303 kumaraditya303 merged commit 28140d1 into python:main Jun 17, 2024
35 of 36 checks passed
@miss-islington-app
Copy link

Thanks @aisk for the PR, and @kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 17, 2024
…n in import.c (pythonGH-120315)

(cherry picked from commit 28140d1)

Co-authored-by: AN Long <aisk@users.noreply.github.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
@bedevere-app
Copy link

bedevere-app bot commented Jun 17, 2024

GH-120652 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Jun 17, 2024
kumaraditya303 added a commit that referenced this pull request Jun 17, 2024
…rn in import.c (GH-120315) (#120652)

gh-115649: Copy the filename into main interpreter before intern in import.c (GH-120315)
(cherry picked from commit 28140d1)

Co-authored-by: AN Long <aisk@users.noreply.github.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
@aisk aisk deleted the copy-interned-string branch June 17, 2024 17:02
@ericsnowcurrently
Copy link
Member

FWIW, the fix looks good. Thanks for taking care of this, @aisk.

mrahtz pushed a commit to mrahtz/cpython that referenced this pull request Jun 30, 2024
…n in import.c (python#120315)

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
…n in import.c (python#120315)

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024
…n in import.c (python#120315)

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants