bpo-38748: gh-82929: Add test for stack corruption. #26204
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA Missing: Our records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day. You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
This PR is stale because it has been open for 30 days with no activity. |
The failures in this test prevent our project from being able to upgrade Python. We are stuck on an unsupported version (3.7). @michaelDCurran has investigated and contributed the tests to reproduce the issue. Could someone (@zooba or @tiran) on the CPython project please merge this or provide feedback? Unfortunately, we are quite blocked by this. |
Closing and re-opening the PR to re-trigger CI. |
On win64 the test doesn't actually fail:
The status is misreported as success because re-running ctypes tests is currently broken somehow:
I'll deal with fixing regrtest to properly re-run test_ctypes but we will need to only mark the test as "expected failure" on win32. On win32 the test fails, as Michael is pointing out:
|
…y mark this test as an expected failure on 32 bit (x86).
9975f89
to
07d66cb
@ambv I have now marked the test as expected failure only on 32 bit (x86). |
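A minimal sketch of marking a test as an expected failure only on 32-bit Windows, assuming the platform is detected from `sys.platform` and the interpreter's pointer width. The helper `expected_failure_if`, the flag `IS_WIN32_X86`, and the test class are hypothetical names for illustration, not the code in this PR, and the failure is only simulated:

```python
import sys
import unittest

# Assumption: "32 bit (x86) Windows" means a win32 platform with a 32-bit interpreter.
IS_WIN32_X86 = sys.platform == "win32" and sys.maxsize < 2**32


def expected_failure_if(condition):
    """Return unittest.expectedFailure if condition is true, else a no-op decorator."""
    if condition:
        return unittest.expectedFailure
    return lambda test_func: test_func


class StdcallCallbackTest(unittest.TestCase):
    @expected_failure_if(IS_WIN32_X86)
    def test_callback_roundtrip(self):
        # Stand-in for the real ctypes call; the failure is only simulated here.
        if IS_WIN32_X86:
            raise OSError("simulated stack corruption")
```

With this shape, the test counts as an expected failure where the bug reproduces and as an ordinary test everywhere else, so an unexpected pass on 32-bit Windows would be reported once the underlying issue is fixed.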
Looks ok to me.
Any chance this can be processed in the near future? |
Will this pull request be merged this year? |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
I don't understand the urgency to get a failing test added, but I don't see why not. Hopefully someone is submitting a fix to libffi so that we can adopt it in their next release? |
The associated bug is blocking a major screen reader project from upgrading its Python interpreter. The inclusion of this test would help facilitate tracking and hopefully start progress toward resolution. |
Okay, so someone is working on the libffi fix then, and this is just for tracking. Good. Incidentally, the test now crashes in CI. It might need to be moved into the "crashers" tests, or run in a separate process so that the crash can be handled gracefully. |
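One way to run a crash-prone snippet in a separate process, sketched under the assumption that a plain child interpreter is acceptable. The helper name `run_isolated` is hypothetical, and `os.abort()` merely stands in for a real crash:

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter and return the CompletedProcess."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# A well-behaved child exits with status 0.
ok = run_isolated("print('fine')")

# A crashing child (simulated with os.abort) yields a non-zero status,
# while the parent process keeps running and can report the failure.
crashed = run_isolated("import os; os.abort()")
```

The parent only inspects `returncode`, so a hard crash in the child turns into an ordinary test failure instead of taking down the whole test run.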
@zooba I don't think we know that. We figure the first step is to get this acknowledged and accepted as a bug affecting Python, hence this PR. As stated already, this bug blocks the screen reader project (NVDA on GitHub) from being able to update past Python 3.7. This is a high priority for the project. |
The way to get rid of python once and for all is to just drop the x86 version of python 3.x and go straight to the x64 version of python. |
I'm the libffi author and just learned about this. I tried to replicate this in the libffi testsuite, here: https://github.com/libffi/libffi/blob/master/testsuite/libffi.call/bpo-38748.c |
Assuming it's still current, it's libffi 3.4.2 (built from this mirror) and used whichever MSVC was the latest about a year ago. You can get our binaries from https://github.com/python/cpython-bin-deps/tree/libffi if you want to poke at them directly (no debug symbols, sorry). |
I had a look at your test, and while I'm not 100% familiar with how you'd write this, I don't see anything there about |
Yes, that's right. See https://rl.gl/doc-text?id=RLGL-AKWR73CS |
FYI, though you probably know, MSVC only has a single calling convention on 64-bit and ignores the declarations. They only have an impact on 32-bit builds. So anything involving specific conventions won't show up unless you test 32-bit. (I believe ARM64 also only has one convention, but don't quote me on that... I also don't have the slightest idea how ARM64EC works on Windows...) |
I've done some further investigation on this as well.
If compiled on MSVC and it is x86 and stdcall, it probably should be:
Although the code has moved quite a bit since Python 3.7's fork of libffi, it is still clear why Python 3.7's fork of libffi did not have the bug. Refer to Python 3.7.9's ffi_prep_cif.c starting at line 157. It has an ifdef only allowing the alignment based on the argument's type if the compiler is not msvc or mingw32. It also has a very explicit comment:
A specific patch could be:

diff --git a/src/x86/ffi.c b/src/x86/ffi.c
index 24431c17..987e4eaf 100644
--- a/src/x86/ffi.c
+++ b/src/x86/ffi.c
@@ -181,8 +181,16 @@ ffi_prep_cif_machdep(ffi_cif *cif)
   for (i = 0, n = cif->nargs; i < n; i++)
     {
       ffi_type *t = cif->arg_types[i];
-
+#if defined(_MSC_VER) && defined(_M_IX86)
+      if(cabi == FFI_STDCALL)
+        {
+          bytes = FFI_ALIGN (bytes, FFI_SIZEOF_ARG);
+        } else {
+          bytes = FFI_ALIGN (bytes, t->alignment);
+        }
+#else
       bytes = FFI_ALIGN (bytes, t->alignment);
+#endif
       bytes += FFI_ALIGN (t->size, FFI_SIZEOF_ARG);
     }
   cif->bytes = bytes;

The exact ifdef check is debatable, but this patch removes type-based argument alignment on x86 (32 bit) only when compiled with MSVC and only when the calling convention is stdcall. With the patch applied, all ctypes tests run without errors, including my test case in #26204 (no longer marked as expected failure). Running libffi's own test suite also produces no errors beyond those already occurring without my patch. |
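To make the arithmetic in that loop concrete, here is a small Python model of it. It is a sketch under assumptions: `FFI_SIZEOF_ARG` is taken to be 4 on 32-bit x86, `long` is modelled as size 4 / alignment 4 and `long long` as size 8 / alignment 8, and the helper names are hypothetical.

```python
FFI_SIZEOF_ARG = 4  # assumption: size of one stack slot on 32-bit x86


def ffi_align(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def total_arg_bytes(args, align_by_type):
    """Model the loop in ffi_prep_cif_machdep for a list of (size, alignment) pairs."""
    total = 0
    for size, alignment in args:
        total = ffi_align(total, alignment if align_by_type else FFI_SIZEOF_ARG)
        total += ffi_align(size, FFI_SIZEOF_ARG)
    return total


LONG = (4, 4)       # assumed size and alignment
LONG_LONG = (8, 8)  # assumed size and alignment

# (long, long long): type-based alignment inserts 4 bytes of padding.
by_type = total_arg_bytes([LONG, LONG_LONG], align_by_type=True)     # 16
by_slot = total_arg_bytes([LONG, LONG_LONG], align_by_type=False)    # 12

# (long long, long): both strategies agree, so no mismatch arises.
swapped_by_type = total_arg_bytes([LONG_LONG, LONG], align_by_type=True)   # 12
swapped_by_slot = total_arg_bytes([LONG_LONG, LONG], align_by_type=False)  # 12
```

A stdcall function taking `(long, long long)` occupies 12 bytes of stack in 4-byte slots, so a computed size of 16 would be consistent with the corruption described in this PR, although this model does not show how libffi uses the value.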
I just released libffi 3.4.3, which should include a fix for this. Please try it out and let me know. |
@atgreen Firstly thanks for looking into this, it is much appreciated.
Seems like some kind of syntax error in sysv_intel.S? I also get the exact same error if I create a local branch of libffi-3.4.3 in cpython-source-deps, and run cpython's |
Yes, I see the same issue. Let's take it to #96965 rather than this issue. |
libffi has been updated in main (and soon, 3.11), so once this branch is updated and tests are run it should pass. |
… the underlying issue in libffi has been addressed in libffi-3.4.3.
Perfect! Thanks for being persistent on this one. |
bpo-38748 aka #82929: add expected fail test for stack corruption when executing x86 stdcall ctypes callback
On x86, when a ctypes callback is __stdcall (WINFUNCTYPE) and takes an argument larger than 4 bytes (e.g. a long long or a VARIANT), with one or more arguments preceding it such that this argument is not aligned on a multiple of 8 bytes, the stack seems to become corrupted and the function does not return to the correct location. This causes either a crash or, at the very least, an OSError to be raised.
For example arguments can be:
But the corruption does not occur with something like:
The test in this PR declares a Python ctypes WINFUNCTYPE callback with a return type of long and arguments of long and long long.
The callback adds the two arguments together, prints the arguments and the result, and returns the result.
The test then executes a cdecl C function which takes the callback and two numbers as arguments. In turn, the C function executes the callback, passing it the two numbers as arguments, and returns the result of the callback.
The outer C function is cdecl rather than stdcall because cdecl seems to handle inner stack corruption better: it raises OSError and seems to leave the stack in a suitable state for future tests, whereas stdcall just crashes outright.
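The callback described above can be sketched roughly as follows. This is an illustration under assumptions, not the test from this PR: it omits the outer cdecl C helper and calls the callback object directly from Python, and it falls back to `CFUNCTYPE` where `WINFUNCTYPE` is unavailable (non-Windows platforms). On an affected 32-bit Windows build, even this direct call may misbehave.

```python
import ctypes

# WINFUNCTYPE (stdcall) exists only on Windows; fall back to CFUNCTYPE (cdecl) elsewhere.
functype = getattr(ctypes, "WINFUNCTYPE", ctypes.CFUNCTYPE)

# Return type long, arguments long and long long, as described for the test.
CallbackProto = functype(ctypes.c_long, ctypes.c_long, ctypes.c_longlong)


@CallbackProto
def add_callback(a, b):
    # Print the arguments and the result, then return the sum.
    print(f"a={a}, b={b}, a+b={a + b}")
    return a + b


# Calling the ctypes function object routes the call through the native thunk.
result = add_callback(1, 2)
```

Printing inside the callback mirrors the diagnostic approach described in this PR: a second invocation with garbage arguments would show up directly in the output.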
When running the test on main, you can see from the message printed in the test that the callback is executed once with the expected arguments, but then, as it returns, it seems to be executed a second time with garbage arguments. Python then eventually raises OSError. No doubt the callback is executed a second time because the return location just happens to line up with the start of the function.
This test is currently marked as an expected failure, as OSError is raised.
Manually backporting this patch to Python 3.7, the test passes.
https://bugs.python.org/issue38748