bpo-38748: gh-82929: Add test for stack corruption. #26204

michaelDCurran · 2021-05-18T00:34:01Z

bpo-38748 aka #82929: add expected fail test for stack corruption when executing x86 stdcall ctypes callback

On x86, When a ctypes callback is __stdcall (WINFUNCTYPE) and it takes an argument larger than 4 bytes (E.g. a long long or a VARIANT), with one or more arguments preceeding it such that this argument is not aligned on a multiple of 8 bytes, the stack seems to become corrupted, and the function does not return to the correct location. This causes either a crash or at very least an OSError is raised.
For example arguments can be:

long, long long
long, long, long, long long
But the corruption does not occur with something like:
long, long, long long

The test in this pr declairs a Python ctypes WINFUNCTYPE callback, with a return type of long, and arguments of long and long long.
The callback adds the two arguments together, prints the arguments and the result, and returns the result.
It then executes a cdecl c function which takes the callback, and two numbers as arguments. In turn, the c function executes the callback passing it the two numbers as arguments, returning the result of the callback.
The reason for making the outer c function cdecl and not stdcall was because cdecl seems to better handle inner stack corruption, raising OSError and seeming to leav the stack in a suitable state for future tests, rather than stdcall which just outright crashes.

When running the test on main, you can see via the printed message in the test, that the callback is executed once with the expected arguments, but then as it returns, it seems to be executed a second time with garbage arguments. Eventually then Python raises OSError. No doubt the callback is executed a second time as the return location just happens to line up with the start of the function.

this test is currently marked as an expected failiar as OSError is raised.

Manually backporting this patch to Python 3.7, the test passes.

https://bugs.python.org/issue38748

the-knights-who-say-ni · 2021-05-18T00:34:05Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@michaelDCurran

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

github-actions · 2021-06-21T00:06:35Z

This PR is stale because it has been open for 30 days with no activity.

feerrenrut · 2021-08-27T03:01:09Z

The failures in this test prevent our project from being able to upgrade Python. We are stuck on an unsupported version (3.7). @michaelDCurran has investigated and contributed the tests to reproduce the issue. Could someone (@zooba or @tiran) on the cPython project please merge this or provide feedback. Unfortunately we are quite blocked by this.

ambv · 2021-09-24T21:26:23Z

Closing and re-opening the PR to re-trigger CI.

ambv · 2021-09-25T09:18:16Z

On win64 the test doesn't actually fail:

test_i38748_stackCorruption (ctypes.test.test_callbacks.Callbacks) ... unexpected success

The status is misreported as success because re-running ctypes tests is currently broken somehow:

0:06:20 load avg: 4.04 Re-running test_ctypes in verbose mode

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK

I'll deal with fixing regrtest to properly re-run test_ctypes but we will need to only mark the test as "expected failure" on win32. On win32 the test fails, as Michael is pointing out:

Current thread 0x00001140 (most recent call first):
  File "D:\a\1\s\lib\ctypes\test\test_callbacks.py", line 160 in test_i38748_stackCorruption
  File "D:\a\1\s\lib\unittest\case.py", line 547 in _callTestMethod
  File "D:\a\1\s\lib\unittest\case.py", line 591 in run
  File "D:\a\1\s\lib\unittest\case.py", line 646 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\test\regrtest.py", line 43 in _main
  File "D:\a\1\s\lib\test\regrtest.py", line 47 in <module>
  File "D:\a\1\s\lib\runpy.py", line 86 in _run_code
  File "D:\a\1\s\lib\runpy.py", line 196 in _run_module_as_main

…y mark this test as an expected failure on 32 bit (x86).

michaelDCurran · 2021-11-11T06:25:14Z

@ambv I have now marked the test as expected failure only on 32 bit (x86).
Therefore, on win64 the test is now expected to run successfully, which it does.

MaxwellDupre

Looks ok to me.

LeonarddeR · 2022-05-24T05:53:38Z

Any chance this can be processed in the near future?

dpy013 · 2022-06-04T00:46:48Z

Will this pull request be merged this year?

bedevere-bot · 2022-06-06T23:43:00Z

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

zooba · 2022-06-06T23:43:08Z

I don't understand the urgency to get a failing test added, but I don't see why not.

Hopefully someone is submitting a fix to libffi so that we can adopt it in their next release?

Lib/ctypes/test/test_callbacks.py

codeofdusk · 2022-06-07T01:05:37Z

I don't understand the urgency to get a failing test added

The associated bug is blocking a major screen reader project from upgrading its Python interpreter. The inclusion of this test would help facilitate tracking and hopefully start progress toward resolution.

zooba · 2022-06-07T16:39:47Z

Okay, so someone is working on the libffi fix then, and this is just for tracking. Good.

Incidentally, the test now crashes in CI. It might need to be moved into the "crashers" tests, or run in a separate process so that the crash can be handled gracefully.

feerrenrut · 2022-06-08T09:07:28Z

Okay, so someone is working on the libffi fix then, and this is just for tracking. Good.

@zooba I don't think we know that. We figure the first step is to get this acknowledged and accepted as a bug affecting Python, hence this PR.

As stated already, this bug blocks the screen reader project (NVDA on GitHub) from being able to update past Python 3.7. This is a high priority for the project.

dpy013 · 2022-07-20T01:04:23Z

The way to get rid of python once and for all is to just drop the x86 version of python 3.x and go straight to the x64 version of python.
From what I've seen for so long, the python team doesn't have the idea of raising this pr to a higher review level.
So that nvaccess should turn passive to dimensional active.

atgreen · 2022-09-14T13:21:04Z

I'm the libffi author and just learned about this. I tried to replicate this in the libffi testsuite, here: https://github.com/libffi/libffi/blob/master/testsuite/libffi.call/bpo-38748.c
The test is passing on 32-bit x86 STDCALL ABI. Did I capture the test case properly? Which version of libffi are you using? And which compiler are you building with?

zooba · 2022-09-14T15:45:55Z

Assuming it's still current, it's libffi 3.4.2 (built from this mirror) and used whichever MSVC was the latest about a year ago. You can get our binaries from https://github.com/python/cpython-bin-deps/tree/libffi if you want to poke at them directly (no debug symbols, sorry).

zooba · 2022-09-14T15:56:19Z

I had a look at your test, and while I'm not 100% familiar with how you'd write this, I don't see anything there about stdcall? Maybe thats what ABI_ATTRandABI_NUM` mean, but I'm just not sure. The calling convention is an important piece here.

atgreen · 2022-09-14T18:22:49Z

Yes, that's right. See https://rl.gl/doc-text?id=RLGL-AKWR73CS
Search for bpo.
I was using gcc. Our CI testing only does 64-bit testing with msvc, so I guess we need to test with 32-bit as well.

zooba · 2022-09-14T20:48:20Z

Our CI testing only does 64-bit testing with msvc, so I guess we need to test with 32-bit as well.

FYI, though you probably know, MSVC only has a single calling convention on 64-bit and ignores the declarations. They only have an impact on 32-bit builds. So anything involving specific conventions won't show up unless you test 32-bit. (I believe ARM64 also only has one convention, but don't quote me on that... I also don't have the slightest idea how ARM64EC works on Windows...)

michaelDCurran · 2022-09-14T20:54:38Z

I've done some further investigation on this as well.
@atgreen hope this extra info may assist you:
It looks like that when libffi makes a function closure for the x86 stdcall calling convention, it incorrectly assumes that 8 byte parameters (E.g. int64, uint64, double) will be aligned on the stack to 8 bytes. This is not at all the case. Rather on x86 with msvc, these parameters are pushed onto the stack by the caller as two separate 4 byte values, and can happen on any 4 byte alignment. This assumption then means that libffi calculates that the caller will allocate more stack space than it actually would, and as the x86 stdcall calling convention requires the callee (in this case libffi) to clean up the stack on return, libffi therefore cleans up too much stack space, and therefore the stack is not in the state the caller left it.
In libffi 3.4.2, refer to line 185 of ffi_prep_cif_machdep in src/x86/ffi.c.

bytes = FFI_ALIGN (bytes, t->alignment);

If compiled on MSVC and it is x86 and stdcall, it probably should be:

bytes = FFI_ALIGN (bytes, FFI_SIZE_ARG);

Although the code has moved quite a bit since Python 3.7's fork of libffi, it is still clear why Python 3.7's fork of libffi did not have the bug. Refer to Python 3.7.9's ffi_prep_cif.c starting at line 157. It has an ifdef only allowing the alignment based on the argument's type if the compiler is not msvc or mingw32. It also has a very explicit comment:

/* Don't know if this is a libffi bug or not.  At least on 
   Windows with MSVC, function call parameters are *not*
   aligned in the same way as structure fields are, they are
   only aligned in integer boundaries.
   This doesn't do any harm for cdecl functions and closures,
   since the caller cleans up the stack, but it is wrong for
   stdcall functions where the callee cleans.
*/

A specific patch could be:

diff --git a/src/x86/ffi.c b/src/x86/ffi.c
index 24431c17..987e4eaf 100644
--- a/src/x86/ffi.c
+++ b/src/x86/ffi.c
@@ -181,8 +181,16 @@ ffi_prep_cif_machdep(ffi_cif *cif)
   for (i = 0, n = cif->nargs; i < n; i++)
     {
       ffi_type *t = cif->arg_types[i];
-
+      #if defined(_MSC_VER) && defined(_M_IX86)
+      if(cabi == FFI_STDCALL) 
+      {
+        bytes = FFI_ALIGN (bytes, FFI_SIZEOF_ARG);
+      } else {
+        bytes = FFI_ALIGN (bytes, t->alignment);
+      }
+      #else
       bytes = FFI_ALIGN (bytes, t->alignment);
+      #endif
       bytes += FFI_ALIGN (t->size, FFI_SIZEOF_ARG);
     }
   cif->bytes = bytes;

It is questionable as to the exact ifdef check, but this patch currently removes argument alignment on x86 (32 bit) only when compiled with msvc and only when the calling convention is stdcall.

Running the ctypes tests, including my testcase in #26204 (no longer marked as expected failure), all tests run without errors, including my test.

Running libffi's own testsuit also runs with no further errors than what were already occurring without my patch.

atgreen · 2022-09-19T22:14:58Z

I just released libffi 3.4.3, which should include a fix for this. Please try it out and let me know.

michaelDCurran · 2022-09-23T07:30:00Z

@atgreen Firstly thanks for looking into this, it is much appreciated.
However, I am having trouble building libffi 3.4.3 in order to verify tests in cpython on Windows.
If I checkout tag v3.4.3 from the libffi upstream repo, and follow the Windows/cygwin build instructions, I get a compilation error. Yet, v3.4.2 compiles fine.
The error is as follows:

libtool: compile:  /cygdrive/c/Users/mick/programming/git/cpython-source-deps/msvcc.sh -DHAVE_CONFIG_H -I. -I.. -I. -I../include -Iinclude -I../src -DFFI_BUILDING_DLL -I. -I../include -Iinclude -I../src -c ../src/x86/sysv_intel.S  -DDLL_EXPORT -DPIC -o src/x86/.libs/sysv_intel.obj
sysv_intel.S
 Assembling: src/x86/.libs/sysv_intel.asm
src/x86/.libs/sysv_intel.asm(1807) : error A2008:syntax error : *
src/x86/.libs/sysv_intel.asm(1808) : error A2008:syntax error : *
make[3]: *** [Makefile:1279: src/x86/sysv_intel.lo] Error 1

Seems like some kind of syntax error in sysv_intel.S?

I also get the exact same error if I create a local branch of libffi-3.4.3 in cpython-source-deps, and run cpython's pcbuild\prepair_libffi.bat -x86 in order to build libffi for cpython.
Again, v3.4.2 compiles okay.
Let me know if you require any extra info from me in order to debug this further.

zooba · 2022-09-23T12:40:59Z

Seems like some kind of syntax error in sysv_intel.S?

Yes, I see the same issue.

Let's take it to #96965 rather than this issue.

zooba · 2022-09-23T15:12:10Z

libffi has been updated in main (and soon, 3.11), so once this branch is updated and tests are run it should pass.

… the underlying issue in libffi has been addressed in libffi-3.4.3.

zooba · 2022-09-26T16:27:20Z

Perfect! Thanks for being persistent on this one.

the-knights-who-say-ni added the CLA not signed label May 18, 2021

bedevere-bot added the awaiting review label May 18, 2021

the-knights-who-say-ni added CLA signed and removed CLA not signed labels May 19, 2021

tirkarthi changed the title ~~Add test for stack corruption in issue 38748.~~ bpo-38748: Add test for stack corruption. May 21, 2021

github-actions bot added the stale Stale PR or inactive for long period of time. label Jun 21, 2021

feerrenrut mentioned this pull request Aug 27, 2021

Task: Update to latest (most appropriate) version of Python nvaccess/nvda#12064

Closed

ambv closed this Sep 24, 2021

ambv reopened this Sep 24, 2021

michaelDCurran added 2 commits November 11, 2021 16:07

Add test for stack corruption in issue 38748.

cd91426

ctypes.test.test_callbacks.Callbacks.test_i38748_stackCorruption: Onl…

07d66cb

…y mark this test as an expected failure on 32 bit (x86).

michaelDCurran force-pushed the test_i38748 branch from 9975f89 to 07d66cb Compare November 11, 2021 06:15

Add back newline

b1c6e53

MaxwellDupre approved these changes Mar 24, 2022

View reviewed changes

bedevere-bot added awaiting core review and removed awaiting review labels Mar 24, 2022

Merge branch 'main' into test_i38748

f0100eb

zooba added the skip news label Jun 6, 2022

zooba reviewed Jun 6, 2022

View reviewed changes

Lib/ctypes/test/test_callbacks.py Outdated Show resolved Hide resolved

Update Lib/ctypes/test/test_callbacks.py

800f0c0

AA-Turner removed CLA signed stale Stale PR or inactive for long period of time. labels Jun 8, 2022

dpy013 mentioned this pull request Sep 14, 2022

Feedback-bpo-38748: 32 bit ctypes stdcall callback fails to restore stack pointer libffi/libffi#727

Closed

zooba mentioned this pull request Sep 20, 2022

Update to libffi 3.4.3 #96965

Closed

michaelDCurran added 2 commits September 24, 2022 09:58

Merge branch 'main' into test_i38748

3525bbf

Ctypes callback test for 38748: no longer mark as expected failure as…

3f2fc41

… the underlying issue in libffi has been addressed in libffi-3.4.3.

zooba merged commit d79dd92 into python:main Sep 26, 2022

bedevere-bot removed the awaiting core review label Sep 26, 2022

DavidHeffernan mannequin mentioned this pull request Nov 27, 2022

32 bit ctypes stdcall callback fails to restore stack pointer #82929

Closed

gpshead changed the title ~~bpo-38748: Add test for stack corruption.~~ bpo-38748: gh-82929: Add test for stack corruption. Jan 26, 2023

ynkdir mentioned this pull request Jul 11, 2024

Run code error ynkdir/py-win32more#50

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bpo-38748: gh-82929: Add test for stack corruption. #26204

bpo-38748: gh-82929: Add test for stack corruption. #26204

michaelDCurran commented May 18, 2021 •

edited by gpshead

Loading

the-knights-who-say-ni commented May 18, 2021

github-actions bot commented Jun 21, 2021

feerrenrut commented Aug 27, 2021

ambv commented Sep 24, 2021

ambv commented Sep 25, 2021

michaelDCurran commented Nov 11, 2021

MaxwellDupre left a comment

LeonarddeR commented May 24, 2022

dpy013 commented Jun 4, 2022

bedevere-bot commented Jun 6, 2022

zooba commented Jun 6, 2022

codeofdusk commented Jun 7, 2022

zooba commented Jun 7, 2022

feerrenrut commented Jun 8, 2022

dpy013 commented Jul 20, 2022

atgreen commented Sep 14, 2022

zooba commented Sep 14, 2022

zooba commented Sep 14, 2022

atgreen commented Sep 14, 2022

zooba commented Sep 14, 2022 •

edited

Loading

michaelDCurran commented Sep 14, 2022

atgreen commented Sep 19, 2022

michaelDCurran commented Sep 23, 2022

zooba commented Sep 23, 2022

zooba commented Sep 23, 2022

zooba commented Sep 26, 2022

bpo-38748: gh-82929: Add test for stack corruption. #26204

bpo-38748: gh-82929: Add test for stack corruption. #26204

Conversation

michaelDCurran commented May 18, 2021 • edited by gpshead Loading

bpo-38748 aka #82929: add expected fail test for stack corruption when executing x86 stdcall ctypes callback

the-knights-who-say-ni commented May 18, 2021

CLA Missing

github-actions bot commented Jun 21, 2021

feerrenrut commented Aug 27, 2021

ambv commented Sep 24, 2021

ambv commented Sep 25, 2021

michaelDCurran commented Nov 11, 2021

MaxwellDupre left a comment

Choose a reason for hiding this comment

LeonarddeR commented May 24, 2022

dpy013 commented Jun 4, 2022

bedevere-bot commented Jun 6, 2022

zooba commented Jun 6, 2022

codeofdusk commented Jun 7, 2022

zooba commented Jun 7, 2022

feerrenrut commented Jun 8, 2022

dpy013 commented Jul 20, 2022

atgreen commented Sep 14, 2022

zooba commented Sep 14, 2022

zooba commented Sep 14, 2022

atgreen commented Sep 14, 2022

zooba commented Sep 14, 2022 • edited Loading

michaelDCurran commented Sep 14, 2022

atgreen commented Sep 19, 2022

michaelDCurran commented Sep 23, 2022

zooba commented Sep 23, 2022

zooba commented Sep 23, 2022

zooba commented Sep 26, 2022

michaelDCurran commented May 18, 2021 •

edited by gpshead

Loading

zooba commented Sep 14, 2022 •

edited

Loading