Enable -fstrict-overflow
#96821
Comments
This is definitely worth exploring! |
What do you mean by that? More concretely:
|
Also, do clang and msvc have something similar? |
In gh-96823 I asked about detecting compatibility for the compile flag. This is probably the better place to discuss that. Basically, how can we detect that a C file can be built with |
I ran clang's undefined behaviour sanitiser and address sanitiser over the test suite to detect the modules that need defined overflow. Strictly speaking, detection this way can only show the presence of a need for defined overflow, never its absence. But that's no different from any other bug or undefined behaviour: we rely on the additional assumption that our test suite has enough coverage. One small complication: when you use the compiler flag to make signed integer overflow defined behaviour, the sanitisers no longer complain about it. So if you want to check whether the flag is still necessary for a module, you need to temporarily turn it off.
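To make the kind of finding concrete, here is a minimal sketch (not code from CPython; `wrapping_hash` is a made-up name) of a pattern that needs wraparound. Written with a signed accumulator, a multiply-and-add like this can overflow, and clang's `-fsanitize=signed-integer-overflow` would report it on a build without `-fwrapv`. Doing the arithmetic in an unsigned type keeps the wrapping but makes it defined by the standard, so the code no longer depends on the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only.
   Unsigned arithmetic is defined to wrap modulo 2^32, so this function
   behaves the same with or without -fwrapv. */
static uint32_t wrapping_hash(const unsigned char *data, size_t len)
{
    /* A NULL pointer is only acceptable for an empty input. */
    assert(data != NULL || len == 0);

    uint32_t h = 0;
    for (size_t i = 0; i < len; i++) {
        /* 31u forces the multiplication to happen in an unsigned type. */
        h = h * 31u + data[i];
    }
    return h;
}
```

For example, `wrapping_hash((const unsigned char *)"ab", 2)` evaluates to `97 * 31 + 98`, that is `3105`.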
Clang definitely has this flag. We actually use clang for the sanitisers. I don't know about msvc. There's no trace of anything like this in their docs https://docs.microsoft.com/en-us/cpp/build/reference/compiler-options-listed-by-category?view=msvc-170#code-generation But we can just stick to the current situation here: Only gcc and clang explicitly get flags about overflow, and for the other compilers we just hope and pray. (The situation will actually improve for the other compilers, when we make all modules strict-overflow safe.) For the actual implementation, I'll probably make |
I had a look at PEP 7. The relevant section is https://peps.python.org/pep-0007/#c-dialect The PEP says to use C11 and doesn't mention any deviation from that for overflow. So our code should already avoid signed integer overflow. At this point that's more of an aspiration than reality, though. So to some extent, we could describe the effort here as part of implementing PEP 7. Strictly speaking we don't need to change anything, but I think pragmatically we should make a note that up to 3.11 signed integer overflow was tacitly tolerated, but that beginning with 3.12 we are sticking to the standard. I created a draft PR python/peps#2796 for what that could look like. (Please suggest better wording, if you have ideas.) (Additionally, I notice that when I am building I get a few warnings here and there, but the PEP suggests that we shouldn't be getting any warnings. Perhaps I'll spend a bit of time fixing the code, or suppressing these warnings in known instances that we can't or won't fix.)
Another thing: whatever policy we decide on, we should probably fix that |
I just found some slightly bad news in the GCC docs:
If I understand that right, specifying So we might need to be completely clean before we can expect to get a performance benefit? That's where we want to get to eventually anyway, but it would have been nice if a piecemeal approach worked. Update: in experiments with clang and its undefined-behaviour sanitizer, the piecemeal approach seems to work. So at least for diagnostics it's good enough! I don't know about optimizations. |
I extended #96823 to incorporate what I wrote above. |
Left-shifting negative numbers is undefined behaviour. Fortunately, multiplication works just as well, is defined behaviour, and gets compiled to the same machine code as before by optimizing compilers.
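A minimal sketch of that rewrite, assuming a scale factor of 2^8 purely for illustration (the helper name `scale_sample` and the factor are not taken from `audioop.c`):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper for illustration only: scale a sample by 2^8. */
static int scale_sample(int sample)
{
    /* Precondition for this sketch: the product must fit in an int. */
    assert(sample >= INT_MIN / 256 && sample <= INT_MAX / 256);

    /* `sample << 8` is undefined in C11 when `sample` is negative.
       The multiplication is defined whenever the result is representable,
       and optimizing compilers typically emit a shift for it anyway. */
    return sample * 256;
}
```

With this version, `scale_sample(-3)` is `-768` on every conforming compiler, regardless of overflow-related flags.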
…nGH-96915) * pythongh-96821: Assert for demonstrating undefined behaviour * Fix UB Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM> (cherry picked from commit cbdeda8) Co-authored-by: Matthias Görgens <matthias.goergens@gmail.com>
* gh-96821: Assert for demonstrating undefined behaviour * Fix UB Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
@mdickinson Could you please have a look at the other PRs mentioned in this issue? There's only two left, and then we can turn on strict overflow for (hopefully) extra performance. I went with implementation defined behaviour for those two PRs. But I'm happy to adopt your clever method for non-implementation defined behaviour, if you prefer that. |
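For readers unfamiliar with the distinction: converting an out-of-range unsigned value to a signed type with a plain cast is implementation-defined in C11, not undefined. The sketch below shows one way to get the same result using only fully defined operations. It is an illustration under my own assumptions, not necessarily the method referred to above, and `to_signed32` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* int32_t is required to be two's complement; spell the assumption out. */
static_assert(INT32_MIN == -INT32_MAX - 1, "int32_t must be two's complement");

/* Hypothetical helper for illustration only.
   Reinterprets a 32-bit unsigned value as a two's complement signed value
   without relying on the implementation-defined result of (int32_t)u. */
static int32_t to_signed32(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX) {
        return (int32_t)u;  /* value is representable, conversion is exact */
    }
    /* Here u >= 2^31, so u - 2^31 lies in [0, 2^31 - 1] and converts exactly.
       Adding INT32_MIN then lands in [INT32_MIN, -1] without overflow. */
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

On gcc and clang a plain cast produces the same values; the difference is only in what the standard guarantees.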
I ran pyperformance benchmarks on a branch that had all the patches applied and
Benchmark hidden because not significant (16): scimark_sparse_mat_mult, regex_compile, sqlalchemy_imperative, html5lib, unpickle_list, pickle, chameleon, pickle_list, hexiom, genshi_xml, 2to3, richards, sympy_str, pickle_dict, unpack_sequence, sqlite_synth |
* gh-96821: Fix undefined behaviour in `audioop.c` Left-shifting negative numbers is undefined behaviour. Fortunately, multiplication works just as well, is defined behaviour, and gets compiled to the same machine code as before by optimizing compilers. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM> Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com> Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Co-authored-by: Shantanu <hauntsaninja@gmail.com>
I've fixed and merged #96823 . A follow up improvement could be to check for the compiler switch only if |
* main: (21 commits) pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives in sub interpreters module (python#102472) pythongh-95672: Fix versionadded indentation of get_pagesize in test.rst (pythongh-102455) pythongh-102416: Do not memoize incorrectly loop rules in the parser (python#102467) pythonGH-101362: Optimise PurePath(PurePath(...)) (pythonGH-101667) pythonGH-101362: Check pathlib.Path flavour compatibility at import time (pythonGH-101664) pythonGH-101362: Call join() only when >1 argument supplied to pathlib.PurePath() (python#101665) pythongh-102444: Fix minor bugs in `test_typing` highlighted by pyflakes (python#102445) pythonGH-102341: Improve the test function for pow (python#102342) Fix unused classes in a typing test (pythonGH-102437) pythongh-101979: argparse: fix a bug where parentheses in metavar argument of add_argument() were dropped (python#102318) pythongh-102356: Add thrashcan macros to filter object dealloc (python#102426) Move around example in to_bytes() to avoid confusion (python#101595) pythonGH-97546: fix flaky asyncio `test_wait_for_race_condition` test (python#102421) pythongh-96821: Add config option `--with-strict-overflow` (python#96823) pythongh-101992: update pstlib module documentation (python#102133) pythongh-63301: Set exit code when tabnanny CLI exits on error (python#7699) pythongh-101863: Fix wrong comments in EUC-KR codec (pythongh-102417) pythongh-102302 Micro-optimize `inspect.Parameter.__hash__` (python#102303) pythongh-102179: Fix `os.dup2` error reporting for negative fds (python#102180) pythongh-101892: Fix `SystemError` when a callable iterator call exhausts the iterator (python#101896) ...
At the moment we compile releases with `-fwrapv`, which makes the code a bit safer, but disables certain optimizations. From the GCC docs:

My experiments with running sanitisers seem to suggest that we are nearly ready for `-fno-wrapv` (or `-fstrict-overflow` in general). Doing so could lead to quite a few speedups, but we would need to be more careful with the code we write.

It might be worthwhile to get a few benchmarks.
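As a rough illustration of the trade-off (a sketch, not code from CPython; the function names are made up): without `-fwrapv`, signed overflow is undefined, so a compiler is allowed to assume that `x + 1` never wraps and may simplify the comparison below to a constant. With `-fwrapv` it has to account for `x == INT_MAX`, where the sum wraps around to `INT_MIN`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical example function for illustration only. */
static int next_is_greater(int x)
{
    /* Without -fwrapv: calling this with INT_MAX is undefined behaviour,
       so the compiler may fold the whole body to `return 1;`.
       With -fwrapv: INT_MAX + 1 wraps to INT_MIN and the result is 0. */
    return x + 1 > x;
}

/* Usage sketch: only inputs that cannot overflow give the same answer
   under both sets of flags. */
void next_is_greater_demo(void)
{
    assert(next_is_greater(0) == 1);
    assert(next_is_greater(INT_MAX - 1) == 1);
}
```

This is also why the code needs more care once the flag is dropped: any caller that used to rely on the wrapped result silently changes meaning.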
(To be extra precise, we give `-fwrapv` for clang and gcc for any build that doesn't get `--with-pydebug`.)

Pitch
My plan right now is to adapt the build system so that only the modules that need it are built with `-fwrapv`, and the rest can be built with `-fstrict-overflow`.

We already have config machinery that can add specific `CFLAGS` for specific modules only.

Perhaps the whole thing can be gated behind a configure flag, like `--with-strict-overflow`.

If everything goes well and this improves performance, we can consider adding this functionality to one of the standard optimization options.

We can also work on making more modules `-fstrict-overflow` safe.

Previous discussion
@markshannon @ericsnowcurrently
Brought up on faster-cpython/ideas#458 and inspired by #96678
Some previous issues around `-fwrapv`:

I'm sure there are more.
Progress so far
As far as is currently known, the three remaining modules that rely on defined integer overflow are fixed by:

* `_struct`: gh-96735: Fix undefined behaviour in struct unpacking functions #96739
* `audioop`: gh-96821: Fix undefined behaviour in `audioop.c` #96923
* `_ctypes`: gh-96821: Fix undefined behaviour in `_ctypes/cfield.c` #96925

Linked PRs

* `--with-strict-overflow` #96823