gh-96735: Fix undefined behaviour in struct unpacking functions #96739
…behaviour' into fix-struct-unpack-undefined-behaviour
It's very tempting to convert all of these functions to use fixed-width integer types ( |
LGTM, I tested this and indeed it fixes the UB.
Thanks for fixing the other ones too, just noticed that those were also affected.
I agree, it would be better but best left for 3.12 only.
🤖 New build scheduled with the buildbot fleet by @mdickinson for commit 1294440 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.
@kumaraditya303 I've updated this PR to be more efficient, while still avoiding undefined behaviour and implementation-defined behaviour (specifically, the conversion from an unsigned type to the corresponding signed type when the value being converted is not representable in the signed type; cf. C99 §6.3.1.3p3). The sign-extension is now branch free (and should compile to a no-op in the case that the C type being used has exactly the correct width), and the conversion from unsigned to signed should always compile to a no-op on any semi-reasonable compiler. Godbolt example for the case where sign extension is needed: https://godbolt.org/z/e8roKrn6r
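For readers following along, here is a minimal sketch of the two steps described in the comment above: a branch-free sign extension done entirely in unsigned arithmetic, followed by an unsigned-to-signed conversion that never asks the compiler to convert an out-of-range value. The function names and the use of fixed-width types are assumptions made for illustration; this is not necessarily the code in the PR.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 16-bit value held in the low bits of a
   32-bit unsigned integer. Flipping the sign bit and then subtracting it
   propagates the sign into the upper bits. Unsigned arithmetic wraps
   modulo 2^32, so no step here is undefined. */
uint32_t
sign_extend_16(uint32_t x)
{
    return (x ^ UINT32_C(0x8000)) - UINT32_C(0x8000);
}

/* Hypothetical helper: reinterpret a 32-bit unsigned value as two's
   complement signed. When the top bit is set, ~x has the top bit clear and
   so fits in int32_t; -1 - (int32_t)(~x) then gives the negative value.
   Every conversion performed is of an in-range value, so the
   implementation-defined case of C99 6.3.1.3p3 is never reached. */
int32_t
to_signed_32(uint32_t x)
{
    return (x & UINT32_C(0x80000000)) ? -1 - (int32_t)(~x) : (int32_t)x;
}
```

With these definitions, to_signed_32(sign_extend_16(0xFFFF)) evaluates to -1 and to_signed_32(sign_extend_16(0x8000)) to -32768.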
We are currently compiling all non-pydebug builds with I don't think we will turn off I'd say: just do the Right Thing for 3.12 in See also #96821 and #96678 (comment) As a minimal change for 3.11, I would suggest enabling That being said, I'm not opposed to this PR. It's a great start to getting everything safe for |
Agreed; I think I'll merge this for 3.12 only.
Some not very rigorous timings, on macOS/Intel, non-optimised non-debug build. (This PR is not primarily about performance, but it would be unfortunate if it caused a significant performance regression.) On this branch:
On the main branch at commit 6281aff:
This PR fixes undefined behaviour in the struct module unpacking support functions bu_longlong, lu_longlong, bu_int and lu_int; thanks to @kumaraditya303 for finding these.
The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op. (Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)
To make the conversions efficient, I've specialised the relevant functions for their output size: for bu_longlong and lu_longlong, this only entails checking that the output size is indeed 8. But bu_int and lu_int were used for format sizes 2 and 4 - I've split those into two separate functions each.
No tests, because all of the affected cases are already exercised by the test suite.
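The approach in the description can be sketched as below, assuming fixed-width types and hypothetical function names (unpack_be_int32, unpack_le_int32, unpack_be_int16, as_int32) that do not appear in CPython. Left-shifting a signed integer is undefined in C when the value is negative or the result is not representable, which is why the bytes are accumulated in an unsigned type and converted only at the end. Each function is specialised for one input size, mirroring the split into size-2 and size-4 variants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert to signed without ever converting an
   out-of-range value (two's complement interpretation of x). */
int32_t
as_int32(uint32_t x)
{
    return (x & UINT32_C(0x80000000)) ? -1 - (int32_t)(~x) : (int32_t)x;
}

/* Decode 4 big-endian bytes: most significant byte first. */
int32_t
unpack_be_int32(const unsigned char *p)
{
    uint32_t x = 0;
    for (size_t i = 0; i < 4; i++) {
        x = (x << 8) | p[i];   /* unsigned shift: well defined */
    }
    return as_int32(x);
}

/* Decode 4 little-endian bytes: walk the buffer from the last byte. */
int32_t
unpack_le_int32(const unsigned char *p)
{
    uint32_t x = 0;
    for (size_t i = 4; i > 0; i--) {
        x = (x << 8) | p[i - 1];
    }
    return as_int32(x);
}

/* Decode 2 big-endian bytes. The accumulator is wider than the input,
   so the sign bit (bit 15) must be extended before converting. */
int32_t
unpack_be_int16(const unsigned char *p)
{
    uint32_t x = ((uint32_t)p[0] << 8) | p[1];
    x = (x ^ UINT32_C(0x8000)) - UINT32_C(0x8000);   /* branch-free */
    return as_int32(x);
}
```

In the 4-byte functions the accumulator has exactly the output width, so no sign extension is needed and the final conversion is the only extra step. In the 2-byte function the XOR-and-subtract line does the sign extension.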
struct.unpack
#96735