-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unaligned loads/stores on x86 #21
Comments
I'm not convinced that the unaligned loads and stores are invalid here. Usually these I don't know of any x86 compilers that would use aligned SSE loads and stores for such code. This would probably break half of the existing code in existence. This aside, I added If we're fixing a UBsan issue, I'd prefer that it be reproduced first. |
This is not "undefined behaviour" but "unspecified behaviour" (not as bad as undefined behaviour but still bad enough to fix it). By C++ 11 Std 5.2.10 [expr.reinterpret.cast] paragraph 7, reinterpret_cast<T*> equals static_cast<T*> of static_cast<void*>. By 5.2.9 [expr.static.cast] paragraph 13, the static cast from void* to T* works, "If [...] A satisfies the alignment requeriment of T" where A is the address the void* points to. Otherwise "The result of any other such pointer conversion is unpecified". Therfore, you can't be sure the result of the expression gives you the value you expect, but at least you know it gives you any value and the rest of your programm will still work (can't get rationalized away because of undefined behaviour). However, since you don't know what address exactly gets accessed it could yield SIGSEG and the memory sanitizer could fail (that might have happened to the OP). If the value would have been "implementation defined" you could use the code macro-guarded for different compilers that define the desired behaviour (in opposite to different platforms). Even despite it's "unspecified behaviour" still some compiler could define a behaviour and you still could use the compiler-guard for them. Note that recent g++ and clang produce optimal machine code (unaligned mov + possibly bswap) in -O2 (clang from -Og) for code like uint32_t load_big_uint32(uint8_t const * x) { when you read 2 (clang only), 4 or 8 bytes. See: https://godbolt.org/g/HehhCs |
I know this is closed already but https://www.boost.org/users/history/version_1_67_0.html does not mention these changes although they actually do fix undefined behavior (it actually is undefined, not just unspecified) and can and does indeed lead to crashes.
I came across such an example in https://stackoverflow.com/a/46790815/1930508 and created a reproducer on godbolt using boost::endian: https://godbolt.org/z/GDipuT The example is very straight-forward: Simply sum all entries of an unaligned array of boost endian buffers and it will crash. So I'd give this some more exposure. |
while validating my code in UBsan (-fsanitize=undefined) I learned that the endian buffers perform unaligned loads and stores on x86-family of machines.
There are preprocessor checks here and a few other places.
I suggest these checks be removed, and the code honor the C++ abstract machine's requirements.
In my mind they are problematic because:
Furthermore, I would expect the compiler to be able to identify a 4-byte or 8-byte
memcpy()
as a potentially unaligned store/load, and generate amov
(which the compiler knows allows unaligned access), but not to generate SSE instructions.I'm proposing that these special cases be removed
[1]. my x86 compiler is free to generate SSE instructions that do require aligned access
[2]. Imagine storing a value through a pointer (that the standard requires be aligned) and later check the pointer to see whether or not its last few bits are zero. The compiler may remove the check since it must be aligned, otherwise you invoked UB anyway.
The text was updated successfully, but these errors were encountered: