fix big-endian bitmasks smaller than a byte #267
Hihi, the assertion actually fails. :D Fun, fun. Is this a bug in LLVM or in portable-simd? |
...Ferris have mercy. |
Oh right, I think this follows different rules based on endianness. |
That would be a quirk only on masks less than 8 bits wide though. Output is the same across endianness for larger masks. Also, https://doc.rust-lang.org/nightly/core/simd/trait.ToBitMask.html#tymethod.to_bitmask does not have a ton of detail, but it says: "Each bit of the bitmask corresponds to a mask lane, starting with the LSB." |
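To make the quoted rule concrete, here is a minimal sketch in plain Rust of what "starting with the LSB" implies for a mask narrower than a byte. It is only a model: `pack_lsb_first` is a hypothetical helper made up for illustration, not part of portable-simd, and it uses a slice of `bool` rather than a real mask type.

```rust
/// Pack boolean lanes into a byte with lane 0 in the least significant bit.
/// Hypothetical helper for illustration only; not a portable-simd API.
fn pack_lsb_first(lanes: &[bool]) -> u8 {
    assert!(lanes.len() <= 8, "this sketch only handles up to 8 lanes");
    lanes
        .iter()
        .enumerate()
        // Lane `i` contributes bit `i`; unused high bits stay zero.
        .fold(0u8, |acc, (i, &set)| acc | ((set as u8) << i))
}
```

Under this model a 4-lane mask with only the last lane set gives `pack_lsb_first(&[false, false, false, true]) == 0b1000`, and the upper four bits are always zero regardless of target endianness.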
So I think this here:
should probably additionally |
Yeah, a rotate or left shift would be appropriate (I wonder which would be faster or optimize better?). I never got around to it, but I did intend on providing a target-specific bitmask function as well, which has an unspecified (but consistent) ordering. |
iirc llvm doesn't actually specify bitmask layout if the length times the element size in bits isn't a multiple of 8:
so, imho we should use a swizzle to convert the vector to have a multiple of 8 lanes, then bitcast to bytes, rather than bitcasting then trying to fix up after the fact by shifting/rotating. |
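A rough model of the pad-first idea, in plain Rust rather than real SIMD code. It assumes the big-endian layout discussed in this thread (lane 0 ends up in the most significant bit of the bitcast integer). All function names are hypothetical, and padding a `bool` array stands in for the swizzle.

```rust
/// Model of a big-endian vector-to-integer bitcast (assumption from this
/// thread): lane 0 lands in the most significant bit of an integer that is
/// exactly `lanes.len()` bits wide, which is then zero-extended to a byte.
fn pack_msb_first(lanes: &[bool]) -> u8 {
    assert!(lanes.len() <= 8, "this sketch only handles up to 8 lanes");
    lanes.iter().fold(0u8, |acc, &set| (acc << 1) | set as u8)
}

/// Pad a short mask to 8 lanes with `false`; stands in for the swizzle.
fn pad_to_byte(lanes: &[bool]) -> [bool; 8] {
    assert!(lanes.len() <= 8, "this sketch only handles up to 8 lanes");
    let mut padded = [false; 8];
    padded[..lanes.len()].copy_from_slice(lanes);
    padded
}

/// Pad first, then pack. With a full byte of lanes, a plain bit reversal
/// is enough to get LSB-first order; no extra shift is needed afterwards.
fn bitmask_pad_then_pack(lanes: &[bool]) -> u8 {
    pack_msb_first(&pad_to_byte(lanes)).reverse_bits()
}
```

In this model `bitmask_pad_then_pack(&[false, false, false, true])` yields `0b1000`, because the padding lanes occupy the low bits before the reversal and the high bits after it.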
Swizzle, then bitcast demo: Output assembly for example::to_bitmask:
movd xmm0, dword ptr [rdi]
pmovmskb eax, xmm0
ret |
rustc zero-extends the integer before storing it: https://github.com/rust-lang/rust/blob/461e8078010433ff7de2db2aaae8a3cfb0847215/compiler/rustc_codegen_llvm/src/intrinsic.rs#L1109 |
ah, so that means the problem is that we're bit reversing an |
Yep, more or less. |
Is the LLVM behavior with the reversed bitmask on big-endian documented somewhere? I couldn't find anything in the llvm docs. |
It'd almost be better if the bits would be first rotated and then zero-extended... but that might not be feasible. |
I originally found it by reading through the source for const vector bitcast, but later discovered it's in the llvm ir language reference:
|
Indeed a shift does it; I pushed that to this PR. |
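For readers following along, a small plain-Rust model of why the shift is needed. This is a sketch under the assumptions stated in this thread (big-endian bitcast puts lane 0 in the most significant bit of the narrow integer, and the result is zero-extended to a byte); the function names are hypothetical and this is not the code pushed to the PR.

```rust
/// Model of the big-endian bitcast plus zero-extension described above:
/// lane 0 is the most significant bit of a `lanes.len()`-bit integer.
fn pack_msb_first(lanes: &[bool]) -> u8 {
    assert!(lanes.len() <= 8, "this sketch only handles up to 8 lanes");
    lanes.iter().fold(0u8, |acc, &set| (acc << 1) | set as u8)
}

/// Reversing the whole byte leaves the mask bits in the HIGH part of the
/// byte whenever there are fewer than 8 lanes.
fn bitmask_reverse_only(lanes: &[bool]) -> u8 {
    pack_msb_first(lanes).reverse_bits()
}

/// Reversing and then shifting right by the number of unused bits moves
/// the mask back down so that lane 0 sits in the least significant bit.
fn bitmask_reverse_then_shift(lanes: &[bool]) -> u8 {
    assert!(!lanes.is_empty(), "need at least one lane");
    let unused_bits = 8 - lanes.len();
    pack_msb_first(lanes).reverse_bits() >> unused_bits
}
```

With lanes `[false, false, false, true]`, the reverse-only variant produces `0b1000_0000` in this model, while the reverse-then-shift variant produces `0b1000`, matching the LSB-first rule. For a full 8-lane mask the shift amount is zero and both variants agree.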
Thank you! Everything looks in order here, I think? |
portable-simd: test bitmasks smaller than a byte Blocked on rust-lang/portable-simd#267 propagating to the [rustc repo](https://github.com/rust-lang/rust/tree/master/library/portable-simd)
I don't actually know if `0b1000` is the expected value here on big-endian systems, but that seems more consistent with little-endian so I guess it is? Anyway, the fact that there is extra wiggle room here since the bitmask has to be padded to 8 bits means this seems like a case worth testing.