-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve libcall relocations for older CPUs #5567
Conversation
Long ago Wasmtime used to have logic for resolving relocations post-compilation for libcalls which I ended up removing during refactorings last year. As bytecodealliance#5563 points out, however, it's possible to get Wasmtime to panic by disabling SSE features which forces Cranelift to use libcalls for some floating-point operations instead. Note that this also requires disabling SIMD because SIMD support has a baseline of SSE 4.2. This commit pulls back the old implementations of various libcalls and reimplements logic necessary to have them work on CPUs without SSE 4.2 Closes bytecodealliance#5563
I primarily tested this through:
to find all instructions that need libcall lowerings. Cranelift still has more libcalls but it seems that we don't end up relying on them, so I've left various panics in place to notify us if libcalls pop up and they're not handled. Otherwise the fuzzing has run for awhile now and hasn't found more instructions so this is likely at least a good set to start with. |
Subscribe to Label Actioncc @fitzgen
This issue or pull request has been labeled: "fuzzing"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that this looks like it has no runtime cost except when it's actually needed. I have a couple questions but overall this looks like the right plan to me!
let sse3 = u.arbitrary::<bool>()?; | ||
let ssse3 = u.arbitrary::<bool>()?; | ||
let sse4_1 = u.arbitrary::<bool>()?; | ||
let sse4_2 = u.arbitrary::<bool>()?; | ||
|
||
for (name, val) in flags { | ||
match name.as_str() { | ||
"has_sse3" => *val = sse3.to_string(), | ||
"has_ssse3" => *val = ssse3.to_string(), | ||
"has_sse41" => *val = sse4_1.to_string(), | ||
"has_sse42" => *val = sse4_2.to_string(), | ||
_ => {} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after reading the above comment several times I still had a hard time figuring out that these calls to arbitrary()
can't be moved into the loop.
How about something like this to avoid having to keep two lists of feature names in sync, and hopefully make it less tempting to inline the variables into the loop?
let sse3 = u.arbitrary::<bool>()?; | |
let ssse3 = u.arbitrary::<bool>()?; | |
let sse4_1 = u.arbitrary::<bool>()?; | |
let sse4_2 = u.arbitrary::<bool>()?; | |
for (name, val) in flags { | |
match name.as_str() { | |
"has_sse3" => *val = sse3.to_string(), | |
"has_ssse3" => *val = ssse3.to_string(), | |
"has_sse41" => *val = sse4_1.to_string(), | |
"has_sse42" => *val = sse4_2.to_string(), | |
_ => {} | |
} | |
} | |
let new_flags = ["has_sse3", "has_ssse3", "has_sse41", "has_sse42"] | |
.into_iter() | |
.map(|name| (name, u.arbitrary()?)) | |
.collect::<arbitrary::Result<HashMap<&'static str, bool>>()?; | |
for (name, val) in flags { | |
if let Some(new_value) = new_flags.get(name) { | |
*val = new_value.to_string(); | |
} | |
} |
Also, I think this is okay, but just to check: Each of these features implies the previous one (e.g. if you have SSE4.2 you also have SSE4.1). Is it okay to turn some off without turning off later ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay to turn some off without turning off later ones?
While I'm not 100% certain myself I believe that this question is up to Cranelift mostly. I believe Cranelift is mostly structured around "if this feature is active emit this instr" while it doesn't make sense to disable sse3 but leave sse4.1 enabled I don't think it will break anything since it would be a sort of "chaos mode" for cranelift stressing it.
If this becomes a problem for Cranelift, however, we can tweak the fuzz input to respect what the actual CPU feature hierarchy is.
const TOINT_32: f32 = 1.0 / f32::EPSILON; | ||
const TOINT_64: f64 = 1.0 / f64::EPSILON; | ||
|
||
pub extern "C" fn nearestf32(x: f32) -> f32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice if there was a good way to share these implementations of round_ties_even
with the ones in cranelift/codegen/src/ir/immediates.rs
. None come to my mind off-hand since the runtime can be built in a configuration without the compiler, but it'd be nice.
The existing version there doesn't have the "Check for NaNs" section that this version does. I haven't thought hard enough about this to understand what that's for, but maybe you could add a comment linking to some reference for why this is the right implementation?
The existing version does have a link to rust-lang/rust#96710 which I think is worth copying here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These libcalls were removed in #4470 and I copied their source from just before that commit.
Apparently nearest
has a slightly storied history:
- Fix remaining failures and re-enable x64 new backend CI #2219 fixed it for NaN cases
- runtime: new implementations for nearest lib calls #2171 made the implementation more efficient based on musl
- Implement a wast test harness #34 was the original implementation
I ran cargo run wast ./tests/spec_testsuite/f32.wast --cranelift-set has_sse41=false --wasm-features=-simd
and commented out the "Check for NaNs" block and the tests failed, so the block is definitely load-bearing. That may mean that Cranelift's implementation needs to be updated to account for NaN's perhaps? (I don't really understand the actual underlying code here I'm just copying bytes)
Be sure to factor in the offset of the function itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my questions! Still looks good to me. 👍
A fuzz bug came in last night from bytecodealliance#5567 where spectest fuzzing will first generate a config, possibly with SSE features for SIMD disabled, only to have SIMD later enabled by `set_spectest_compliant`. This commit fixes the issue by changing to `is_spectest_compliant` as a query and throwing out the fuzz case if it isn't. This means that the spectest fuzzer will throw out more inputs but means we can continue to generate interesting configs and such for other inputs.
A fuzz bug came in last night from #5567 where spectest fuzzing will first generate a config, possibly with SSE features for SIMD disabled, only to have SIMD later enabled by `set_spectest_compliant`. This commit fixes the issue by changing to `is_spectest_compliant` as a query and throwing out the fuzz case if it isn't. This means that the spectest fuzzer will throw out more inputs but means we can continue to generate interesting configs and such for other inputs.
Long ago Wasmtime used to have logic for resolving relocations post-compilation for libcalls which I ended up removing during refactorings last year. As #5563 points out, however, it's possible to get Wasmtime to panic by disabling SSE features which forces Cranelift to use libcalls for some floating-point operations instead. Note that this also requires disabling SIMD because SIMD support has a baseline of SSE 4.2.
This commit pulls back the old implementations of various libcalls and reimplements logic necessary to have them work on CPUs without SSE 4.2
Closes #5563