Resolve libcall relocations for older CPUs #5567

alexcrichton · 2023-01-12T23:20:03Z

Long ago Wasmtime used to have logic for resolving relocations post-compilation for libcalls which I ended up removing during refactorings last year. As #5563 points out, however, it's possible to get Wasmtime to panic by disabling SSE features which forces Cranelift to use libcalls for some floating-point operations instead. Note that this also requires disabling SIMD because SIMD support has a baseline of SSE 4.2.

This commit pulls back the old implementations of various libcalls and reimplements logic necessary to have them work on CPUs without SSE 4.2

Closes #5563

Long ago Wasmtime used to have logic for resolving relocations post-compilation for libcalls which I ended up removing during refactorings last year. As bytecodealliance#5563 points out, however, it's possible to get Wasmtime to panic by disabling SSE features which forces Cranelift to use libcalls for some floating-point operations instead. Note that this also requires disabling SIMD because SIMD support has a baseline of SSE 4.2. This commit pulls back the old implementations of various libcalls and reimplements logic necessary to have them work on CPUs without SSE 4.2 Closes bytecodealliance#5563

alexcrichton · 2023-01-12T23:21:16Z

I primarily tested this through:

ALLOWED_ENGINES=wasmi ALLOWED_MODULES=single-inst cargo +nightly fuzz run -s none differential --no-default-features

to find all instructions that need libcall lowerings. Cranelift still has more libcalls but it seems that we don't end up relying on them, so I've left various panics in place to notify us if libcalls pop up and they're not handled. Otherwise the fuzzing has run for awhile now and hasn't found more instructions so this is likely at least a good set to start with.

github-actions · 2023-01-12T23:23:03Z

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:

fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

jameysharp

I like that this looks like it has no runtime cost except when it's actually needed. I have a couple questions but overall this looks like the right plan to me!

crates/environ/src/obj.rs

jameysharp · 2023-01-13T19:59:05Z

crates/fuzzing/src/generators/codegen_settings.rs

+            let sse3 = u.arbitrary::<bool>()?;
+            let ssse3 = u.arbitrary::<bool>()?;
+            let sse4_1 = u.arbitrary::<bool>()?;
+            let sse4_2 = u.arbitrary::<bool>()?;
+
+            for (name, val) in flags {
+                match name.as_str() {
+                    "has_sse3" => *val = sse3.to_string(),
+                    "has_ssse3" => *val = ssse3.to_string(),
+                    "has_sse41" => *val = sse4_1.to_string(),
+                    "has_sse42" => *val = sse4_2.to_string(),
+                    _ => {}
+                }
+            }


Even after reading the above comment several times I still had a hard time figuring out that these calls to arbitrary() can't be moved into the loop.

How about something like this to avoid having to keep two lists of feature names in sync, and hopefully make it less tempting to inline the variables into the loop?

Suggested change

let sse3 = u.arbitrary::<bool>()?;

let ssse3 = u.arbitrary::<bool>()?;

let sse4_1 = u.arbitrary::<bool>()?;

let sse4_2 = u.arbitrary::<bool>()?;

for (name, val) in flags {

match name.as_str() {

"has_sse3" => *val = sse3.to_string(),

"has_ssse3" => *val = ssse3.to_string(),

"has_sse41" => *val = sse4_1.to_string(),

"has_sse42" => *val = sse4_2.to_string(),

_ => {}

}

}

let new_flags = ["has_sse3", "has_ssse3", "has_sse41", "has_sse42"]

.into_iter()

.map(|name| (name, u.arbitrary()?))

.collect::<arbitrary::Result<HashMap<&'static str, bool>>()?;

for (name, val) in flags {

if let Some(new_value) = new_flags.get(name) {

*val = new_value.to_string();

}

}

Also, I think this is okay, but just to check: Each of these features implies the previous one (e.g. if you have SSE4.2 you also have SSE4.1). Is it okay to turn some off without turning off later ones?

Is it okay to turn some off without turning off later ones?

While I'm not 100% certain myself I believe that this question is up to Cranelift mostly. I believe Cranelift is mostly structured around "if this feature is active emit this instr" while it doesn't make sense to disable sse3 but leave sse4.1 enabled I don't think it will break anything since it would be a sort of "chaos mode" for cranelift stressing it.

If this becomes a problem for Cranelift, however, we can tweak the fuzz input to respect what the actual CPU feature hierarchy is.

jameysharp · 2023-01-13T20:15:48Z

crates/runtime/src/libcalls.rs

+    const TOINT_32: f32 = 1.0 / f32::EPSILON;
+    const TOINT_64: f64 = 1.0 / f64::EPSILON;
+
+    pub extern "C" fn nearestf32(x: f32) -> f32 {


It'd be nice if there was a good way to share these implementations of round_ties_even with the ones in cranelift/codegen/src/ir/immediates.rs. None come to my mind off-hand since the runtime can be built in a configuration without the compiler, but it'd be nice.

The existing version there doesn't have the "Check for NaNs" section that this version does. I haven't thought hard enough about this to understand what that's for, but maybe you could add a comment linking to some reference for why this is the right implementation?

The existing version does have a link to rust-lang/rust#96710 which I think is worth copying here.

These libcalls were removed in #4470 and I copied their source from just before that commit.

Apparently nearest has a slightly storied history:

Fix remaining failures and re-enable x64 new backend CI #2219 fixed it for NaN cases

runtime: new implementations for nearest lib calls #2171 made the implementation more efficient based on musl

Implement a wast test harness #34 was the original implementation

I ran cargo run wast ./tests/spec_testsuite/f32.wast --cranelift-set has_sse41=false --wasm-features=-simd and commented out the "Check for NaNs" block and the tests failed, so the block is definitely load-bearing. That may mean that Cranelift's implementation needs to be updated to account for NaN's perhaps? (I don't really understand the actual underlying code here I'm just copying bytes)

Be sure to factor in the offset of the function itself

jameysharp

Thanks for addressing my questions! Still looks good to me. 👍

A fuzz bug came in last night from bytecodealliance#5567 where spectest fuzzing will first generate a config, possibly with SSE features for SIMD disabled, only to have SIMD later enabled by `set_spectest_compliant`. This commit fixes the issue by changing to `is_spectest_compliant` as a query and throwing out the fuzz case if it isn't. This means that the spectest fuzzer will throw out more inputs but means we can continue to generate interesting configs and such for other inputs.

A fuzz bug came in last night from #5567 where spectest fuzzing will first generate a config, possibly with SSE features for SIMD disabled, only to have SIMD later enabled by `set_spectest_compliant`. This commit fixes the issue by changing to `is_spectest_compliant` as a query and throwing out the fuzz case if it isn't. This means that the spectest fuzzer will throw out more inputs but means we can continue to generate interesting configs and such for other inputs.

alexcrichton marked this pull request as ready for review January 12, 2023 23:20

github-actions bot added the fuzzing Issues related to our fuzzing infrastructure label Jan 12, 2023

jameysharp approved these changes Jan 13, 2023

View reviewed changes

alexcrichton added 3 commits January 17, 2023 09:06

Fix log message in wast support

90443a3

Fix offset listed in relocations

6c53b45

Be sure to factor in the offset of the function itself

Review comments

47e59f4

jameysharp approved these changes Jan 17, 2023

View reviewed changes

alexcrichton merged commit 9b896d2 into bytecodealliance:main Jan 18, 2023

alexcrichton deleted the libcalls branch January 18, 2023 15:04

alexcrichton mentioned this pull request Jan 19, 2023

Change spectest fuzzing to throw out some fuzz inputs #5597

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Resolve libcall relocations for older CPUs #5567

Resolve libcall relocations for older CPUs #5567

alexcrichton commented Jan 12, 2023

alexcrichton commented Jan 12, 2023

github-actions bot commented Jan 12, 2023

jameysharp left a comment

jameysharp Jan 13, 2023

alexcrichton Jan 17, 2023

jameysharp Jan 13, 2023

alexcrichton Jan 17, 2023

jameysharp left a comment

Resolve libcall relocations for older CPUs #5567

Resolve libcall relocations for older CPUs #5567

Conversation

alexcrichton commented Jan 12, 2023

alexcrichton commented Jan 12, 2023

github-actions bot commented Jan 12, 2023

Subscribe to Label Action

jameysharp left a comment

Choose a reason for hiding this comment

jameysharp Jan 13, 2023

Choose a reason for hiding this comment

alexcrichton Jan 17, 2023

Choose a reason for hiding this comment

jameysharp Jan 13, 2023

Choose a reason for hiding this comment

alexcrichton Jan 17, 2023

Choose a reason for hiding this comment

jameysharp left a comment

Choose a reason for hiding this comment