Add splat simplify opt for various ops #6851

gurry · 2023-08-16T04:03:55Z

Fixes #6828

github-actions · 2023-08-16T05:44:49Z

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle
fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

namely, popcnt, smin, umin, smax, umax, sadd_sat, uadd_sat, ssub_sat & usub_sat

and avg_round as well

gurry · 2023-08-17T08:58:36Z

@afonso360 The following two wasmtime spec tests are failing for the PR:

wast::Cranelift::spec::simd_splat
wast::Cranelift::spec::simd_splat_pooling

They are failing for the following operators: sadd_sat, usub_sat & sshr. If I remove splat simplify opts for these operators the tests pass.

Can you please help me find the root cause of this failure? Do I need to tweak the opts for the above mentioned operators somehow to make the tests pass?

afonso360 · 2023-08-17T10:53:37Z

This ended up being quite a long reply, sorry! I just wanted to note down the steps that I use to debug these sort of stuff, it might be useful in the future! (There's a TLDR at the bottom)

So, running the tests individually shows us what went wrong. These test suites contain a bunch of individual tests so there is more than one failure. Here's what I got:

running 1 test
test wast::Cranelift::spec::simd_splat_pooling ... FAILED

failures:

---- wast::Cranelift::spec::simd_splat_pooling stdout ----
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/codegen/src/isa/x64/lower/isle.rs:550:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'should be implemented in ISLE: inst = `v10 = sadd_sat.i8 v5, v7`, type = `Some(types::I8)`', cranelift/codegen/src/machinst/lower.rs:749:21
thread '<unnamed>' panicked at 'should be implemented in ISLE: inst = `v11 = sadd_sat.i16 v5, v7`, type = `Some(types::I16)`', cranelift/codegen/src/machinst/lower.rs:749:21
thread '<unnamed>' panicked at 'should be implemented in ISLE: inst = `v10 = usub_sat.i8 v5, v7`, type = `Some(types::I8)`', cranelift/codegen/src/machinst/lower.rs:749:21
thread '<unnamed>' panicked at 'should be implemented in ISLE: inst = `v11 = usub_sat.i16 v5, v7`, type = `Some(types::I16)`', cranelift/codegen/src/machinst/lower.rs:749:21
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/codegen/src/isa/x64/lower/isle.rs:550:21
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/codegen/src/isa/x64/lower/isle.rs:550:21

Starting with the easy ones: sadd_sat.{i8,i16} and usub_sat.{i8,i16} are not implemented in ISLE. That means that we haven't yet implemented a machine code lowering for them in the x64 backend (ISLE is just the framework that we use for this!). But more importantly when reading the docs it looks like they aren't supported for scalars!

This sometimes happens, we allow an operation for vectors but not for scalars. Usually because they don't make sense for scalars, I think in this case it's just that no one has felt the need to implement it yet.

I think we should be able to remove the optimizations for both of these opcodes, and the rest of the *_sat ones as well since they look like they only support vectors.

Now onto the hard one. I ran the testsuite individually with RUST_LOG=trace cargo run -- wast --disable-cache tests/spec_testsuite/simd_splat.wast and pulled one of the offending functions out of the log. Essentially I just looked for any function with sshr since I didn't expect there to be duplicates in these files.

I got this function out (I've slightly cleaned it up):

function u0:22(i64 vmctx, i64, i32, i32) -> i8x16 fast {
block0(v0: i64, v1: i64, v2: i32, v3: i32):
    v5 = ireduce.i8 v2
    v6 = splat.i8x16 v5
    v7 = sshr v6, v3
    return v7
}

It's really neat that it matches our optimization pretty much exactly! After optimizations it looks like this:

block0(v0: i64, v1: i64, v2: i32, v3: i32):
    v5 = ireduce.i8 v2
    v8 = sshr v5, v3
    v9 = splat.i8x16 v8
    return v9

So, now that we have a small reproducible example of what is going wrong, lets debug it. My first step was compiling both the pre-optimization function and the post optimization function. And something immediately went wrong!

The optimized version compiles cleanly! Which is really weird, since I got that out of the optimizer from the first version. I haven't seen this before but I do have an idea of what might be going wrong.

We don't print the type of the sshr in the CLIF code. What I think might be happening is that we are saying to CLIF: Build me an sshr that takes an i8 and i32 and produces an i8x16 and things blow up later in the pipeline.

Changing the optimization source to explicitly request the lane_type fixes it! We are now requesting an i8 out of the sshr

(rule (simplify (sshr ty (splat ty x) y))
      (splat ty (sshr (lane_type ty) x y)))

This is a really weird issue, and it probably should have been caught earlier in the pipeline instead of letting it propagate all the way to the backend which really just confuses things.

This relied a lot on intuition and also knowledge of how cranelift works, so don't worry if it's hard to figure it out! We really should have had better error messages to help people out on these issues.

TLDR:

We should remove the {u,s}{add,sub}_sat optimizations since these opcodes only exist for vectors but not for scalars.
We should pass in the lane_type into the sshr operation, otherwise the compiler gets really confused. (this also applies to ushr/ishl/rotr/rotl)

because they don't support scalars.

Some of the wasmtime spec tests were failing without this

gurry · 2023-08-17T12:01:22Z

Thanks @afonso360 for the detailed explanation and showing how to debug such failures in the future 👍.

I've made the suggested changes now. Let's hope the checks pass.

gurry · 2023-08-17T12:31:33Z

Checks passed. Please review.

cranelift/codegen/src/opts/vector.isle

afonso360

This looks great! Thanks for working through this.

I've had a look at the rest of our opcodes and I don't think we are missing any of them (at least for integers)!

I think it's possible to do a transform from splat+icmp into icmp+bmask+splat, but leaving that for a future PR is also OK! I'm also not sure if this is a good transform to do. Since I think it would be the only one that actually adds instructions rather than reducing them.

I've also run our fuzzer on this PR for a couple of hours and nothing has come up.

jameysharp · 2023-08-18T00:40:42Z

I assume you'll merge this, @afonso360? This PR looks great to me, thank you @gurry!

I feel like all these bugs should have been caught by the CLIF verifier, which I guess means that the verifier did not run after the egraph pass. Would that have helped more quickly pin down the causes of these bugs? Actually I'm confused about when the verifier runs and whether we already have a way to turn it on. Maybe we should enable it for all filetests?

jameysharp · 2023-08-18T00:42:32Z

And the very next thing in my queue of GitHub notifications was #6855 which says exactly that, thank you @afonso360 😆

because it doesn't support scalars.

* Add splat simplify opt for iadd, isub, imul, ineg & iabs * Add splat simplify opt for smulhi & hmulhi * Add splat simplify opt for band, bor & bxor * Add splat simplify opt for a few more ops namely, popcnt, smin, umin, smax, umax, sadd_sat, uadd_sat, ssub_sat & usub_sat * Add splat simplify opt for bnot * Add splat simplify opt for shift and rotate ops and avg_round as well * Remove splat simplify opt for certain ops because they don't support scalars. * Add lane_type to splat opts for shift/rotate Some of the wasmtime spec tests were failing without this * Remove splat simplify opt for avg_round because it doesn't support scalars.

gurry added 3 commits August 16, 2023 09:04

Add splat simplify opt for iadd, isub, imul, ineg & iabs

9852cdf

Add splat simplify opt for smulhi & hmulhi

8fc064b

Add splat simplify opt for band, bor & bxor

d73efe4

github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Aug 16, 2023

gurry added 3 commits August 16, 2023 17:00

Add splat simplify opt for a few more ops

0baf44b

namely, popcnt, smin, umin, smax, umax, sadd_sat, uadd_sat, ssub_sat & usub_sat

Add splat simplify opt for bnot

4980256

Add splat simplify opt for shift and rotate ops

2445a26

and avg_round as well

gurry marked this pull request as ready for review August 17, 2023 02:51

gurry requested a review from a team as a code owner August 17, 2023 02:51

gurry requested review from elliottt and removed request for a team August 17, 2023 02:51

gurry marked this pull request as draft August 17, 2023 02:54

afonso360 mentioned this pull request Aug 17, 2023

cranelift: Run verifier after optimizations during testing #6855

Closed

gurry added 2 commits August 17, 2023 17:10

Remove splat simplify opt for certain ops

b128440

because they don't support scalars.

Add lane_type to splat opts for shift/rotate

020ec66

Some of the wasmtime spec tests were failing without this

gurry marked this pull request as ready for review August 17, 2023 12:30

afonso360 reviewed Aug 17, 2023

View reviewed changes

cranelift/codegen/src/opts/vector.isle Outdated Show resolved Hide resolved

afonso360 approved these changes Aug 17, 2023

View reviewed changes

Remove splat simplify opt for avg_round

8783b0d

because it doesn't support scalars.

afonso360 added this pull request to the merge queue Aug 18, 2023

Merged via the queue into bytecodealliance:main with commit 6650378 Aug 18, 2023

gurry deleted the issue-6828 branch August 18, 2023 09:10

afonso360 mentioned this pull request Aug 26, 2023

egraphs: Disable bitwise splat transform for floats #6916

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add splat simplify opt for various ops #6851

Add splat simplify opt for various ops #6851

gurry commented Aug 16, 2023

github-actions bot commented Aug 16, 2023

gurry commented Aug 17, 2023

afonso360 commented Aug 17, 2023

gurry commented Aug 17, 2023

gurry commented Aug 17, 2023

afonso360 left a comment •

edited

Loading

jameysharp commented Aug 18, 2023

jameysharp commented Aug 18, 2023

Add splat simplify opt for various ops #6851

Add splat simplify opt for various ops #6851

Conversation

gurry commented Aug 16, 2023

github-actions bot commented Aug 16, 2023

Subscribe to Label Action

gurry commented Aug 17, 2023

afonso360 commented Aug 17, 2023

gurry commented Aug 17, 2023

gurry commented Aug 17, 2023

afonso360 left a comment • edited Loading

Choose a reason for hiding this comment

jameysharp commented Aug 18, 2023

jameysharp commented Aug 18, 2023

afonso360 left a comment •

edited

Loading