WASM float to int performance regression since 1.53.0 #87643

Open
CryZe opened this issue Jul 30, 2021 · 5 comments
Labels
C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/

Comments

@CryZe
Contributor

CryZe commented Jul 30, 2021

Both Rust and WASM introduced saturating float-to-int casts. However, WASM originally only had trapping float-to-int casts. LLVM's internal float casts are speculatable, i.e. LLVM can execute them early, assuming no trap ever happens. This means LLVM needs to protect itself from WASM's trapping casts by emitting some more code around them. Once Rust introduced saturating float-to-int casts, rustc itself started emitting a bunch of code around the casts to saturate the values. This led to both rustc and LLVM emitting guard code around each cast. However, since rustc already protected itself against the dangerous values, LLVM didn't need to emit any of these additional instructions. Skipping the redundant guard was eventually implemented. See this previous issue and its related PRs: #73591
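
For readers who have not seen the semantics in question, here is a minimal sketch of a saturating cast in Rust. The function name `cast` matches the symbol in the assembly quoted later in this thread; that the linked test case looks exactly like this is an assumption.

```rust
/// Float-to-int conversion with `as`. In current Rust this cast saturates:
/// NaN becomes 0 and out-of-range values clamp to the integer bounds.
/// (Assumed shape of the test case; the real one is behind the godbolt link.)
pub fn cast(x: f32) -> i32 {
    x as i32
}
```

With this definition, `cast(f32::NAN)` is `0`, `cast(1.0e10)` is `i32::MAX`, and `cast(-1.0e10)` is `i32::MIN`. The cost discussed in this issue is the code needed to get that behavior on a target whose native conversion traps instead.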

However with the switch to LLVM 12, it was possible to throw out a lot of the manual codegen in rustc:
#84339 which acknowledges a regression in those casts

and then a follow up PR:
#84654 which supposedly fixes the regression

However it seems like there's still a regression: https://rust.godbolt.org/z/W18vGcv9T

cc @alexcrichton

@CryZe CryZe added the C-bug Category: This is a bug. label Jul 30, 2021
@CryZe
Contributor Author

CryZe commented Jul 30, 2021

Though looking at the LLVM IR, it seems like the difference is that 1.52.0 uses our own rustc saturation implementation, and since 1.53 it's LLVM 12's saturation implementation, which seems to be worse?

@alexcrichton
Member

alexcrichton commented Jul 30, 2021

Thanks for the heads up. This is behaving as "expected", although the expectation wasn't really thoroughly evaluated by me.

As you've discovered, the main difference (which you can see in that diff view with --emit llvm-ir) is that in Rust 1.52 and prior (before those PRs) rustc manually emitted LLVM IR that did the right thing. In Rust 1.53.0 and later we're calling the LLVM intrinsic for saturating float-to-int conversion.

In that sense codegen is behaving as expected, and I believe at the time I diff'd the two (rustc's old codegen and LLVM's built-in intrinsic codegen) and saw they were different but assumed that the difference was negligible. Have you measured the LLVM-intrinsic-generated code to have worse performance? (I see it has a few extra instructions, but I'd be curious to put concrete numbers on it if possible.)

If LLVM has worse performance I think it'd be good to open an issue upstream with them and see if we can improve it upstream, but if it's critical and it's too difficult to land upstream then we can perhaps re-land the wasm-specific bits for rustc.

(It'd also be best if Safari implemented the nontrapping-fptoint extensions so we could consider turning that on by default...)

@CryZe
Contributor Author

CryZe commented Jul 30, 2021

I didn't do any benchmarks, but it seems like it suffers from the same problem as our original codegen. This is roughly the new WASM translated into Rusty pseudocode:

// General ifs to do the saturation
if x.is_nan() {
    0
} else if x >= 0x1.fffffep30 {
    2147483647
} else if x >= -0x1p31 {
    // Protection against trapping
    if x.abs() < 0x1p31 {
        (int)x
    } else {
        -2147483648
    }
} else {
    -2147483648
}

This protection against trapping shouldn't be there: the saturation code already checked for all the edge cases (though technically a lot of these are selects, which they can't be if you remove the protection code, which may have some performance implications?). This is probably because the lowering of the saturating casts is backend-independent, and yet again the WASM backend doesn't know anything about it, so it still protects itself from trapping. So yeah, this definitely can be improved. I'll look into raising an upstream issue, I guess.
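
To make the redundancy concrete, here is a runnable sketch of a saturating cast that relies only on the saturation checks. It is an illustration, not the code that rustc or LLVM emits. `trapping_trunc` is a hypothetical stand-in for WASM's trapping `i32.trunc_f32_s` that panics where the real instruction would trap. The hex float bounds are written as decimal literals, and the upper bound is checked as `x >= 2147483648.0` (2^31), an assumption chosen so that the largest `f32` below 2^31 is still converted exactly.

```rust
/// Hypothetical stand-in for WASM's trapping `i32.trunc_f32_s`.
/// Panics where the real instruction would trap (NaN or out of range).
fn trapping_trunc(x: f32) -> i32 {
    // NaN fails both comparisons, so it also ends up in the panic path.
    assert!(
        x >= -2147483648.0 && x < 2147483648.0,
        "i32.trunc_f32_s would trap on {}",
        x
    );
    x as i32
}

/// Saturating `f32 -> i32` cast using only the saturation checks.
fn saturating_cast(x: f32) -> i32 {
    if x.is_nan() {
        0
    } else if x >= 2147483648.0 {
        // 2^31 and above (including +infinity) clamp to the maximum.
        i32::MAX
    } else if x >= -2147483648.0 {
        // Every value reaching this branch is in range, so the
        // conversion cannot trap and needs no additional guard.
        trapping_trunc(x)
    } else {
        // Below -2^31 (including -infinity) clamps to the minimum.
        i32::MIN
    }
}
```

Since `trapping_trunc` is only reached for values in the range [-2^31, 2^31), its panic path is never taken from `saturating_cast`, which is the sense in which a second guard around the conversion is redundant.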

It's not critical at all, it's just something that we stumbled upon in some Twitter discussion.

@adrian17

adrian17 commented Jul 31, 2021

> Both Rust and WASM introduced saturating float to int casts

Don't you need to opt in on the wasm side with -C target-feature=+nontrapping-fptoint? After adding this to your godbolt link, on nightly (but not 1.52), it'll compile down to:

example::cast:
        local.get       0
        i32.trunc_sat_f32_s
        end_function

(Of course, that doesn't change the fact that without this feature there's a regression.)

@CryZe
Contributor Author

CryZe commented Jul 31, 2021

Yeah, this issue is only concerned with the case where the WASM feature is not active. It should still produce reasonably good code then. Atm it's not as good as before 1.53.

@sanxiyn sanxiyn added the O-wasm Target: WASM (WebAssembly), http://webassembly.org/ label Dec 17, 2021
4 participants