[wasm] Optimize constant i2/i4 shuffles in jiterpreter #86470

kg · 2023-05-18T23:13:42Z

(draft because blocked by #86469)
For known-constant shuffle vectors, the jiterpreter can transform the i2/i4 indices into a byte shuffle vector at JIT time and encode it directly into the trace. In my testing this speeds up span Reverse on chars a bit. Not sure if it's a good idea to do this, so feedback is appreciated.
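To illustrate the transformation described above, here is a minimal sketch of expanding i2/i4 lane indices into a 16-byte shuffle vector. The function name `expandLaneIndicesToBytes` is hypothetical and not taken from the PR, and the wrapping of out-of-range indices is an assumption made for the example rather than a statement about the real opcode semantics.

```typescript
// Hypothetical helper (not from the PR): expand lane indices for 2-byte (i2)
// or 4-byte (i4) elements into the 16 byte indices a byte-wise shuffle expects.
function expandLaneIndicesToBytes(laneIndices: number[], elementSize: 2 | 4): Uint8Array {
    const laneCount = 16 / elementSize;
    if (laneIndices.length !== laneCount)
        throw new Error(`expected ${laneCount} lane indices, got ${laneIndices.length}`);

    const result = new Uint8Array(16);
    for (let lane = 0; lane < laneCount; lane++) {
        // Assumption: out-of-range indices wrap to the lane count.
        const source = laneIndices[lane] & (laneCount - 1);
        // Each selected lane contributes elementSize consecutive bytes.
        for (let b = 0; b < elementSize; b++)
            result[lane * elementSize + b] = source * elementSize + b;
    }
    return result;
}

// Reversing eight 16-bit chars: destination lane i reads from source lane 7 - i.
const reverseCharsBytes = expandLaneIndicesToBytes([7, 6, 5, 4, 3, 2, 1, 0], 2);
// reverseCharsBytes is [14, 15, 12, 13, 10, 11, 8, 9, 6, 7, 4, 5, 2, 3, 0, 1]
```

Because the expansion depends only on the constant indices, it can run once at compile time instead of on every execution of the trace.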

ghost · 2023-05-18T23:13:47Z

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	kg
Assignees:	-
Labels:	`arch-wasm`, `area-Codegen-Jiterpreter-mono`
Milestone:	-

kg · 2023-05-19T14:57:17Z

@BrzVlad @kotlarmilos How do you feel about optimizing the constant shuffles this way? It seems like it might be best to do this in interp, but it's not clear to me how much work it would be to do the analysis there. The jiterpreter would still need to do the lowering, since on the C side and on non-wasm platforms we still want to generate the opcodes for I2/I4 shuffle. So the jiterp would need some way to know that the indices are constant and know what the indices are.

The relevant part of reverse chars looks like this:

dotnet.runtime.js:3 MONO_WASM: 326a0fc ldobj.vt 96 -> 128
dotnet.runtime.js:3 MONO_WASM: 326a104 simd_v128_ldc  -> 160
dotnet.runtime.js:3 MONO_WASM: 326a118 V128_I2_SHUFFLE 112, 160 -> 112
dotnet.runtime.js:3 MONO_WASM: 326a122 simd_v128_ldc  -> 160
dotnet.runtime.js:3 MONO_WASM: 326a136 V128_I2_SHUFFLE 128, 160 -> 128
dotnet.runtime.js:3 MONO_WASM: 326a140 stobj.vt.noref 88, 128
dotnet.runtime.js:3 MONO_WASM: 326a148 stobj.vt.noref 96, 112

Maybe we could define new SHUFFLE_CONSTANT opcodes that the jiterp consumes, and generate them by doing a peephole optimization?
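A rough sketch of what such a peephole pass could look like, written against a simplified, hypothetical instruction shape (`Ins`) rather than the real interpreter IR. The `_CONSTANT` opcode names and the constant payload are illustrative assumptions only; the local offsets mirror the trace above.

```typescript
// Hypothetical, simplified instruction shape; the real interpreter IR differs.
interface Ins {
    opcode: string;
    sources: number[];      // local offsets read
    dest: number;           // local offset written
    constant?: Uint8Array;  // 16-byte payload, when known at compile time
}

const SHUFFLE_OPCODES = new Set(["V128_I2_SHUFFLE", "V128_I4_SHUFFLE"]);

// Rewrite a shuffle whose indices operand was written by the immediately
// preceding simd_v128_ldc into a (hypothetical) *_CONSTANT opcode that
// carries the indices inline. The ldc itself is kept, so correctness does
// not depend on knowing whether its destination local is read again later.
function foldConstantShuffles(code: Ins[]): Ins[] {
    return code.map((ins, i) => {
        const prev = i > 0 ? code[i - 1] : undefined;
        if (
            prev !== undefined &&
            prev.opcode === "simd_v128_ldc" &&
            prev.constant !== undefined &&
            SHUFFLE_OPCODES.has(ins.opcode) &&
            ins.sources[1] === prev.dest
        ) {
            return {
                opcode: ins.opcode + "_CONSTANT",
                sources: [ins.sources[0]],
                dest: ins.dest,
                constant: prev.constant,
            };
        }
        return ins;
    });
}

// Illustrative payload only; the actual constant is not shown in the trace.
const illustrativeIndices = new Uint8Array(new Uint16Array([7, 6, 5, 4, 3, 2, 1, 0]).buffer);
const folded = foldConstantShuffles([
    { opcode: "simd_v128_ldc", sources: [], dest: 160, constant: illustrativeIndices },
    { opcode: "V128_I2_SHUFFLE", sources: [112, 160], dest: 112 },
    { opcode: "simd_v128_ldc", sources: [], dest: 160, constant: illustrativeIndices },
    { opcode: "V128_I2_SHUFFLE", sources: [128, 160], dest: 128 },
]);
```

Only the adjacent-pair case is matched here, which keeps the analysis local and avoids any general constant tracking.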

kg · 2023-05-19T16:40:07Z

Rebasing onto #86506 since there would be a merge conflict. Timings with both applied:

measurement	main	#86506	both PRs
Span, Reverse bytes	0.0213ms	0.0139ms	0.0126ms
Span, Reverse chars	0.0418ms	0.0256ms	0.0228ms
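For reference, the relative speedups implied by the table can be computed directly from its figures; the object and function names below are just for illustration.

```typescript
// Timings in milliseconds, copied from the table above.
const timingsMs = {
    "Span, Reverse bytes": { main: 0.0213, pr86506: 0.0139, both: 0.0126 },
    "Span, Reverse chars": { main: 0.0418, pr86506: 0.0256, both: 0.0228 },
};

// Speedup factor: how many times faster `after` is than `before`.
function speedup(beforeMs: number, afterMs: number): number {
    return beforeMs / afterMs;
}

for (const [name, t] of Object.entries(timingsMs)) {
    console.log(
        `${name}: ${speedup(t.main, t.both).toFixed(2)}x vs main, ` +
        `${speedup(t.pr86506, t.both).toFixed(2)}x vs #86506 alone`
    );
}
```

With both PRs applied this works out to roughly 1.69x (bytes) and 1.83x (chars) over main, and roughly 1.10x and 1.12x over #86506 alone.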

EDIT: I'll note that from reading v8's source code, they have optimizations that kick in when they can detect a constant indices vector, so it makes sense that we see a speedup here.

BrzVlad · 2023-05-19T17:11:30Z

I strongly suggest any kind of constant tracking to be done within the interpreter.

kg · 2023-05-19T17:23:25Z

> I strongly suggest any kind of constant tracking to be done within the interpreter.

OK. I'll leave this marked NO-MERGE until we figure out how we want to handle it, and we can remove the other constant tracking when we do that.

kg · 2023-06-01T03:03:42Z

My recollection of the conversations I've had about this is as follows:

- We want to heavily limit how much the jiterpreter does this category of optimization, but narrowly scoped cases like this (the immediate is from a preceding opcode) are OK.
- In the future we want to introduce a variant of the SIMD opcodes that accepts an immediate. We need it for ExtractLane/ReplaceLane anyway, so it would be natural to apply it to this case as well.
- Once we have 'immediate' SIMD opcodes, we can introduce immediate versions of shuffle, and then the jiterpreter can be updated to consume those opcodes instead of detecting the constant shuffle itself.

I'll probably rebase this soon and land it, since I'm not aware of any reason to block it at present.
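As background on why ExtractLane/ReplaceLane call for an immediate: in WebAssembly SIMD the lane index of `extract_lane`/`replace_lane` is encoded in the instruction itself rather than taken from the stack. A minimal sketch of that encoding follows; the helper name `encodeExtractLane` is hypothetical, and only two of the lane opcodes are listed.

```typescript
// WebAssembly SIMD instructions use the 0xfd prefix followed by the opcode.
const SIMD_PREFIX = 0xfd;
const I16X8_EXTRACT_LANE_U = 0x19;
const I32X4_EXTRACT_LANE = 0x1b;

// Hypothetical helper: the lane index is a single immediate byte, so it must
// be known when the instruction is emitted.
function encodeExtractLane(opcode: number, laneCount: number, lane: number): Uint8Array {
    if (!Number.isInteger(lane) || lane < 0 || lane >= laneCount)
        throw new RangeError(`lane ${lane} is out of range for ${laneCount} lanes`);
    return Uint8Array.of(SIMD_PREFIX, opcode, lane);
}

// i32x4.extract_lane 2
const extractLane2 = encodeExtractLane(I32X4_EXTRACT_LANE, 4, 2);
```

An interpreter opcode that carries the lane as an immediate maps onto this encoding directly, without the code generator having to prove that a local holds a constant.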

Introduce builder v128_const method
v8 doesn't optimize splats so use the enormous encoding for v128 zero
Fix fast memset for nonzero values
Detect constant shuffle vectors for i2/i4 shuffles and expand them to byte shuffle vectors at JIT time
Also optimize i1 shuffles
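To make the size trade-off concrete, here is a sketch of the raw WebAssembly encodings involved. The helper names are hypothetical and this is not the jiterpreter's builder API. It assumes that the "splat" form is `i32.const 0` followed by `i8x16.splat`, and that a constant shuffle is emitted as `v128.const` followed by `i8x16.swizzle`; neither detail is confirmed by the thread.

```typescript
const SIMD_PREFIX_BYTE = 0xfd;
const V128_CONST = 0x0c;
const I8X16_SWIZZLE = 0x0e;
const I8X16_SPLAT = 0x0f;
const I32_CONST = 0x41;

// v128.const: prefix, opcode, then the 16 literal bytes (18 bytes total).
function encodeV128Const(bytes: Uint8Array): Uint8Array {
    if (bytes.length !== 16)
        throw new Error(`v128.const needs 16 bytes, got ${bytes.length}`);
    const out = new Uint8Array(18);
    out[0] = SIMD_PREFIX_BYTE;
    out[1] = V128_CONST;
    out.set(bytes, 2);
    return out;
}

// A zero vector as a splat (4 bytes) versus as a full constant (18 bytes).
const zeroViaSplat = Uint8Array.of(I32_CONST, 0x00, SIMD_PREFIX_BYTE, I8X16_SPLAT);
const zeroViaConst = encodeV128Const(new Uint8Array(16));

// Assumes the source vector is already on the wasm stack: push the constant
// byte indices, then swizzle.
function encodeConstantSwizzle(byteIndices: Uint8Array): Uint8Array {
    const constant = encodeV128Const(byteIndices);
    const out = new Uint8Array(constant.length + 2);
    out.set(constant, 0);
    out[constant.length] = SIMD_PREFIX_BYTE;
    out[constant.length + 1] = I8X16_SWIZZLE;
    return out;
}
```

The constant form is much larger per use, which is presumably what "enormous encoding" refers to.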

kg added the arch-wasm (WebAssembly architecture) and area-Codegen-Jiterpreter-mono labels May 18, 2023

ghost assigned kg May 18, 2023

kg force-pushed the wasm-jiterp-constant-shuffles branch from 2c855bb to 5e5f0de Compare May 19, 2023 00:31

kg marked this pull request as ready for review May 19, 2023 14:55

kg requested review from lewing and pavelsavara as code owners May 19, 2023 14:55

kg requested review from BrzVlad and kotlarmilos May 19, 2023 14:55

kg added the NO-MERGE label (The PR is not ready for merge yet; see discussion for detailed reasons) May 19, 2023

kg force-pushed the wasm-jiterp-constant-shuffles branch from 5e5f0de to b5977a6 Compare May 19, 2023 16:40

kg requested a review from vargaz as a code owner May 19, 2023 16:52

vargaz approved these changes May 19, 2023

This was referenced May 19, 2023

Failed USB connection via port 54050, error 61, in tvOS arm64 Release AllSubsets_Mono #82637

Open

Build ios-arm64 Release AllSubsets_Mono failures dotnet/arcade#13625

Closed

BrzVlad approved these changes Jun 1, 2023

kg force-pushed the wasm-jiterp-constant-shuffles branch from acd933e to 07e8544 Compare June 1, 2023 08:42

This was referenced Jun 1, 2023

Tracking issue for CI build timeouts #76454

Closed

Wasm.Build.Tests.Blazor tests could not find blazor's dotnetjs #87024

Closed

kg merged commit d675add into dotnet:main Jun 1, 2023

build-analysis bot mentioned this pull request Jun 2, 2023

Failing test System.Net.Quic.Tests.MsQuicPlatformDetectionTests.SupportedLinuxPlatforms #87038

Closed

ghost locked as resolved and limited conversation to collaborators Jul 2, 2023
