[wasm] Enable v128 and PackedSimd in the interpreter #85920
Tagging subscribers to 'arch-wasm': @lewing
cc @stephentoub have you seen anything like this before? Perhaps when some of the BCL vectorization work was previously in progress? I'm totally stumped. If you look at the test failures you can see that the test suites pass, but the result data is then too short and corrupted.
My latest theory is that the problem is https://github.com/dotnet/xharness/blob/aa434d0c7e6eb46df1ec11b3c63add37d835c4d0/src/Microsoft.DotNet.XHarness.TestRunners.Xunit/ThreadlessXunitTestRunner.cs#LL125C34-L125C48, which would explain why the test suites themselves are passing. It looks like the base64 implementation is vectorized when hardware acceleration is reported, which explains why enabling SIMD breaks it, and it also sort of explains why only a few characters at the start/end are missing.
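If the base64 step is where the bytes go missing, a length check on the captured payload would make the truncation visible. The sketch below is a hypothetical helper, not xharness code: the function names are made up for illustration, and it relies only on the fact that padded base64 emits 4 characters for every group of up to 3 input bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of padded base64 output for n_raw input bytes:
 * every group of up to 3 bytes becomes exactly 4 characters. */
static size_t base64_encoded_len(size_t n_raw)
{
    assert(n_raw <= SIZE_MAX - 2); /* guard the rounding addition below */
    return 4 * ((n_raw + 2) / 3);
}

/* Hypothetical check: how many characters a captured payload is short by.
 * Returns 0 when the capture is at least as long as expected. */
static size_t base64_missing_chars(size_t n_raw, size_t n_captured)
{
    size_t expected = base64_encoded_len(n_raw);
    if (n_captured >= expected)
        return 0;
    return expected - n_captured;
}
```

For a raw buffer of 100085 bytes the expected encoded length is 133448 characters, so a capture that lost 1-8 characters would fail this check before anyone tried to parse the XML.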
The issue appears to be specifically caused by enabling the jiterpreter's implementation of Vector128.CreateScalar. Not sure what's wrong with it or why tests don't fail, but disabling those opcodes specifically seems to make the xharness problem go away.
Since I've determined that the PackedSimd issue is blocked on interp changes, I'm going to move all the bug fixes from that PR into this one so we can merge those bug fixes with this.
Due to how trimming works, it's not possible to have runtime options for interpreter SIMD support or to make PackedSimd optional: the trimmer overrides Vector128 and PackedSimd support to 'true' (which is correct for AOT code!) and the interp has no ability to override that. Right now on main things work because V128 IsHardwareAccelerated is false, but once it flips to true, PackedSimd methods will just get called no matter what. So this will have to be a V128+PackedSimd all-in-one PR, and that means it's blocked on some interp bug fixes.
If we want to remove this limitation, I think it is possible to do so.
Radek pointed out that we can disable the substitutions for interp-only, so I'm going to do that. But it seems unfixable for AOT, since AOT code can transition into the interp at any time.
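The resulting rule for who answers the SIMD support properties can be modeled in a few lines. This is only a sketch of the reasoning in the comments above, with hypothetical names throughout; it is not how the trimmer or the interpreter is actually structured.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: who decides the value of the SIMD support properties. */
typedef enum {
    DECIDED_BY_LINKER_SUBSTITUTION, /* trimmer hard-codes 'true' */
    DECIDED_BY_INTERPRETER          /* interp answers at run time */
} simd_property_source;

/* With AOT active the substitutions stay on, because AOT code may
 * transition into the interp at any time and both must agree. */
static simd_property_source simd_property_decider(bool aot_active)
{
    return aot_active ? DECIDED_BY_LINKER_SUBSTITUTION
                      : DECIDED_BY_INTERPRETER;
}

/* Effective value seen by managed code under this model. */
static bool simd_reported_as_supported(bool aot_active, bool interp_simd_enabled)
{
    if (simd_property_decider(aot_active) == DECIDED_BY_LINKER_SUBSTITUTION)
        return true; /* substituted constant; the interp cannot override it */
    return interp_simd_enabled;
}
```

The case to notice is `aot_active == true` with `interp_simd_enabled == false`: the property still reads as supported, which is why the interpreter has to implement SIMD before the substitutions can be left on for mixed mode.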
Fix (?) the create scalar intrinsics in the jiterpreter
Fix trimming removing SIMD even if SIMD was enabled
Add wasm opcode mapping for extract msb
Fix some SN_ lists in transform-simd being ordered incorrectly
Due to mixed mode it is not possible to control interp SIMD with a runtime option, so remove the options
Disable linker substitutions for SIMD unless AOT is active, so the interp is responsible for the properties
Generate a MINT_NIY for unimplemented PackedSimd methods
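On the SN_ list ordering fix: the sketch below shows why a mis-ordered name table is a silent bug, assuming the lookup is a binary search keyed on method name (the usual reason such a table has to stay sorted). The tables and function names are illustrative stand-ins, not the contents of transform-simd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative tables only; the second has "Shuffle" out of order. */
static const char *const sorted_names[] = {
    "Create", "CreateScalar", "ExtractMostSignificantBits", "Shuffle"
};
static const char *const misordered_names[] = {
    "Create", "Shuffle", "CreateScalar", "ExtractMostSignificantBits"
};

/* Binary search by name; returns the index, or -1 if not found. */
static int lookup_name(const char *const *names, size_t count, const char *key)
{
    size_t lo = 0, hi = count;
    assert(names != NULL && key != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(key, names[mid]);
        if (cmp == 0)
            return (int)mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}

/* Cheap start-up or debug-build check that a table is strictly ascending. */
static int names_are_sorted(const char *const *names, size_t count)
{
    for (size_t i = 1; i < count; i++)
        if (strcmp(names[i - 1], names[i]) >= 0)
            return 0;
    return 1;
}
```

With the mis-ordered table, looking up "Shuffle" returns -1 even though the entry is present, so the method would quietly be treated as not an intrinsic rather than failing loudly. A check like `names_are_sorted` catches that class of mistake when the table is edited.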
Now that the debug vector alignment fix has landed on main, I expect this to pass tests on CI without that 'testResults.xml's prologue goes missing' problem.
Failures are unrelated. Merging.
EDIT: This PR enables vector128 SIMD and PackedSimd in the interpreter and jiterpreter for wasm, and fixes a few bugs/missing opcodes. It also changes how we do trimming around SIMD.
EDIT: Original description below
Right now both v128 and packedsimd are disabled for WASM because of an issue with test results xml getting cut off at the start/end. Testing turning them back on now that some other fixes have landed.
It only seems to affect testResults.xml: the first 1-8 bytes or last 1-8 bytes of the XML go missing at some point between the test runner writing it to stdout and the harness capturing it. I've been struggling to come up with a plausible theory for how turning vectorization on could cause missing bytes. I wouldn't expect 'copy 100085 bytes from this buffer to stdout' to depend on something like PackedSimd or Vector128, since it's largely going to be memmoves of fixed sizes.