
Fix some i128 shift-related bugs in x64 backend. #2682

Merged Feb 26, 2021 (2 commits)

cfallin (Member) commented Feb 23, 2021

This fixes #2672 and #2679, and also fixes an incorrect instruction
emission (`test` with small immediate) that we had missed earlier.

The shift-related fixes have to do with (i) shifts by 0 bits, as a
special case that must be handled; and (ii) shifts by a 128-bit amount,
which we can handle by just dropping the upper half (we only use 3--7
bits of shift amount).

This adjusts the lowerings appropriately, and also adds run-tests to
ensure that the lowerings actually execute correctly (previously we only
had compile-tests with golden lowerings; I'd like to correct this for
more ops eventually, adding run-tests beyond what the Wasm spec and
frontend cover).
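
For readers unfamiliar with the two cases described above, here is a minimal Rust sketch of a 128-bit left shift built from 64-bit halves. It is an illustration only, not the Cranelift lowering: the names `shl128` and `join` and the split-halves signature are assumptions made for this example. The 7-bit mask corresponds to the i128 end of the "3--7 bits" range; narrower types would use a smaller mask.

```rust
/// Shift the 128-bit value `(lo, hi)` left by a 128-bit amount `(amt_lo, amt_hi)`.
/// Hypothetical helper for illustration; not taken from the Cranelift sources.
fn shl128(lo: u64, hi: u64, amt_lo: u64, _amt_hi: u64) -> (u64, u64) {
    // Case (ii): the upper half of the amount is dropped entirely, and only
    // the low 7 bits of the lower half matter for a 128-bit value.
    let amt = (amt_lo & 127) as u32;

    // Case (i): a zero amount must be handled separately, because the carry
    // term below would otherwise need `lo >> 64`, which is not a valid
    // 64-bit shift.
    if amt == 0 {
        return (lo, hi);
    }

    if amt < 64 {
        // Bits shifted out of the low half are carried into the high half.
        (lo << amt, (hi << amt) | (lo >> (64 - amt)))
    } else {
        // The low half moves entirely into the high half.
        (0, lo << (amt - 64))
    }
}

/// Reassemble two halves into a native `u128` for comparison.
fn join(lo: u64, hi: u64) -> u128 {
    ((hi as u128) << 64) | lo as u128
}
```

Comparing `join` of the result against a native `u128` shift for every amount from 0 to 127 is one way to turn this into a run-test style check.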

@cfallin cfallin requested a review from abrown February 23, 2021 22:27
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Feb 23, 2021
bjorn3 (Contributor) commented Feb 25, 2021

[legion/src/internals/insert.rs:54] component_index = 0
[legion/src/internals/insert.rs:54] 1u128 << dbg!(component_index) = 18446744073709551617

It didn't help unfortunately.

Add a bunch of test vectors that actually expose this (previously the
shift-by-zero test had equal lower and upper halves and hid the bug),
including the most basic of all, 1 << 0 == 1 (thanks @bjorn3 for finding
this).
cfallin (Member, Author) commented Feb 25, 2021

Ah, yes, you were right; I confused tmp2 for tmp3 (a classic mistake!) and failed to actually use any test vectors with differing lower and upper halves. Should be fixed now (and added a few more test cases). Thanks!
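
To illustrate why a vector with equal halves can hide this kind of defect, here is a small Rust sketch. The lowering in it is hypothetical and is not the code from this PR: it simply omits the zero-amount case and lets the shift count wrap modulo 64, as 64-bit x86 shifts do, so the low half leaks into the high half at a shift of 0. The names `shl128_no_zero_case`, `reference`, and `exposes_bug`, and the test values, are assumptions made for this example.

```rust
/// Hypothetical buggy lowering: no special case for a zero shift amount.
fn shl128_no_zero_case(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let amt = amt & 127;
    if amt < 64 {
        // `wrapping_shr` masks the count to 6 bits, so at amt == 0 the
        // carry term is `lo >> 0`, i.e. all of `lo`, instead of 0.
        let carry = lo.wrapping_shr(64 - amt);
        (lo << amt, (hi << amt) | carry)
    } else {
        (0, lo << (amt - 64))
    }
}

/// Expected result, computed with the native 128-bit shift.
fn reference(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let v = (((hi as u128) << 64) | lo as u128) << (amt & 127);
    (v as u64, (v >> 64) as u64)
}

/// True if this test vector makes the buggy lowering disagree with the reference.
fn exposes_bug(lo: u64, hi: u64, amt: u32) -> bool {
    shl128_no_zero_case(lo, hi, amt) != reference(lo, hi, amt)
}
```

With equal halves, `hi | lo` is just `hi`, so `exposes_bug(x, x, 0)` is false for any `x`. With `1 << 0` (`lo = 1`, `hi = 0`) the sketch returns `(1, 1)`, which reads back as 18446744073709551617, the same value as in the `dbg!` output above.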

bjorn3 (Contributor) left a comment

Thanks! This version works.

abrown (Contributor) left a comment

I think this looks fine. (As an aside, why aren't we using SSE2's PSLLDQ/PSRLDQ instructions instead of these long sequences? I haven't looked at much of the i128 code but it would seem that moving upper and lower halves to XMMs and back might still be faster for one of these cases?)

cfallin (Member, Author) commented Feb 26, 2021

> I think this looks fine. (As an aside, why aren't we using SSE2's PSLLDQ/PSRLDQ instructions instead of these long sequences? I haven't looked at much of the i128 code but it would seem that moving upper and lower halves to XMMs and back might still be faster for one of these cases?)

Ah, the simple answer is that I don't know SSE well enough to reach for such instructions -- though it looks like they should work much more efficiently than these sequences! I'll go ahead and merge with your +1 for now so that we have correct results; but we can definitely improve this later. Thanks!

Development

Successfully merging this pull request may close these issues.

Cranelift: ishl.i8 with i128 shift amount panics