Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64: Lower widening and narrowing operations in ISLE #4722

Merged
merged 6 commits into from
Aug 18, 2022

Conversation

elliottt
Copy link
Member

Lower uwiden_high, uwiden_low, swiden_high, swiden_low, snarrow, and unarrow in ISLE.

@elliottt elliottt marked this pull request as ready for review August 16, 2022 20:24
@elliottt elliottt added the isle Related to the ISLE domain-specific language label Aug 16, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Aug 16, 2022
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This generally looks good, except I think we need to resolve the pending (pre-existing) confusion below before we merge.


;; TODO: The type we are expecting as input as actually an F64X2 but the
;; instruction is only defined for integers so here we use I64X2. This is a
;; separate issue that needs to be fixed in instruction.rs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is pre-existing but it seems to me we should clarify this before we merge; what exactly does this mean? Is there something in the Wasm translator that generates code with an explicitly incorrect type?

Likewise ignoring the second argument below seems outright wrong: if the semantics of snarrow are that it combines lanes of the two vectors, the lanes of b should show up somewhere.

I suspect this is a weird shoehorning of different semantics into a particular combo of ops (ie, the frontend generates exactly this shape, and the backend agrees that it means something different than what the actual instruction semantics, composed, would mean. We should fix this if so as it's a correctness bug waiting to happen!

A way to find out if this is the case is: what happens if we remove this specialization? We have generic lowerings for snarrow and fcvt_to_sint_sat; do those two lowerings, composed together, give a correct result for whatever code is meant to hit this rule?

cc @abrown @jlb6740 for more thoughts on this (you may remember more history here?)

Copy link
Member Author

@elliottt elliottt Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment from lower.rs might shed some light: it seems like this combination of snarrow and fcvt_to_sint_sat is implementing the behavior of i32x4.trunc_sat_f64x2_s_zero:

//y = i32x4.trunc_sat_f64x2_s_zero(x) is lowered to:
//MOVE xmm_tmp, xmm_x
//CMPEQPD xmm_tmp, xmm_x
//MOVE xmm_y, xmm_x
//ANDPS xmm_tmp, [wasm_f64x2_splat(2147483647.0)]
//MINPD xmm_y, xmm_tmp
//CVTTPD2DQ xmm_y, xmm_y

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where the instructions that this special case handles are introduced:
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L1789-L1795

The second argument to snarrow is always all zeros, so perhaps we could check that as well to better catch this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still feel like I don't fully understand what's going on here; unfortunately that comment doesn't clarify much for me. Why is there a type mismatch if we're composing the float-to-int conversion (which produces an I64x2) with an int-to-int narrow? In other words what change is the TODO proposing in the instruction definitions?

I'm also curious whether removing this special case results in correct execution still; if not then that's a sign that we're trying to fit extra meaning into the combo here that shouldn't exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment that I migrated over on line 3282 is suggesting that the intent of this lowering is to lower the trunc_sat instance I mentioned above, thus the type mismatch as trunc_sat takes a floating point number and produces an integer. The comment from lower.rs backs this up by suggesting that this is the translation of i32x4.trunc_sat_f64x2_s_zero, and the translation of that operation produces the sequence that this rule matches.

I think that if we implement the signed and unsigned narrowings for I64X2 we could probably remove this special case and have the general case work, but just removing it now will cause test failures. It might be worth keeping the special case if it turns out that it avoids some unnecessary work, and i32x4.trunc_sat_f64x2_{s,u}_zero is a common enough operation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've clarified the purpose of the special case, and restricted it further to cases where the rhs of the snarrow is a zero vector which should match the code generated by the front end. I've also filed #4734 about the missing i64x2 cases for snarrow and unarrow.

@elliottt elliottt requested a review from cfallin August 18, 2022 17:59
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants