-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x64: Lower widening and narrowing operations in ISLE #4722
Conversation
Subscribe to Label Action
This issue or pull request has been labeled: "isle"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This generally looks good, except I think we need to resolve the pending (pre-existing) confusion below before we merge.
|
||
;; TODO: The type we are expecting as input as actually an F64X2 but the | ||
;; instruction is only defined for integers so here we use I64X2. This is a | ||
;; separate issue that needs to be fixed in instruction.rs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is pre-existing but it seems to me we should clarify this before we merge; what exactly does this mean? Is there something in the Wasm translator that generates code with an explicitly incorrect type?
Likewise ignoring the second argument below seems outright wrong: if the semantics of snarrow
are that it combines lanes of the two vectors, the lanes of b
should show up somewhere.
I suspect this is a weird shoehorning of different semantics into a particular combo of ops (ie, the frontend generates exactly this shape, and the backend agrees that it means something different than what the actual instruction semantics, composed, would mean. We should fix this if so as it's a correctness bug waiting to happen!
A way to find out if this is the case is: what happens if we remove this specialization? We have generic lowerings for snarrow
and fcvt_to_sint_sat
; do those two lowerings, composed together, give a correct result for whatever code is meant to hit this rule?
cc @abrown @jlb6740 for more thoughts on this (you may remember more history here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment from lower.rs
might shed some light: it seems like this combination of snarrow
and fcvt_to_sint_sat
is implementing the behavior of i32x4.trunc_sat_f64x2_s_zero
:
wasmtime/cranelift/codegen/src/isa/x64/lower.rs
Lines 701 to 707 in 0a71df6
//y = i32x4.trunc_sat_f64x2_s_zero(x) is lowered to: | |
//MOVE xmm_tmp, xmm_x | |
//CMPEQPD xmm_tmp, xmm_x | |
//MOVE xmm_y, xmm_x | |
//ANDPS xmm_tmp, [wasm_f64x2_splat(2147483647.0)] | |
//MINPD xmm_y, xmm_tmp | |
//CVTTPD2DQ xmm_y, xmm_y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where the instructions that this special case handles are introduced:
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L1789-L1795
The second argument to snarrow
is always all zeros, so perhaps we could check that as well to better catch this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still feel like I don't fully understand what's going on here; unfortunately that comment doesn't clarify much for me. Why is there a type mismatch if we're composing the float-to-int conversion (which produces an I64x2) with an int-to-int narrow? In other words what change is the TODO proposing in the instruction definitions?
I'm also curious whether removing this special case results in correct execution still; if not then that's a sign that we're trying to fit extra meaning into the combo here that shouldn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment that I migrated over on line 3282 is suggesting that the intent of this lowering is to lower the trunc_sat instance I mentioned above, thus the type mismatch as trunc_sat takes a floating point number and produces an integer. The comment from lower.rs backs this up by suggesting that this is the translation of i32x4.trunc_sat_f64x2_s_zero
, and the translation of that operation produces the sequence that this rule matches.
I think that if we implement the signed and unsigned narrowings for I64X2
we could probably remove this special case and have the general case work, but just removing it now will cause test failures. It might be worth keeping the special case if it turns out that it avoids some unnecessary work, and i32x4.trunc_sat_f64x2_{s,u}_zero
is a common enough operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've clarified the purpose of the special case, and restricted it further to cases where the rhs of the snarrow
is a zero vector which should match the code generated by the front end. I've also filed #4734 about the missing i64x2
cases for snarrow
and unarrow
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now, thanks!
Lower
uwiden_high
,uwiden_low
,swiden_high
,swiden_low
,snarrow
, andunarrow
in ISLE.