x64: Lower widening and narrowing operations in ISLE #4722

elliottt · 2022-08-16T19:45:34Z

Lower uwiden_high, uwiden_low, swiden_high, swiden_low, snarrow, and unarrow in ISLE.
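
For readers less familiar with these CLIF instructions, here is a rough scalar model in Rust of what the four widening ops compute on a 128-bit vector with eight 16-bit lanes. It is only a sketch of the instruction semantics, not the ISLE lowering in this PR. The function names mirror the CLIF opcodes, but the array-based signatures are assumptions made for illustration.

```rust
/// Scalar model of `swiden_low` on an I16X8 input: sign-extend lanes 0..4.
pub fn swiden_low(x: [i16; 8]) -> [i32; 4] {
    std::array::from_fn(|i| i32::from(x[i]))
}

/// Scalar model of `swiden_high`: sign-extend lanes 4..8.
pub fn swiden_high(x: [i16; 8]) -> [i32; 4] {
    std::array::from_fn(|i| i32::from(x[i + 4]))
}

/// Scalar model of `uwiden_low`: zero-extend lanes 0..4.
pub fn uwiden_low(x: [u16; 8]) -> [u32; 4] {
    std::array::from_fn(|i| u32::from(x[i]))
}

/// Scalar model of `uwiden_high`: zero-extend lanes 4..8.
pub fn uwiden_high(x: [u16; 8]) -> [u32; 4] {
    std::array::from_fn(|i| u32::from(x[i + 4]))
}
```

The signed and unsigned variants differ only in how the upper bits are filled: the bit pattern `0xFFFF` widens to `-1` when read as a signed lane and to `65535` when read as an unsigned lane.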

github-actions · 2022-08-16T22:56:29Z

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle
fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


cfallin

Thanks! This generally looks good, except I think we need to resolve the pending (pre-existing) confusion below before we merge.

cfallin · 2022-08-17T23:51:28Z

cranelift/codegen/src/isa/x64/lower.isle

+
+;; TODO: The type we are expecting as input as actually an F64X2 but the
+;; instruction is only defined for integers so here we use I64X2. This is a
+;; separate issue that needs to be fixed in instruction.rs.


I know this is pre-existing but it seems to me we should clarify this before we merge; what exactly does this mean? Is there something in the Wasm translator that generates code with an explicitly incorrect type?

Likewise ignoring the second argument below seems outright wrong: if the semantics of snarrow are that it combines lanes of the two vectors, the lanes of b should show up somewhere.
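
To illustrate the point about the second operand, here is a scalar Rust sketch of the narrowing semantics as described above for I32X4 inputs: the lanes of the first vector land in the low half of the result and the lanes of the second vector in the high half. The array-based signatures are assumptions for illustration and are not Cranelift APIs.

```rust
/// Scalar model of `snarrow` for two I32X4 inputs producing I16X8.
/// Each lane is saturated to the signed 16-bit range.
pub fn snarrow(a: [i32; 4], b: [i32; 4]) -> [i16; 8] {
    let sat = |v: i32| v.clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16;
    [
        sat(a[0]), sat(a[1]), sat(a[2]), sat(a[3]), // low half comes from `a`
        sat(b[0]), sat(b[1]), sat(b[2]), sat(b[3]), // high half comes from `b`
    ]
}

/// Scalar model of `unarrow`: input lanes are read as signed and
/// saturated to the unsigned 16-bit range.
pub fn unarrow(a: [i32; 4], b: [i32; 4]) -> [u16; 8] {
    let sat = |v: i32| v.clamp(0, i32::from(u16::MAX)) as u16;
    [
        sat(a[0]), sat(a[1]), sat(a[2]), sat(a[3]),
        sat(b[0]), sat(b[1]), sat(b[2]), sat(b[3]),
    ]
}
```

Under this model a lowering that drops `b` can only be correct when the high half of the result is known independently, for example when `b` is a vector of zeros.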

I suspect this is a weird shoehorning of different semantics into a particular combo of ops (i.e., the frontend generates exactly this shape, and the backend agrees that it means something different from what the actual instruction semantics, composed, would mean). We should fix this if so, as it's a correctness bug waiting to happen!

A way to find out if this is the case is: what happens if we remove this specialization? We have generic lowerings for snarrow and fcvt_to_sint_sat; do those two lowerings, composed together, give a correct result for whatever code is meant to hit this rule?

cc @abrown @jlb6740 for more thoughts on this (you may remember more history here?)

elliottt · 2022-08-17 (edited)

I think this comment from lower.rs might shed some light: it seems like this combination of snarrow and fcvt_to_sint_sat is implementing the behavior of i32x4.trunc_sat_f64x2_s_zero:

wasmtime/cranelift/codegen/src/isa/x64/lower.rs

Lines 701 to 707 in 0a71df6

	//y = i32x4.trunc_sat_f64x2_s_zero(x) is lowered to:
	//MOVE xmm_tmp, xmm_x
	//CMPEQPD xmm_tmp, xmm_x
	//MOVE xmm_y, xmm_x
	//ANDPS xmm_tmp, [wasm_f64x2_splat(2147483647.0)]
	//MINPD xmm_y, xmm_tmp
	//CVTTPD2DQ xmm_y, xmm_y
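
As a reading aid, the sequence in that comment can be modelled per lane in scalar Rust. This is a sketch under my understanding of the x86 instructions involved (CMPEQPD producing an all-ones mask for non-NaN lanes, MINPD returning its second operand when a comparison fails, and CVTTPD2DQ producing `0x80000000` for out-of-range inputs). The helper names are hypothetical and none of this is Cranelift code.

```rust
/// Model of one CVTTPD2DQ lane: truncate toward zero, or yield the
/// "integer indefinite" value 0x8000_0000 when the result is out of range.
fn cvttpd2dq_lane(v: f64) -> i32 {
    let t = v.trunc();
    if t >= -2147483648.0 && t <= 2147483647.0 {
        t as i32
    } else {
        i32::MIN // also reached for NaN, since both comparisons fail
    }
}

/// Scalar model of the quoted sequence for i32x4.trunc_sat_f64x2_s_zero.
pub fn trunc_sat_f64x2_s_zero(x: [f64; 2]) -> [i32; 4] {
    let lane = |v: f64| -> i32 {
        // CMPEQPD tmp, x: all ones when the lane equals itself (not NaN).
        let mask: u64 = if v.is_nan() { 0 } else { u64::MAX };
        // ANDPS tmp, splat(2147483647.0): i32::MAX as f64, or +0.0 for NaN lanes.
        let tmp = f64::from_bits(mask & 2147483647.0_f64.to_bits());
        // MINPD y, tmp: picks y only when y < tmp, otherwise the second operand.
        let y = if v < tmp { v } else { tmp };
        // CVTTPD2DQ y, y: truncating conversion to a 32-bit integer.
        cvttpd2dq_lane(y)
    };
    // The conversion fills the low two lanes and zeroes the upper two.
    [lane(x[0]), lane(x[1]), 0, 0]
}
```

In this model NaN lanes become 0, large positive lanes clamp to `i32::MAX` through the MINPD step, and large negative lanes clamp to `i32::MIN` because the indefinite value happens to equal the saturated result.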

Here's where the instructions that this special case handles are introduced:
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L1789-L1795

The second argument to snarrow is always all zeros, so perhaps we could check that as well to better catch this case?

cfallin · 2022-08-18

I still feel like I don't fully understand what's going on here; unfortunately that comment doesn't clarify much for me. Why is there a type mismatch if we're composing the float-to-int conversion (which produces an I64x2) with an int-to-int narrow? In other words what change is the TODO proposing in the instruction definitions?

I'm also curious whether removing this special case results in correct execution still; if not then that's a sign that we're trying to fit extra meaning into the combo here that shouldn't exist.

elliottt · 2022-08-18

I think the comment that I migrated over on line 3282 is suggesting that the intent of this lowering is to handle the trunc_sat instruction I mentioned above; hence the type mismatch, since trunc_sat takes floating-point lanes and produces integers. The comment from lower.rs backs this up by suggesting that this is the translation of i32x4.trunc_sat_f64x2_s_zero, and the translation of that operation produces the sequence that this rule matches.

I think that if we implement the signed and unsigned narrowings for I64X2 we could probably remove this special case and have the general case work, but just removing it now will cause test failures. It might be worth keeping the special case if it turns out that it avoids some unnecessary work, and i32x4.trunc_sat_f64x2_{s,u}_zero is a common enough operation.
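
To make the "general case" concrete, here is a scalar Rust sketch of composing the two generic operations, assuming a signed narrowing from I64X2 to I32X4 existed. The function `snarrow_i64x2` is hypothetical (it stands in for the missing lowering), and the array-based signatures are assumptions for illustration.

```rust
/// Model of `fcvt_to_sint_sat` from F64X2 to I64X2. Rust's float-to-int
/// `as` cast truncates toward zero, saturates, and maps NaN to 0.
pub fn fcvt_to_sint_sat(x: [f64; 2]) -> [i64; 2] {
    [x[0] as i64, x[1] as i64]
}

/// Hypothetical `snarrow` on I64X2 inputs: saturate each lane to i32,
/// with `a` in the low half and `b` in the high half of the result.
pub fn snarrow_i64x2(a: [i64; 2], b: [i64; 2]) -> [i32; 4] {
    let sat = |v: i64| v.clamp(i64::from(i32::MIN), i64::from(i32::MAX)) as i32;
    [sat(a[0]), sat(a[1]), sat(b[0]), sat(b[1])]
}

/// The shape discussed in this thread: convert, then narrow against zeros.
pub fn trunc_sat_via_composition(x: [f64; 2]) -> [i32; 4] {
    snarrow_i64x2(fcvt_to_sint_sat(x), [0, 0])
}
```

In this model the composition agrees with a direct saturating f64-to-i32 truncation in the low two lanes, with zeros in the upper two, which is the behavior the special case is meant to provide.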

I've clarified the purpose of the special case, and restricted it further to cases where the rhs of the snarrow is a zero vector, which should match the code generated by the front end. I've also filed #4734 about the missing i64x2 cases for snarrow and unarrow.

cfallin

LGTM now, thanks!

elliottt added 5 commits August 16, 2022 12:43

Add tests for widening

bbc71d9

Lower widening operations in ISLE

5f6210b

Add a test for narrowing operations

4defa84

Migrate snarrow to ISLE

8331db0

Lower unarrow in ISLE

14de2b8

elliottt marked this pull request as ready for review August 16, 2022 20:24

elliottt added the isle label (Related to the ISLE domain-specific language) Aug 16, 2022

github-actions bot added the cranelift (Issues related to the Cranelift code generator) and cranelift:area:x64 (Issues related to x64 codegen) labels Aug 16, 2022

elliottt mentioned this pull request Aug 17, 2022

x64: Lower bitcast, fabs, and fneg in ISLE #4729

Merged

cfallin reviewed Aug 17, 2022

Explain the snarrow special case and add an issue reference

869d459

elliottt requested a review from cfallin August 18, 2022 17:59

cfallin approved these changes Aug 18, 2022

elliottt merged commit 8b60199 into bytecodealliance:main Aug 18, 2022

alexcrichton mentioned this pull request Aug 19, 2022

Register allocation error for vcode: register allocation: EntryLivein #4736

Closed
