
#78303 Add transformation ~v1 & v2 to VectorXxx.AndNot(v1, v2) #81993

Merged: 20 commits into dotnet:main on Sep 4, 2023

Conversation

@SkiFoD (Contributor) commented Feb 11, 2023

I created a draft for issue #78303.
The code converts ~v1 & v2 to VectorXxx.AndNot(v1, v2).
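For reference, VectorXxx.AndNot(left, right) computes left & ~right, which is why operand order comes up repeatedly in the review below: ~v1 & v2 maps to AndNot(v2, v1). A minimal scalar stand-in in plain C++ (one integer standing in for a vector lane; not JIT code):

#include <cstdint>
#include <cstdio>

// AndNot(left, right) == left & ~right, matching the .NET vector API shape.
static uint64_t AndNot(uint64_t left, uint64_t right)
{
    return left & ~right; // a single andn (x64 BMI1) or bic (arm64) instruction
}

int main()
{
    uint64_t v1 = 0xF0F0F0F0F0F0F0F0u;
    uint64_t v2 = 0xFF00FF00FF00FF00u;
    std::printf("%d\n", (~v1 & v2) == AndNot(v2, v1)); // prints 1
}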

@ghost added the community-contribution label (Indicates that the PR has been added by a community member) on Feb 11, 2023
@dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) on Feb 11, 2023
@ghost commented Feb 11, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Three earlier review threads on src/coreclr/jit/morph.cpp (now outdated) were marked resolved.
case NI_SSE_And:
case NI_SSE2_And:
case NI_AVX_And:
case NI_AVX2_And:
Member: What about Vector128/256_And and AdvSimd?

Member: Vector64/128/256_And don't exist outside of import at the moment, so they don't need to be handled.

AdvSimd should be handled, since we want parity between xarch and arm.
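A sketch of what that parity could look like in the intrinsic switch (the same case labels appear in the helper proposed later in this thread):

#if defined(TARGET_XARCH)
    case NI_SSE_And:
    case NI_SSE2_And:
    case NI_AVX_And:
    case NI_AVX2_And:
#elif defined(TARGET_ARM64)
    case NI_AdvSimd_And:
#endif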

case NI_AVX_And:
case NI_AVX2_And:
{
    if (node->GetOperandCount() != 2)
Member: When exactly might it not be 2?

Member: It shouldn't ever not be 2. If it were, we'd have a buggy node.

Contributor: Use an assert instead, then?
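A minimal sketch of that suggestion, using the same node APIs as the surrounding code:

// A binary And intrinsic always has exactly two operands, so encode the
// invariant as an assert instead of a silent bail-out.
assert(node->GetOperandCount() == 2);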

Comment on lines 10890 to 10904
if (op1->OperIs(GT_HWINTRINSIC))
{
    rhs      = op2;
    inner_hw = op1->AsHWIntrinsic();
}
// Transforms v2 & (~v1) to VectorXxx.AndNot(v1, v2)
else if (op2->OperIs(GT_HWINTRINSIC))
{
    rhs      = op1;
    inner_hw = op2->AsHWIntrinsic();
}
else
{
    return node;
}
Member: This is going to miss the optimization for cases like ((x & y) & ~z).

You're going to need to check that it is a hwintrinsic and that it is the relevant xor (xarch and arm) or not (arm only) node.

Member: There is also a potential concern around side effects: ~x & y must be represented as gtNewSimdBinOpNode(AND_NOT, y, x, ...), so the transformation has to preserve side effects with regard to x being evaluated before y.
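One hedged way such a guard could look, assuming the JIT's existing gtCanSwapOrder helper is reachable from this context (notOp and otherOp are illustrative names, not the PR's code):

// notOp is the tree underneath the NOT; otherOp is the other AND operand.
// The AndNot rewrite reverses their evaluation order, so only transform
// when the reordering cannot change observable behavior.
if (!gtCanSwapOrder(notOp, otherOp))
{
    return node; // keep the original tree; side effects would be reordered
}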

Contributor (Author): I have resolved some of the comments and pushed the changes to make sure I understood you correctly.
Could you please give me a hint on how to treat NOT for arm, how to test it, and how to handle the side-effect case?

Comment on lines 10906 to 10909
if ((inner_hw->GetOperandCount() != 2) || (!inner_hw->Op(2)->IsVectorAllBitsSet()))
{
    return node;
}
Member: It would be better to check this as part of handling _Xor below; that way you don't need to check the operand count, and it's easier for the general logic to support AdvSimd_Not on Arm64.

Comment on lines 10922 to 10924
var_types hw_type = node->TypeGet();
CorInfoType hw_coretype = node->GetSimdBaseJitType();
unsigned int hw_simdsize = node->GetSimdSize();
Member: We refer to these as just simdType, simdBaseJitType, and simdSize almost everywhere else in the JIT.

GenTreeHWIntrinsic* xor_hw = op1->AsHWIntrinsic();
switch (xor_hw->GetHWIntrinsicId())
{
#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
Member: This is unnecessary; you're already in a larger identical ifdef from L10873.

That being said, the larger identical ifdef on L10873 should also be unnecessary, given that we're inside a greater #ifdef FEATURE_HW_INTRINSICS.

}

// Transforms v2 & (~v1) to VectorXxx.AndNot(v2, v1)
if (op2->OperIs(GT_HWINTRINSIC))
Member: This check is going to miss the opt if we have something like ((x ^ AllBitsSet) & (y ^ z)). Such a tree could have been transformed into AndNot((y ^ z), x).

In general you're going to need to match (op1 ^ AllBitsSet) up front before determining whether it's a match, and then, if that fails, do the same check for (op2 ^ AllBitsSet).

For Arm64, you'll also need to directly check for ~op1 or ~op2 (since NI_AdvSimd_Not exists).

There are some things we could do to make this overall simpler, but they are slightly more involved changes.

Member @tannergooding commented Mar 6, 2023:

I'd, in general, recommend extracting some of this to a helper.

For example, you could define something like:

genTreeOps GenTreeHWIntrinsic::HWOperGet()
{
    switch (GetHWIntrinsicId())
    {
#if defined(TARGET_XARCH)
        case NI_SSE_And:
        case NI_SSE2_And:
        case NI_AVX_And:
        case NI_AVX2_And:
#elif defined(TARGET_ARM64)
        case NI_AdvSimd_And:
#endif
        {
            return GT_AND;
        }

#if defined(TARGET_ARM64)
        case NI_AdvSimd_Not:
        {
            return GT_NOT;
        }
#endif

#if defined(TARGET_XARCH)
        case NI_SSE_Xor:
        case NI_SSE2_Xor:
        case NI_AVX_Xor:
        case NI_AVX2_Xor:
#elif defined(TARGET_ARM64)
        case NI_AdvSimd_Xor:
#endif
        {
            return GT_XOR;
        }

        // TODO: Handle other cases

        default:
        {
            return GT_NONE;
        }
    }
}

Such a helper allows you to instead switch over the genTreeOps equivalent. So you could have something like:

switch (node->HWOperGet())
{
    case GT_AND:
    {
        GenTree* op1 = node->Op(1);
        GenTree* op2 = node->Op(2);
        GenTree* lhs = nullptr;
        GenTree* rhs = nullptr;

        if (op1->OperIsHWIntrinsic())
        {
            // Try handle: ~op1 & op2
            GenTreeHWIntrinsic* hw     = op1->AsHWIntrinsic();
            genTreeOps          hwOper = hw->HWOperGet();

            if (hwOper == GT_NOT)
            {
                // GT_AND_NOT computes lhs & ~rhs, so rhs must be the operand
                // underneath the NOT, not the NOT node itself.
                lhs = op2;
                rhs = hw->Op(1);
            }
            else if (hwOper == GT_XOR)
            {
                GenTree* hwOp1 = hw->Op(1);
                GenTree* hwOp2 = hw->Op(2);

                if (hwOp1->IsVectorAllBitsSet())
                {
                    lhs = op2;
                    rhs = hwOp2;
                }
                else if (hwOp2->IsVectorAllBitsSet())
                {
                    lhs = op2;
                    rhs = hwOp1;
                }
            }
        }

        if ((lhs == nullptr) && op2->OperIsHWIntrinsic())
        {
            // Try handle: op1 & ~op2
            GenTreeHWIntrinsic* hw     = op2->AsHWIntrinsic();
            genTreeOps          hwOper = hw->HWOperGet();

            if (hwOper == GT_NOT)
            {
                // As above, take the operand underneath the NOT.
                lhs = op1;
                rhs = hw->Op(1);
            }
            else if (hwOper == GT_XOR)
            {
                GenTree* hwOp1 = hw->Op(1);
                GenTree* hwOp2 = hw->Op(2);

                if (hwOp1->IsVectorAllBitsSet())
                {
                    lhs = op1;
                    rhs = hwOp2;
                }
                else if (hwOp2->IsVectorAllBitsSet())
                {
                    lhs = op1;
                    rhs = hwOp1;
                }
            }
        }

        if (lhs == nullptr)
        {
            break;
        }

        GenTree* andnNode = gtNewSimdBinOpNode(GT_AND_NOT, simdType, lhs, rhs, simdBaseJitType, simdSize, true);

        DEBUG_DESTROY_NODE(node);
        INDEBUG(andnNode->gtDebugFlags |= GTF_DEBUG_NODE_MORPHED);

        return andnNode;
    }

    default:
    {
        break;
    }
}

You could of course also extract the NOT op vs op XOR AllBitsSet matching logic to reduce duplication as well.
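For instance, a hypothetical extraction (illustrative helper name, not from the PR) could match both spellings of NOT in one place:

// Hypothetical helper: if "hw" is a bitwise NOT, whether spelled as a NOT
// intrinsic (arm64) or as "value ^ AllBitsSet", return the complemented
// operand; return nullptr when no NOT pattern is present.
GenTree* MatchVectorNot(GenTreeHWIntrinsic* hw)
{
    genTreeOps hwOper = hw->HWOperGet();

    if (hwOper == GT_NOT)
    {
        return hw->Op(1);
    }

    if (hwOper == GT_XOR)
    {
        GenTree* hwOp1 = hw->Op(1);
        GenTree* hwOp2 = hw->Op(2);

        if (hwOp1->IsVectorAllBitsSet())
        {
            return hwOp2;
        }
        if (hwOp2->IsVectorAllBitsSet())
        {
            return hwOp1;
        }
    }

    return nullptr;
}

Each arm of the GT_AND case above then reduces to a single call: take rhs from MatchVectorNot and lhs from the other operand, falling through when the helper returns nullptr.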

Member: Longer term, I think we may want to introduce a "fake" Isa_Not hwintrinsic id for xarch. That would allow morph to transform x ^ AllBitsSet into Isa_Not, which would in turn allow this case's pattern checks to be simplified.

We may also want to normalize cases like Sse_Xor, Sse2_Xor, and AdvSimd_Xor into Vector128_Xor, so we don't need to consider xplat differences. But that will also involve significant refactorings, far more so than introducing a HWOperGet helper for the time being.

@SkiFoD marked this pull request as ready for review on March 8, 2023 06:08
@SkiFoD changed the title from "#78303 Draft: Add transformation ~v1 & v2 to VectorXxx.AndNot(v1, v2)" to "#78303 Add transformation ~v1 & v2 to VectorXxx.AndNot(v1, v2)" on Mar 8, 2023

genTreeOps GenTreeHWIntrinsic::HWOperGet()
{
    switch (GetHWIntrinsicId())
Member: @tannergooding, do you think we can then (not necessarily in this PR) add this to the table where intrinsics are defined?

Member: We should be able to do so, yes.

However, I rather think we'd want to represent it a little differently, to avoid bloating the metadata tables, given that most intrinsics end up as none.


// Transforms:
// 1.(~v1 & v2) to VectorXxx.AndNot(v1, v2)
// 2.(v1 & (~v2)) to VectorXxx.AndNot(v1, v2)
Member: Did you mean (v2, v1) here?

Member @EgorBo left a comment: LGTM, thanks!

@JulieLeeMSFT (Member) commented:

@SkiFoD, please resolve the merge conflict. The .NET 8 RC1 snap is 8/14.

@JulieLeeMSFT added this to the 9.0.0 milestone on Aug 14, 2023
@SkiFoD (Contributor, Author) commented Aug 14, 2023

OK, I will look at it as soon as possible.

@EgorBo merged commit 13a225f into dotnet:main on Sep 4, 2023
127 checks passed
@EgorBo (Member) commented Sep 4, 2023

@SkiFoD, thanks! Sorry for the delay; there was also a small issue in the codegen that I fixed.

jakobbotsch added a commit to jakobbotsch/runtime that referenced this pull request Sep 11, 2023
The new logic introduced in dotnet#81993 would swap the LHS and RHS of the
operands without any additional checks for side effects.

Fix dotnet#91855
jakobbotsch added a commit that referenced this pull request Sep 11, 2023
…91882)

The new logic introduced in #81993 would swap the LHS and RHS of the
operands without any additional checks for side effects.

Fix #91855
@ghost ghost locked as resolved and limited conversation to collaborators Oct 4, 2023
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), community-contribution (Indicates that the PR has been added by a community member)
7 participants