Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM64-SVE: Avoid containing non-embedded conditional select #105719 #105812

Merged
merged 8 commits into from
Aug 8, 2024

Conversation

a74nh
Copy link
Contributor

@a74nh a74nh commented Aug 1, 2024

Fixes #105719

CONDSELECT(TRUE, EMBBEDMASKOP(), 0)
For this scenario, during codegeneration, the SELECT will not be generated and instead just generate the embedded mask operation. To do this, op3 can be contained.

CONDSELECT(TRUE, VECTOR, 0)
For this senario, the SELECT operation will be generated (and then in emit a MOV will be generated instead). To do this, op3 cannot be contained and must be generated into a register.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 1, 2024
@a74nh a74nh changed the title ARM64-SVE: Avoid containing non-embedded conditional select ARM64-SVE: Avoid containing non-embedded conditional select #105719 Aug 1, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Aug 1, 2024
@a74nh
Copy link
Contributor Author

a74nh commented Aug 1, 2024

@dotnet/arm64-contrib @kunalspathak

@a74nh a74nh marked this pull request as ready for review August 1, 2024 15:24
@a74nh
Copy link
Contributor Author

a74nh commented Aug 5, 2024

Added testing.

@jakobbotsch
Copy link
Member

cc @dotnet/jit-contrib

@TIHan @amanasifkhalid @tannergooding Can one of you review?

Copy link
Member

@amanasifkhalid amanasifkhalid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@amanasifkhalid
Copy link
Member

@a74nh Build Analysis is blocked by "experimental feature" warnings for the test you added; could you please add <NoWarn>SYSLIB5003</NoWarn> to the test's project file?

@jakobbotsch
Copy link
Member

@a74nh Build Analysis is blocked by "experimental feature" warnings for the test you added; could you please add <NoWarn>SYSLIB5003</NoWarn> to the test's project file?

(You may want to make it <NoWarn>$(NoWarn);SYSLIB5003</NoWarn> to inherit all the other ones.)

Comment on lines 3922 to 3928
if (op3->IsVectorZero() && op1->IsMaskAllBitsSet() && op2->IsEmbMaskOp())
{
// When we are merging with zero, we can specialize
// and avoid instantiating the vector constant.
// Do this only if op1 was AllTrueMask
// Do this only if op1 was AllTrueMask and op2 is embedded.
MakeSrcContained(node, op3);
}
Copy link
Member

@tannergooding tannergooding Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right fix. The zero (and in fact any constant or other value here) is still containable.

op1 is AllBitsSet so therefore nothing from op3 can ever be selected, so it is "unused". This means it is valid to drop the sel entirely and just emit op2 directly.

The only time we should be hitting this path is when op2 is an operation that requires a mask to be specified (even if unused) or some manually written user code in minopts. For the latter, it's still fine to contain the zero constant and just use the op2 register for both inputs (containment is a basic operation that happens at all codegen levels).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op1 is AllBitsSet so therefore nothing from op3 can ever be selected, so it is "unused". This means it is valid to drop the sel entirely and just emit op2 directly.

Updated to do this instead.

Required the embedded HWIntrinsic check, otherwise lowering goes into an infinite loop adding and removing conditional selects.

@@ -15,6 +15,7 @@ pr:
- src/coreclr/jit/emitarm64sve.cpp
- src/coreclr/jit/emitfmtsarm64sve.h
- src/coreclr/jit/lsraarm64.cpp
- src/coreclr/jit/lowerarmarch.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add entries for following?

  • lsraarmarch.cpp
  • codegenarmarch.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
Also alphabetically ordered the list.

@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label Aug 8, 2024
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amanasifkhalid amanasifkhalid merged commit da757a1 into dotnet:main Aug 8, 2024
113 of 116 checks passed
@a74nh a74nh deleted the selreg4_github branch August 8, 2024 15:14
@github-actions github-actions bot locked and limited conversation to collaborators Sep 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT SVE: Assertion failed 'isVectorRegister(reg4)' during 'Generate code'
5 participants