Arm64/SVE: JitStress/JitStressRegs fixes #102543

Merged (17 commits) on May 22, 2024

kunalspathak
Member

  • For FMA, handle the case where falseReg == embMaskOp1Reg
  • Work around mov when it is used as an alias for sel, because predicateRegister/vectorRegister are the same
  • When an intrinsic is wrapped in ConditionalSelect, check the intrinsic's flag to see whether it needs a low mask register
  • Various APIs were missing the HW_Flag_LowMaskedOperation flag
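
The ConditionalSelect check in the list above can be sketched roughly as follows. This is a simplified model, not the JIT's code: `IntrinsicInfo`, `maskCandidates`, the bit value of the flag, and the candidate bitmasks are all hypothetical. It also assumes that "low" means predicate registers p0-p7 out of p0-p15, which matches how many SVE predicated instructions encode their governing predicate.

```cpp
#include <cstdint>

// Hypothetical, simplified model; these are not the JIT's real data structures.
enum HWIntrinsicFlag : std::uint32_t
{
    HW_Flag_None               = 0,
    HW_Flag_LowMaskedOperation = 1u << 0, // flag name is from the PR; the bit value is made up
};

struct IntrinsicInfo
{
    const char*   name;
    std::uint32_t flags;
};

// Bit i set means predicate register p<i> is a candidate for the mask.
constexpr std::uint16_t kAllPredicates = 0xFFFF; // p0-p15
constexpr std::uint16_t kLowPredicates = 0x00FF; // p0-p7 (assumed meaning of "low")

// Candidates for the mask of ConditionalSelect(mask, op2, falseValue).
// `wrapped` is op2 when op2 is a hardware intrinsic, otherwise nullptr.
inline std::uint16_t maskCandidates(const IntrinsicInfo* wrapped)
{
    if (wrapped != nullptr && (wrapped->flags & HW_Flag_LowMaskedOperation) != 0)
    {
        // The wrapped instruction will consume the mask directly,
        // so the mask must live in a low predicate register.
        return kLowPredicates;
    }
    return kAllPredicates;
}
```

In this model, an intrinsic that is missing the flag silently gets the full p0-p15 set, which is the kind of mistake that register stress modes tend to expose.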

There are still some functional failures with JitStress/JitStressRegs, but I wanted to send this out.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 22, 2024
@kunalspathak
Member Author

@dotnet/arm64-contrib

@kunalspathak kunalspathak requested a review from TIHan May 22, 2024 05:55
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

If falseReg == embMaskOp2Reg, we simply generate:

```
            sel     z16.s, p7, z9.s, z10.s
            mla     z16.s, p7/m, z10.s, z11.s
```

Here `z10` holds `falseReg` and `embMaskOp2Reg`.
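
To see why the two-instruction sequence above is safe when one register serves as both `falseReg` and `embMaskOp2Reg`, here is a small lane-by-lane model of `sel` and merging-predicated `mla`. It is only a sketch: the 4-lane width, the type aliases, and the function names are illustrative, and the real vector length is implementation-defined.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4; // illustrative lane count only

using Vec  = std::array<std::uint32_t, kLanes>; // models .s (32-bit) lanes
using Pred = std::array<bool, kLanes>;

// sel zd, pg, zn, zm: active lanes take zn, inactive lanes take zm.
inline Vec sel(const Pred& pg, const Vec& zn, const Vec& zm)
{
    Vec zd{};
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        zd[i] = pg[i] ? zn[i] : zm[i];
    }
    return zd;
}

// mla zda, pg/m, zn, zm: active lanes accumulate zn * zm into zda;
// inactive lanes keep their old zda value (merging predication).
inline void mla(Vec& zda, const Pred& pg, const Vec& zn, const Vec& zm)
{
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        if (pg[i])
        {
            zda[i] += zn[i] * zm[i];
        }
    }
}

// Models the sequence with z10 acting as both falseReg and embMaskOp2Reg.
inline Vec selThenMla(const Pred& p7, const Vec& z9, const Vec& z10, const Vec& z11)
{
    Vec z16 = sel(p7, z9, z10); // inactive lanes already hold the false value
    mla(z16, p7, z10, z11);     // z10 is still intact because sel wrote z16
    return z16;
}
```

Active lanes end up with `z9 + z10 * z11` and inactive lanes with `z10`. The `sel` writes only the destination, so the shared register is unmodified when `mla` reads it.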
@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label May 22, 2024
Contributor

@a74nh a74nh left a comment

LGTM now

Contributor

@TIHan TIHan left a comment

LGTM. Glad we are catching these.

@kunalspathak kunalspathak deleted the sve-fixes branch May 22, 2024 20:51
steveharter pushed a commit to steveharter/runtime that referenced this pull request May 28, 2024
* handle case for FMA where falseReg == embMaskOp1Reg

* workaround because predicateRegister/vectorRegister are same

* When intrinsic is wrapped in ConditionalSelect, make sure to check its LOW_PREDICATE flag

* Mark AddAcross with HW_Flag_LowMaskedOperation

* double check if ConditionalSelect's op2 is hwintrinsic

* Mark Max with HW_Flag_LowMaskedOperation

* Mark MaxAcross with HW_Flag_LowMaskedOperation

* Mark MinNumber/MaxNumber/MinNumberAcross/MaxNumberAcross with HW_Flag_LowMaskedOperation

* Mark Min/MinAcross with HW_Flag_LowMaskedOperation

* fix the workaround for predicate vs. vector register

* fix the predicate mask preference

* Introduce INS_SCALABLE_OPTS_PREDICATE_MERGE_MOV

* jit format

* revert change to csproj

* remove assert that can not happen for FMA

if falseReg == embMaskOp2Reg, we simply generate:

```
            sel     z16.s, p7, z9.s, z10.s
            mla     z16.s, p7/m, z10.s, z11.s
```

Here `z10` holds `falseReg` and `embMaskOp2Reg`.

* revert a condition added for workaround of predicate == vector register

* remove the extra comment
Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jun 22, 2024
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI arm-sve Work related to arm64 SVE/SVE2 support

3 participants