Add support for AvxVnni instructions under Experimental. #51998
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @tannergooding
Issue Details: This is for #43780.
CC. @echesakovMSFT for the HWIntrinsics side
src/coreclr/jit/emitxarch.cpp
@@ -15391,6 +15401,10 @@ emitter::insExecutionCharacteristics emitter::getInsExecutionCharacteristics(ins
case INS_vfnmsub132ss:
case INS_vfnmsub213ss:
case INS_vfnmsub231ss:
case INS_vpdpbusd: // will be populated when the HW becomes publicly available
Can we get a small issue tracking this to ensure it doesn't get lost/forgotten?
Do you want me to open an issue now, or at the time when the code could be merged?
Either is fine, just as long as we are tracking updating it once the official numbers become available.
@tannergooding, #52121 is tracking this.
Thanks!
node->SetRegNum(op1Reg);
targetReg = op1Reg;
We aren't doing this for any other HWIntrinsic instructions; instead we generate a movaps as required to put op1Reg in targetReg.
I'd expect register allocation to have already done everything at this point, but I'm not an expert here. @kunalspathak ?
@kunalspathak Does this change look good to you, or could you suggest how we should handle it? Thanks.
I agree with @tannergooding. I have limited knowledge here, but I can see that we generate movaps if targetReg != op1Reg going forward in genHWIntrinsic_R_R_R_RM(). Is there a situation where that doesn't happen and we have to do targetReg = op1Reg explicitly?
I double checked this part of the code and agree that we don't need to do this. I will update the code.
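The point settled in this thread can be illustrated with a small standalone model. This is only a sketch: `ToyEmitter`, `emitRmwIntrinsic`, and the integer register numbering are hypothetical stand-ins invented for illustration, not the real emitter API or the actual body of genHWIntrinsic_R_R_R_RM. It assumes only what the comments above state, namely that the helper copies op1Reg into targetReg with a movaps when the two differ, so the caller has no need to retarget the node.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for JIT register types; not the real JIT definitions.
using Reg = int;
constexpr Reg REG_NA = -1;

// Records, as text, the instructions a simplified code generator would emit.
struct ToyEmitter
{
    std::vector<std::string> emitted;

    void emit(const std::string& text) { emitted.push_back(text); }
};

// Simplified model of a read-modify-write helper. The instruction accumulates
// into its first operand, so op1 has to be in targetReg before it is emitted.
void emitRmwIntrinsic(ToyEmitter& emitter, const std::string& ins,
                      Reg targetReg, Reg op1Reg, Reg op2Reg, Reg op3Reg)
{
    // Invariants are checked once here rather than at every call site.
    assert(targetReg != REG_NA);
    assert(op1Reg != REG_NA);

    if (targetReg != op1Reg)
    {
        // Copy op1 into the destination instead of overriding the node's register.
        emitter.emit("movaps r" + std::to_string(targetReg) + ", r" + std::to_string(op1Reg));
    }

    emitter.emit(ins + " r" + std::to_string(targetReg) + ", r" + std::to_string(op2Reg) +
                 ", r" + std::to_string(op3Reg));
}
```

In this model, a call with targetReg equal to op1Reg records only the vpdpbusd line, while any other destination gets a movaps first. That is why forcing targetReg = op1Reg at the call site is unnecessary.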
While I was checking this part, I ran into an issue when I set COMPlus_jitDump=RunBasicScenario_Load. The error message I saw is the following (the same error also occurs for other APIs, such as the ones in Fma_vector256):
Assert failure(PID 5076 [0x000013d4], Thread: 26664 [0x6828]): Assertion failed 'type < CORINFO_TYPE_COUNT' in 'JIT.HardwareIntrinsics.X86.SimpleTernaryOpTest__MultiplyAddDouble:RunBasicScenario_Load():this' during 'Morph - Global' (IL size 137)
File: ...\runtime\src\coreclr\jit\ee_il_dll.hpp Line: 273
Image: ...\runtime\artifacts\bin\coreclr\windows.x64.Debug\corerun.exe
@tannergooding, do you know what is happening here?
I'd need to see the full stack trace, but I suspect this would be fixed by merging with (or rebasing onto) the latest dotnet/main.
...raries/System.Private.CoreLib/src/System/Runtime/Intrinsics/X86/Avx2.PlatformNotSupported.cs
Do we consider https://github.com/dotnet/designs/blob/main/accepted/2021/preview-features/preview-features.md to be approved? Should we start using here instead of the |
...ies/System.Runtime.Intrinsics.Experimental/ref/System.Runtime.Intrinsics.Experimental.csproj
@terrajobst, what is your preference here? Do we want to ship these in |
src/tests/JIT/HardwareIntrinsics/X86/AvxVnni/MultiplyWideningAndAdd.Int16.cs
In the past, we have avoided changing API names between preview and RTM, we preferred packages. The reason being that having to change an API name guarantees a breaking change for everyone that used the feature, as opposed to only break the people who happen to depend on the parts that we changed based on feedback, which is usually only a fraction of all consumers. Using packages doesn't work well for cases where the API practically needs to live in existing assemblies or on existing types, which is precisely what the Preview design is meant to address. So yes, I'm in favor of doing that instead of adding an |
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/X86/AvxVnni.cs
src/libraries/System.Runtime.Intrinsics.Experimental/System.Runtime.Intrinsics.Experimental.sln
/// <summary>
/// __m128i _mm_dpbusd_epi32 (__m128i src, __m128i a, __m128i b)
/// VPDPBUSD xmm, xmm, xmm
nit: Based on the JIT support, this (and the other APIs) should be:
- /// VPDPBUSD xmm, xmm, xmm
+ /// VPDPBUSD xmm, xmm, xmm/m128
and using ymm/m256 for the Vector256<T> APIs
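For readers unfamiliar with the instruction named in these doc comments, here is a scalar reference model of what `_mm_dpbusd_epi32` computes, written as a sketch rather than as the runtime's implementation. The function name `dpbusd128` and the `std::array` representation of the 128-bit operands are assumptions made for illustration. Each 32-bit lane takes four unsigned bytes from `a`, multiplies them by the corresponding signed bytes from `b`, sums the four products, and adds the sum to the matching lane of `src` without saturation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of the 128-bit VPDPBUSD (non-saturating) operation.
// src holds four signed 32-bit accumulators, a holds sixteen unsigned bytes,
// and b holds sixteen signed bytes.
std::array<int32_t, 4> dpbusd128(const std::array<int32_t, 4>& src,
                                 const std::array<uint8_t, 16>& a,
                                 const std::array<int8_t, 16>& b)
{
    std::array<int32_t, 4> dst{};

    for (std::size_t lane = 0; lane < 4; ++lane)
    {
        int32_t sum = 0;
        for (std::size_t i = 0; i < 4; ++i)
        {
            // unsigned byte * signed byte, widened before multiplying
            sum += static_cast<int32_t>(a[4 * lane + i]) * static_cast<int32_t>(b[4 * lane + i]);
        }

        // Four u8 * s8 products always stay within this range.
        assert(sum >= -130560 && sum <= 129540);

        // Accumulate with 32-bit wraparound; unsigned arithmetic avoids
        // signed-overflow undefined behaviour in the model itself.
        dst[lane] = static_cast<int32_t>(static_cast<uint32_t>(src[lane]) + static_cast<uint32_t>(sum));
    }

    return dst;
}
```

The xmm/m128 form suggested in the review only changes where the last operand comes from (register or memory); the arithmetic is the same. The Vector256<T> overloads would apply the same per-lane rule to eight lanes.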
@jkotas, @davidwrighton do the corinfoinstructionset changes require the The PR otherwise looks correct to me at this point. |
Yes. |
@weilinwa, as per Jan, the JITEEVersionIdentifier needs to be updated as part of this. That basically just involves updating this constant using For the current CI failures, it looks like the following changes are needed:
@tannergooding, thanks for the very helpful info! I've made changes accordingly. Could you take a look at the new changes?
@echesakovMSFT, @dotnet/jit-contrib this should have another pair of eyes before it is merged.
Sure, I will try to take a look tomorrow or early next week.
The Mono llvmaot lanes are still failing. :( Go ahead and disable the tests via issues.targets; I'll create an issue to follow up and fix this within Mono.
Thanks @imhameed. @weilinwa, disabling the tests should just require adding the following lines to this
<ExcludeList Include="$(XunitTestBinBase)/JIT/HardwareIntrinsics/X86/AvxVnni/*">
    <Issue>Mono crashes when new unsupported intrinsic groups are added, https://github.com/dotnet/runtime/issues/53078</Issue>
</ExcludeList>
@tannergooding and @imhameed, I've added the ItemGroup to disable the tests but still see failures. Could you please take a look and let me know whether I'm disabling them incorrectly or whether something else is causing the problem? Thanks.
Use |
LGTM
case NI_AVXVNNI_MultiplyWideningAndAdd:
case NI_AVXVNNI_MultiplyWideningAndAddSaturate:
{
assert(targetReg != REG_NA);
These assertions are not really needed here; we check the same invariants at the beginning of genHWIntrinsic_R_R_R_RM.
Hello @tannergooding! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
Thanks for all the work here @weilinwa. I've resolved the merge conflict in the JITEEVersionGUID and have marked this for auto-merge once CI passes. We do still need a tracking issue for #51998 (comment) |
Thanks @tannergooding and everybody for all the help. #52121 is the tracking issue we need.
@tannergooding, the CI still failed in some Mono-related tests :( . It also failed in System.Security.Cryptography.Xml.Tests. Could you please help me find solutions to these problems? Thanks!
I've requeued the failing jobs to see if they'll pass now. They looked like flaky tests, so hopefully they pass this time. If they do pass, that's great; if not, we may be able to merge anyway if there are existing issues tracking them.