COMPlus_EnableHWIntrinsic=0 no longer disables SSE+ #35605
Comments
I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.
cc @davidwrighton, @CarolEidt, @echesakovMSFT
If everyone is fine with the suggested approach, I'm happy to put up a PR. To reiterate, the suggested approach is the one outlined in the issue description: have `EnableHWIntrinsic=0` only suppress the creation of `HWIntrinsic` nodes, rather than also reporting the underlying ISAs as unsupported.
The approach seems reasonable, though with #35421 it would be good to understand (and document) how the two features (SIMD and HWIntrinsic) cooperate.
This might require having a clear understanding of what all of the knobs actually control. What I would, personally, like to see is that:
- For `EnableHWIntrinsic`, …
- For `FeatureSIMD`, …
- For the flag controlling the size of `Vector<T>`, …
I believe we have way too many knobs here, and these flags are far too tied to JIT-internal behaviors and don't work well in the presence of AOT-generated code. Do we have any evidence for what combinations of flags are actually used in practice? How do we intend for these flags to be used? And if we intend for customers to use them, not just runtime developers, these flags need explicit testing.
None of the JIT configuration knobs really work "well" with AOT-generated code, as any code which has been AOT'd will completely ignore them until it is rejitted.

I don't know if we have any metrics besides our own testing + ML.NET.
The checks are hierarchical, so code is typically written as a cascade of `IsSupported` checks ending in a software fallback:

```csharp
if (Sse41.IsSupported)
{
    // SSE4.1 code path
}
else if (Sse3.IsSupported)
{
    // SSE3 code path
}
else if (Sse.IsSupported)
{
    // SSE code path
}
else
{
    // software fallback
}
```

You can define a CI job with each lower ISA disabled and ultimately have tested all configurations and the software fallback, as we do in our outerloop and as ML.NET does, etc.
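For example, such a CI matrix might look like the following sketch (the knob names follow the `COMPlus_Enable{Isa}` pattern; I'm assuming here that disabling a lower ISA also disables the ISAs that build on it, matching the hierarchical behavior described above):

```
COMPlus_EnableSSE41=0 dotnet test   # falls back to the Sse3 path
COMPlus_EnableSSE3=0  dotnet test   # falls back to the Sse path
COMPlus_EnableSSE=0   dotnet test   # falls back to the software path
```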
Do we have a proposal for how to test this? The runtime in general isn't very unit-testable, and we are generally stuck with inserting some asserts that things "look correct".
I expect that testing could work based on looking at the results of the `IsSupported` flag values: making sure that the appropriate `IsSupported` managed APIs were enabled/disabled, and that the `IsHardwareAccelerated` API would return true/false. Effectively, I would like a set of tests that would cover the `IsSupported` flags and ensure that they behave as expected. Actual generation of native instructions is less of a concern from my POV. In addition, eventually I'm going to get us to support the native calling conventions for the vector types. At that point, the exact set of supported instruction sets will become an issue that not only affects codegen, but also has implications for the runtime, just like the current logic on …
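A minimal sketch of the kind of `IsSupported`-based consistency test described above. The `TEST_EXPECT_SSE41` variable is hypothetical, standing in for whatever mechanism the CI job uses to communicate which ISA it disabled; returning 100 follows the existing runtime test convention for "pass".

```csharp
using System;
using System.Runtime.Intrinsics.X86;

class IsSupportedConsistencyTest
{
    static int Main()
    {
        // The ISA hierarchy must stay self-consistent: a higher ISA being
        // supported implies every ISA below it is supported as well.
        if (Sse42.IsSupported && !Sse41.IsSupported) return 101;
        if (Sse41.IsSupported && !Ssse3.IsSupported) return 102;
        if (Ssse3.IsSupported && !Sse3.IsSupported)  return 103;
        if (Sse3.IsSupported  && !Sse2.IsSupported)  return 104;
        if (Sse2.IsSupported  && !Sse.IsSupported)   return 105;

        // Cross-check against what the job configuration claims (hypothetical).
        string expected = Environment.GetEnvironmentVariable("TEST_EXPECT_SSE41");
        if (expected != null && Sse41.IsSupported != (expected != "0"))
            return 106;

        return 100; // pass
    }
}
```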
Whatever the approach, unit testing needs the environment variables to force execution down each path. Currently we don't test all paths, and as we add more ARM paths, the software path will be less tested as well. There is an issue to track doing this better: #950
This is pretty fundamental to the way we have exposed the ISAs for hardware intrinsics. Since we don't explicitly test the full range of actual hardware supported, we use these options as an alternative, though note that we've had encoding and other issues in the past that slipped by our tests. We have a set of outerloop tests that run with the full complement of ISA options, though they don't combine those with the range of other outerloop testing we do (e.g. GC stress, jitStressRegs, etc.)
OK, let me revise my statement. I believe we have way too many knobs that are easy to misunderstand the consequences of adjusting. There are the knobs attached to instruction sets, which are now handled consistently in the JIT, VM, and AOT compiler, such that if there are disagreements amongst the various components, the JIT runtime behavior will win out if the method is jitted. (It will be ignored for methods that have been precompiled via AOT until tiered compilation kicks in.) If we move the consequences of the `Enable{Isa}` switches to be visible at the runtime level as well as the JIT, it will work with crossgen2 to produce a consistent meaning. There is the FeatureSIMD stuff, which I don't believe actually works in a consistent fashion today, and which I thought was primarily designed to disable the Vector128/256 intrinsics. And finally there is the `EnableHWIntrinsic` switch, which is the focus of this discussion. As it's described in terms of GenTree nodes, I have no idea what it's supposed to mean, as I'm not familiar with what it means to make one type of GenTree vs. another. If we add a supported switch for adjusting the …
I wonder if we couldn't tweak this to be all expressed in terms of access to the various InstructionSets; then it becomes somewhat simpler to model across the broad spectrum of the system. For instance, … Then we could have one model of communication across the system, the instruction sets, and yet still have the somewhat simplified controls of `EnableHWIntrinsic` and the `Vector<T>` size control.
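A hypothetical sketch of this "everything is an instruction set" model: the user-facing knobs are just macros that expand to an instruction-set mask, and that mask is the one thing communicated between the VM, JIT, and AOT compiler. All names below are illustrative, not real runtime flags.

```csharp
using System;

[Flags]
enum InstructionSets
{
    None  = 0,
    SSE   = 1 << 0,
    SSE2  = 1 << 1,
    SSE3  = 1 << 2,
    SSSE3 = 1 << 3,
    SSE41 = 1 << 4,
    SSE42 = 1 << 5,
    AVX   = 1 << 6,
    AVX2  = 1 << 7,
    All   = SSE | SSE2 | SSE3 | SSSE3 | SSE41 | SSE42 | AVX | AVX2,
}

static class KnobExpansion
{
    static InstructionSets ComputeEnabledSets(bool enableHWIntrinsic, int maxVectorTBits)
    {
        // EnableHWIntrinsic=0 expands to "no instruction sets at all".
        if (!enableHWIntrinsic)
            return InstructionSets.None;

        // The Vector<T> size knob expands to masking off the AVX family,
        // which is what would imply a 32-byte Vector<T>.
        var sets = InstructionSets.All;
        if (maxVectorTBits < 256)
            sets &= ~(InstructionSets.AVX | InstructionSets.AVX2);
        return sets;
    }

    static void Main() =>
        Console.WriteLine(ComputeEnabledSets(enableHWIntrinsic: true, maxVectorTBits: 128));
}
```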
That's what we had in 3.0; however, this means disabling HWIntrinsics impacts the size of `Vector<T>`.
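For context on why that matters: `Vector<T>`'s size is observable from managed code, so tying it to `EnableHWIntrinsic` makes the switch a visible behavioral change rather than a pure codegen detail. A quick probe:

```csharp
using System;
using System.Numerics;

class VectorTProbe
{
    static void Main()
    {
        // Prints 32 when AVX2 gives a 32-byte Vector<T>, 16 when only
        // SSE/SSE2 are in use.
        Console.WriteLine($"Vector<byte>.Count = {Vector<byte>.Count}");
        Console.WriteLine($"Vector.IsHardwareAccelerated = {Vector.IsHardwareAccelerated}");
    }
}
```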
Ah, makes sense. Well, if it's necessary, it's necessary. What I ask, though, is that we consider the consequences of AOT-generated code. As of my somewhat recent change, the compiler can and will generate logic which depends on these flags, and we need to build the appropriate mechanisms to disallow usage of semantically wrong code.
This was fixed back in #37882, where the SimdAsHWIntrinsic logic was updated to additionally check …
In .NET Core 3.0/3.1, setting `COMPlus_EnableHWIntrinsic=0` would disable all HWIntrinsics. That is, it would mark the shared helper intrinsics (`Vector64`, `Vector128`, and `Vector256`) as unsupported and would also cause any platform-specific intrinsics (SSE+ on x86) to be unsupported.

However, in the current master branch, setting this only disables the shared helper intrinsics. We should likely clean this up so that the previous behavior stays the same.
That being said, it might be beneficial to change how it was functioning as compared to .NET Core 3.1 (I had logged a bug for this a while back: #11701). That is, rather than having `EnableHWIntrinsic=0` also cause the compiler to report that SSE+ is unsupported, we should instead have it only impact the creation of `HWIntrinsic` nodes. We could do this via a similar mechanism to `FeatureSIMD`, which currently has a `bool featureSIMD` field and uses it to early-exit from the `impSIMDIntrinsic` code paths, and we could additionally add it as an assert to the `gtNewSimdHWIntrinsic` and `gtNewScalarHWIntrinsic` methods. This would allow the compiler to continue reporting and keying off of what ISAs the hardware supports regardless of whether the user has HWIntrinsics enabled/disabled.
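A hypothetical C# analogue of that mechanism (the real change would live in the JIT's C++ sources; all names here are illustrative): one flag, checked once at import time so disabled intrinsics fall back to their managed software implementation, plus an assert in the node factories so no path can create `HWIntrinsic` nodes by accident.

```csharp
using System;
using System.Diagnostics;

class JitSketch
{
    // Mirrors the existing `bool featureSIMD` field pattern.
    static readonly bool featureHWIntrinsic =
        Environment.GetEnvironmentVariable("COMPlus_EnableHWIntrinsic") != "0";

    static object ImportHWIntrinsicCall(string name)
    {
        // Early exit, analogous to how impSIMDIntrinsic bails when
        // featureSIMD is false: the call is imported as a regular call
        // and the managed fallback body runs instead.
        if (!featureHWIntrinsic)
            return null;

        return NewSimdHWIntrinsicNode(name);
    }

    static object NewSimdHWIntrinsicNode(string name)
    {
        // Analogue of the asserts proposed for gtNewSimdHWIntrinsic and
        // gtNewScalarHWIntrinsic: no path may create a node when disabled.
        Debug.Assert(featureHWIntrinsic);
        return name;
    }

    static void Main() =>
        Console.WriteLine(ImportHWIntrinsicCall("Sse41.Max") != null);
}
```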
category:testing
theme:intrinsics
skill-level:intermediate
cost:medium