Intel architecture improvements for .NET 9 #93196
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
This issue describes planned improvements to Intel architecture (x86, x64) ISA support for .NET 9. In .NET 8, AVX-512 ISA support was added (see #77034). In .NET 9, this support will be further improved and leveraged for improved performance, especially with expanded libraries utilization of the recently implemented AVX-512 support. Investigations and implementation will start to support the newly announced AVX10.
Libraries work
RyuJIT feature work
RyuJIT optimization work
AVX10
AVX10 is a new set of vector ISA extensions, described here. We expect to begin preliminary work to support AVX10 in .NET 9, at least the parts that most directly map to the already supported AVX-512.
CI/testing work
Debugging / diagnostics work
API design work |
Is there any interest in adding the workaround for the JCC erratum (#35730) in .NET 9? I've seen minor codegen improvements reported as huge regressions because the code started to hit this issue. |
@AndyAyersMS has expressed a desire to at least have a mode that could be used for performance testing to avoid the JCC erratum. Whether we could enable this always would depend on how uniform the improvements would be. It is expected there would be some code size regressions -- possibly significant -- due to the need to insert NOPs. |
Didn't we already accept that tradeoff with loop alignment? |
Yes, but this could be a very different magnitude of regression that will need to be measured. |
I went ahead and created #93243 related to adding a JIT mode to avoid the JCC erratum, and linked it here. |
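For readers unfamiliar with the JCC erratum discussed in the comments above: on affected Intel processors, a jump instruction (including a macro-fused compare-and-branch pair) is penalized when its bytes cross a 32-byte boundary or when it ends exactly on one. The sketch below, written in Python for brevity, shows that address check and the NOP padding a mitigation would need. The function names are hypothetical, and this is only an illustration of the condition, not how RyuJIT implements or plans to implement the mode.

```python
BOUNDARY = 32  # the erratum is defined in terms of 32-byte aligned blocks


def touches_boundary(addr: int, length: int) -> bool:
    """Return True if an instruction of `length` bytes starting at `addr`
    crosses a 32-byte boundary or ends exactly on one."""
    # addr + length is the address of the first byte after the instruction.
    # If that byte lies in a later 32-byte block than the first byte, the
    # instruction either straddles a boundary or ends right at it.
    return addr // BOUNDARY != (addr + length) // BOUNDARY


def nop_padding(addr: int, length: int) -> int:
    """Return how many bytes of NOP padding to insert before the instruction
    so that it starts on the next 32-byte boundary (0 if it is unaffected)."""
    assert 0 < length < BOUNDARY, "x86 instructions are far shorter than 32 bytes"
    if not touches_boundary(addr, length):
        return 0
    return (-addr) % BOUNDARY
```

For example, a 2-byte conditional jump at offset 0x1E ends on the boundary and would need 2 bytes of padding, while the same instruction at offset 0x1C is unaffected. That padding is the source of the code size regressions mentioned above.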
I added Vector512 support for Min/Max of simple numeric datatypes in this PR: #93369 |
What about the upcoming APX extension? It looks like a major change of x86-64. I can see discussions around ABI for APX in GCC mail thread: https://gcc.gnu.org/pipermail/gcc/2023-July/242154.html https://gcc.gnu.org/pipermail/gcc-help/2023-August/142801.html Maybe it's too early for .NET to adopt APX, but I'd like to see the estimated timeline. Should we wait for MSVC to define the calling convention? |
We want to have hardware available on which it can run. While Intel hasn't given an official timeline as of yet, such hardware is most likely not in the .NET 9 lifetime which ships in November 2024 and will be out of support around May 2026. I expect this work will be done for .NET 10 which will likely ship around November 2025 (assuming we don't change our current pacing of releases) and be out of support November 2028. |
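The dates in the comment above follow from simple month arithmetic. The sketch below assumes support windows of 18 months for .NET 9 and 36 months for .NET 10; those window lengths are inferred from the dates quoted in the comment, not taken from an official schedule, and the helper name is hypothetical.

```python
def add_months(year: int, month: int, months: int) -> tuple[int, int]:
    """Add `months` to a (year, month) pair and return the new (year, month)."""
    index = year * 12 + (month - 1) + months  # months counted from year 0
    return index // 12, index % 12 + 1


# Assumed support windows, inferred from the dates quoted in the comment.
NET9_SUPPORT_MONTHS = 18
NET10_SUPPORT_MONTHS = 36

net9_end = add_months(2024, 11, NET9_SUPPORT_MONTHS)    # ships November 2024
net10_end = add_months(2025, 11, NET10_SUPPORT_MONTHS)  # expected November 2025

print(net9_end, net10_end)
```

Under these assumptions the results are May 2026 and November 2028, matching the comment.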
Would using Intel SDE not be enough for testing the support for it? It seems to already have support for emulating AVX10 and APX. |
There's no point in scheduling work to be done for hardware that doesn't exist yet, particularly if that hardware is unlikely to exist within the lifetime of a release. That is, we know that AVX10 is going to exist for Granite Rapids, as per the official announcement: https://www.intel.com/content/www/us/en/content-details/784267/intel-advanced-vector-extensions-10-intel-avx10-architecture-specification.html. The AVX10.1 work is correspondingly happening in .NET 9. While no official release date has been announced for APX, it is unlikely to happen in a timeframe that makes .NET 9 a good choice to target. |
Updated the Planned work with the current status. Marked completed work and moved items that will be pushed out to the Future Work section. |
Moved JCC erratum and Vector work to future work because we don't have time to work on them in .NET 9. |
@tommcdon (Debugger team) mentioned: "The debugger has logic to decode x64 instructions for the purposes of setting breakpoints and determining if there are instruction-relative read/writes/jumps. Whenever the JIT generates new instruction encodings that our logic does not understand, we need to regenerate the decoder. Fortunately, we have a tool that can automatically generate the instruction decoder logic: runtime/src/coreclr/debug/ee/amd64/gen_amd64InstrDecode/README.md in dotnet/runtime. Work was done in .NET 8 to support AVX512 EVEX instruction encodings: Support breakpoints on AVX-512 instructions, by BruceForstall, Pull Request #89705 in dotnet/runtime. It’s our current belief that the existing instruction decoder will correctly decode AVX10.1." |
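To illustrate what "instruction-relative" means in the quote above: in 64-bit mode, a memory operand is RIP-relative when the ModRM byte has mod = 00 and r/m = 101. The sketch below checks only that one field. The function name is hypothetical, and the real generated decoder also has to handle prefixes (including EVEX), opcode maps, and instruction lengths, none of which is modeled here.

```python
def is_rip_relative(modrm: int) -> bool:
    """Return True if a ModRM byte selects RIP-relative addressing
    (64-bit mode: mod == 00 and r/m == 101, followed by a 32-bit displacement)."""
    mod = (modrm >> 6) & 0b11  # top two bits
    rm = modrm & 0b111         # bottom three bits
    return mod == 0b00 and rm == 0b101


# ModRM bytes taken from three encodings of `mov rax, <memory>` (48 8B /r):
#   48 8B 05 <disp32>  ->  mov rax, [rip + disp32]
#   48 8B 03           ->  mov rax, [rbx]
#   48 8B 45 08        ->  mov rax, [rbp + 8]
print(is_rip_relative(0x05), is_rip_relative(0x03), is_rip_relative(0x45))
```

Only the first encoding is RIP-relative; the third uses the same r/m value but with mod = 01, which selects rbp plus an 8-bit displacement instead.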
This issue describes planned improvements to Intel architecture (x86, x64) ISA support for .NET 9.
In .NET 8, AVX-512 ISA support was added (see #77034). In .NET 9, this support will be improved further and used for better performance, especially through expanded use of AVX-512 in the libraries. Investigation and implementation work will also begin on the newly announced AVX10.
Libraries work
- System.Numerics.Tensors.TensorPrimitives class #89639
AVX10
AVX10 is a new set of vector ISA extensions, described here. We expect to begin preliminary work to support AVX10 in .NET 9, at least the parts that most directly map to the already supported AVX-512. An arch-avx10 GitHub label is defined to be added to all related PRs and issues: https://github.com/dotnet/runtime/labels/arch-avx10
- Avx10v1 to the runtime #99784
- AVX10 converged vector ISA #98069
  -- The current AVX-512 optimizations work on AVX10 targets.
  -- @tannergooding has identified a set of tests, and @khushal1996 has completed all stress tests with the expected results.
RyuJIT feature work
- uint32/uint64 to/from packed float/double #80829
- Vector128/256/512<T>: Finish Avx512 specific lightup for Vector128/256/512<T> #85207
  -- All done except for Vector512.Dot, which will be pushed out to a future item.
RyuJIT optimization work
- float/double to ulong. #89279
API design work
Future Work
Some of the work planned for .NET 9 has been pushed out to future work.
Libraries work
- Ascii.Utility methods with Vector512 code paths. #89280
- StoreAligned not Store in WidenAsciiToUtf16 #89892
AVX10
RyuJIT feature work
- Vector512.Dot: AVX-512 specific light-up for Vector512.Dot. Finish Avx512 specific lightup for Vector128/256/512<T> #85207
- Vector<T>: Vector<T> expanding to Vector512<T>, either automatically or opt-in. (@tannergooding plans to get back to it as a best effort.)
- JCC erratum
Debugging / diagnostics work (@BruceForstall)