-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experiment - Remove space restriction from XarchIntrinsicConstants #102913
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
372 changes: 277 additions & 95 deletions
372
src/coreclr/tools/Common/Compiler/HardwareIntrinsicHelpers.cs
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are bit-tested at runtime using compiler generated code. This is going to be quite complicated to do without regressing perf on 32bit x86.
I already wrote this elsewhere - I would like to get an understanding why we're wasting bits in this enum - a lot of the bits are redundant, e.g. XArchIntrinsicConstants_Avx2 and XArchIntrinsicConstants_VectorT256. Why do we need this duplication? Is this putting RyuJIT implementation details ("VectorT256" that is not an ISA extension as far as I know) into PAL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's 3 bits (VectorT128, VectorT256, VectorT512). I would not call it a lot. Is there anything else?
Right, we do not need the
VectorT*
bits in the minipal. They got there by sequence of copy&paste operations of code that was in CoreCLR JIT/EE interface originally.We do need the
VectorT*
bits on JIT/EE interface (that is different enum). The vector size is modelled as virtual ISA there.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the VL suffixes that seem to just be shortcuts for XArchIntrinsicConstants_Avx512f_vl plus something else that already has a bit.
The question is basically whether we can get away with 32 bits until we're ready to call Sse41 and friends "baseline". The underlying oses are starting to do that on both Windows and Linux side. Or we'll need to do the more complicated thing and extend this to arbitrary lengths like this PR is doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not duplicate. The former describes support for
Avx2
the latter describes the size ofVector<T>
, these do not have to line up and it may be important to understand the encoded size ofVector<T>
for ABI purposes.There are some opportunities to fold together bits. For example,
AVX512*_VL
could just beAvx512vl
. Given that we requireAVX512F+BW+CD+DQ+VL
to be provided simultaneously, we could combine all of these into one as well.We could notably also not allow as fine-grained control as this currently is allowing. Rather than allowing individual opt-in to
SSE3, SSSE3, SSE4.1
, etc, we could force these to be present at the respective target baselines (i.e.x86-64-v2
,x86-64-v3
,x86-64-v4
) and only include the standalone ISAs that are independent of the primary baselinesHowever, all of that is less control than most other compilers toolchains give and we're likely to run out of bits sooner than we're able to meaningfully compress them and drop support for downlevel ISAs (although I am definitively still supportive of us moving towards an
x86-64-v2
baseline). So it's probably goodness that we take a look at this since we're already pushing up against the boundaries already.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are duplicate/undesirable at PAL level. PAL should be policy free. XArchIntrinsicConstants_... is PAL level enum.
They are not duplicate at the JIT/EE interface level where the Vector policy size policy lives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#102946
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we can use just a single bit for VL in this enum if we needed to free up more bits.