-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Use reaching definitions in CSE to update conservative VNs #109959
Conversation
Now that CSE always inserts into SSA we can update it to make use of the reaching definition information that it has access to. CSE already spent effort to track some extra information to try to do this, which we can remove. - Remove `optCSECheckedBoundMap`: this was used by CSE to try to update conservative VNs of ancestor bounds checks. This is unnececssary since all descendants of the CSE definitions should get the same conservative VNs automatically now. - Remove `CSEdsc::defConservNormVN`; this was used to update conservative VNs in the single-def case, which again is unnecessary now Making this change requires a bit of refactoring to the incremental SSA builder. Before this PR the builder takes all defs and all uses and then inserts everything into SSA. After this change the builder is used in a multi-step process as follows: 1. All definitions are added with `IncrementalSsaBuilder::InsertDef` 2. The definitions are finalized with `IncrementalSsaBuilder::FinalizeDefs` 3. Uses are inserted (one by one) with `IncrementalSsaBuilder::InsertUse`. No finalization are necessary; each use is directly put into SSA as a result of calling this method. The refactoring allows CSE to use the incremental SSA builder such that it can access reaching definition information for each of the uses as part of making replacements. However, this still requires some refactoring such that CSE performs replacements of all defs before performing the replacements of all uses. Additionally, this PR fixes various incorrect VN updating made by CSE. VN and CSE still track VNs that are interesting bounds check. However, VN was sometimes inserting VNs with exception sets into the set, which is not useful (the consumers always use normal VNs when querying the set). This PR fixes VN to insert the normal VN instead. Fix dotnet#109745
if (isSharedConst) | ||
// Assign the proper Value Numbers. | ||
ValueNumPair valExc = m_pCompiler->vnStore->VNPExceptionSet(val->gtVNPair); | ||
store->gtVNPair = m_pCompiler->vnStore->VNPWithExc(ValueNumStore::VNPForVoid(), valExc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was assigning ValueNumStore::VNPForVoid
before, without any exceptions. That seems wrong.
ValueNum lengthVN = | ||
vnStore->VNNormalValue(tree->AsBoundsChk()->GetArrayLength()->gtVNPair.GetConservative()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All consumers of this info query the normal conservative VN, so inserting one with exceptions is uninteresting. I hit a few diffs related to this.
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
cc @dotnet/jit-contrib PTAL @AndyAyersMS Diffs. Some minor CQ and TP improvements. The test failure looks like an infra issue -- the failing workitem was dead-lettered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you ever collect stats on how often we bail out putting defs into SSA because of IDF size?
Here's some histograms for IDF size we see in the incremental builder for win-x64 collections: aspnet
IDF size
<= 1 ===> 779 count ( 15% of total)
2 .. 2 ===> 780 count ( 30% of total)
3 .. 3 ===> 639 count ( 43% of total)
4 .. 4 ===> 608 count ( 55% of total)
5 .. 5 ===> 380 count ( 63% of total)
6 .. 10 ===> 1366 count ( 90% of total)
11 .. 20 ===> 386 count ( 97% of total)
21 .. 30 ===> 95 count ( 99% of total)
31 .. 40 ===> 9 count ( 99% of total)
41 .. 50 ===> 4 count (100% of total)
51 .. 100 ===> 0 count (100% of total)
101 .. 200 ===> 0 count (100% of total)
201 .. 300 ===> 0 count (100% of total)
301 .. 400 ===> 0 count (100% of total)
401 .. 500 ===> 0 count (100% of total)
benchmarks.run_pgo
IDF size
<= 1 ===> 298 count ( 12% of total)
2 .. 2 ===> 531 count ( 35% of total)
3 .. 3 ===> 200 count ( 43% of total)
4 .. 4 ===> 243 count ( 54% of total)
5 .. 5 ===> 170 count ( 61% of total)
6 .. 10 ===> 770 count ( 94% of total)
11 .. 20 ===> 114 count ( 99% of total)
21 .. 30 ===> 8 count ( 99% of total)
31 .. 40 ===> 0 count ( 99% of total)
41 .. 50 ===> 7 count ( 99% of total)
51 .. 100 ===> 1 count (100% of total)
101 .. 200 ===> 0 count (100% of total)
201 .. 300 ===> 0 count (100% of total)
301 .. 400 ===> 0 count (100% of total)
401 .. 500 ===> 0 count (100% of total)
libraries_tests.run:
IDF size
<= 1 ===> 4340 count ( 18% of total)
2 .. 2 ===> 3915 count ( 35% of total)
3 .. 3 ===> 3542 count ( 50% of total)
4 .. 4 ===> 3435 count ( 64% of total)
5 .. 5 ===> 1806 count ( 72% of total)
6 .. 10 ===> 5126 count ( 94% of total)
11 .. 20 ===> 1231 count ( 99% of total)
21 .. 30 ===> 151 count ( 99% of total)
31 .. 40 ===> 20 count ( 99% of total)
41 .. 50 ===> 5 count ( 99% of total)
51 .. 100 ===> 2 count (100% of total)
101 .. 200 ===> 0 count (100% of total)
201 .. 300 ===> 0 count (100% of total)
301 .. 400 ===> 0 count (100% of total)
401 .. 500 ===> 0 count (100% of total)
realworld:
IDF size
<= 1 ===> 449 count ( 22% of total)
2 .. 2 ===> 366 count ( 41% of total)
3 .. 3 ===> 341 count ( 58% of total)
4 .. 4 ===> 225 count ( 70% of total)
5 .. 5 ===> 178 count ( 79% of total)
6 .. 10 ===> 305 count ( 95% of total)
11 .. 20 ===> 81 count ( 99% of total)
21 .. 30 ===> 13 count ( 99% of total)
31 .. 40 ===> 1 count ( 99% of total)
41 .. 50 ===> 1 count (100% of total)
51 .. 100 ===> 0 count (100% of total)
101 .. 200 ===> 0 count (100% of total)
201 .. 300 ===> 0 count (100% of total)
301 .. 400 ===> 0 count (100% of total)
401 .. 500 ===> 0 count (100% of total) So looks like none of these collections have any examples that fail SSA insertion. |
Some examples of functions with large (>= 40) IDF sizes for some CSEs. https://github.com/PowerShell/PowerShell/blob/7ca7aae1d13d19e38c7c26260758f474cb9bef7f/src/System.Management.Automation/engine/Modules/ModuleCmdletBase.cs#L1513-L3580 (a 2000 line function) https://github.com/dotnet/performance/blob/4894b54b3e86637b040ba6c7c54c23524cfbeadd/src/benchmarks/micro/runtime/Bytemark/assign_rect.cs#L438-L540 (the OSR version of this) runtime/src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf8Utility.Transcoding.cs Lines 835 to 1522 in 6717d71
This one has a CSE with 27 definitions and with no uses in reachable blocks. I wonder if it would be worth it to compute the DFS tree. It seems like the heuristic is not accounting for reachability of the uses (probably the better solution is to aggressively trim the blocks earlier) |
Now that CSE always inserts into SSA we can update it to make use of the reaching definition information that it has access to. CSE already spent effort to track some extra information to try to do this, which we can remove.
optCSECheckedBoundMap
: this was used by CSE to try to update conservative VNs of ancestor bounds checks. This is unnececssary since all descendants of the CSE definitions should get the same conservative VNs automatically now.CSEdsc::defConservNormVN
; this was used to update conservative VNs in the case where all defs agree on the conservative VN, which again is unnecessary nowMaking this change requires a bit of refactoring to the incremental SSA builder. Before this PR the builder takes all defs and all uses and then inserts everything into SSA. After this change the builder is used in a multi-step process as follows:
IncrementalSsaBuilder::InsertDef
IncrementalSsaBuilder::FinalizeDefs
IncrementalSsaBuilder::InsertUse
. No finalization is necessary; each use is directly put into SSA as a result of calling this method.The refactoring allows CSE to use the incremental SSA builder such that it can access reaching definition information for each of the uses as part of making replacements. However, this still requires some refactoring such that CSE performs replacements of all defs before performing the replacements of all uses.
Additionally, this PR fixes various incorrect VN updating made by CSE.
VN and CSE still track VNs that are interesting bounds check. However, VN was sometimes inserting VNs with exception sets into the set, which is not useful (the consumers always use normal VNs when querying the set). This PR fixes VN to insert the normal VN instead.
Fix #109745