JIT: track memory loop dependence of trees during value numbering #55936
Leverage value numbering's alias analysis to annotate trees with the loop memory dependence of the tree's value number.

First, refactor the `mapStore` value number so that it also tracks the loop number where the store occurs. This is done via an extra non-value-num arg, so add appropriate bypasses to logic in the jit that expects to find only value number args. Also update the dumping to display the loop information.

Next, during VN computation, record loop memory dependence from `mapStores` with the tree currently being value numbered, whenever a value number comes from a particular map. There may be multiple such recording events per tree, so add logic on the recording side to track the most constraining dependence. Note value numbering happens in execution order, so there is an unambiguous current tree being value numbered.

This dependence info is tracked via a side map.

Finally, during hoisting, for each potentially hoistable tree, consult the side map to recover the loop memory dependence of the tree, and if that dependence is at or within the loop that we're hoisting from, block the hoist.

I've also absorbed the former class var (static field) hoisting exclusion into this new logic. This gives us slightly more relaxed dependence in some cases.

Resolves dotnet#54118.
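The recording and hoisting steps described above can be sketched roughly as follows. This is a minimal illustration, not the JIT's code: `LoopTable`, `LoopDependenceMap`, `Record`, `CanHoistOutOf`, and the integer `TreeId` are all hypothetical names, and the sketch assumes that "most constraining" means the most deeply nested loop, since a dependence on an inner loop blocks hoisting out of that loop and out of every loop enclosing it.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are the JIT's own.
constexpr int kNoLoop = -1;

struct Loop
{
    int parent; // index of the enclosing loop, or kNoLoop
};

struct LoopTable
{
    std::vector<Loop> loops;

    // True if `inner` is `outer` itself or is nested somewhere inside it.
    bool IsAtOrWithin(int inner, int outer) const
    {
        for (int l = inner; l != kNoLoop; l = loops[l].parent)
        {
            if (l == outer)
            {
                return true;
            }
        }
        return false;
    }
};

using TreeId = int;

// Side map from a tree to the loop whose stores its value number depends on.
struct LoopDependenceMap
{
    const LoopTable&                table;
    std::unordered_map<TreeId, int> dep;

    explicit LoopDependenceMap(const LoopTable& t) : table(t) {}

    // Called each time the value number of `tree` is found to come from a
    // store that occurs in `storeLoop`. Several calls per tree are expected.
    void Record(TreeId tree, int storeLoop)
    {
        if (storeLoop == kNoLoop)
        {
            return; // assumption: a store outside every loop constrains nothing
        }
        auto it = dep.find(tree);
        if (it == dep.end())
        {
            dep[tree] = storeLoop;
        }
        else if (table.IsAtOrWithin(storeLoop, it->second))
        {
            it->second = storeLoop; // new dependence is more deeply nested: keep it
        }
    }

    // Block the hoist if the recorded dependence is at or within `hoistLoop`.
    bool CanHoistOutOf(TreeId tree, int hoistLoop) const
    {
        auto it = dep.find(tree);
        if (it == dep.end())
        {
            return true;
        }
        return !table.IsAtOrWithin(it->second, hoistLoop);
    }
};
```

With loop 1 nested in loop 0, a tree whose only recorded dependence is on loop 0 may still be hoisted out of loop 1 but not out of loop 0; once a dependence on loop 1 is also recorded, hoisting out of either loop is blocked.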
@briansull @jakobbotsch PTAL
Passes the more elaborate test case from #54118 (added here). 5 methods with SPMI diffs (not counting the new test). Will say more about them in a follow-up note a bit later today.
benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
We previously moved a read past a write in a fully unrolled loop; now we don't.
x86 test failure seems unrelated; looks like an instance of #54469.
src/coreclr/jit/compiler.h
Outdated
// The map provides the entry block of the most closely enclosing loop that
// defines the memory region accessed when defining the nodes's VN.
//
// This information should consulted when considering hoisting node out of a loop, as the VN
should => should be
src/coreclr/jit/optimizer.cpp
Outdated
//
bool IsTreeLoopMemoryInvariant(GenTree* tree)
{
    if (tree->OperIsIndir() && ((tree->gtFlags & GTF_IND_INVARIANT) != 0))
You could add a comment, stating that we early out returning true for any GT_IND marked as Invariant
Actually, we probably don't need this bail-out, as we should be recording that indirs with invariant VNs are dependent on an invariant memory state. So I'll likely just delete this and verify we still get the same results.
src/coreclr/jit/optimizer.cpp
Outdated
        return true;
    }

    // Todo: other operators that read memory
If we are going to have a ToDo here, then I think that we have a convention on how we write ToDo's
This is a vestige from an earlier version where VN wasn't annotating all trees. But now it does.
So here I think we can just check all trees. However I believe calls are handled specially during hoisting, so we may still need an early bail-out for calls.
cc @dotnet/jit-contrib it might be interesting for more of you to look this one over, given our ongoing discussions of VN.
Looks Good
Forgot to mention that the revised version has the same set of SPMI diffs as the first version.
I just rechecked my examples with newest fixes from master and it looks like there are still some unhandled cases:
// Generated by Fuzzlyn v1.2 on 2021-07-22 13:37:14
// Seed: 14815563263006255362
// Reduced from 12.9 KiB to 0.4 KiB in 00:00:17
// Debug: Outputs 1
// Release: Outputs 0
public class Program
{
static short[] s_2;
public static void Main()
{
byte[] vr7 = new byte[]{0};
bool vr11 = default(bool);
for (int vr9 = 0; vr9 < 2; vr9++)
{
if (vr11)
{
s_2[0] = 0;
}
vr7[0] = 1;
byte vr10 = vr7[0];
System.Console.WriteLine(vr10);
}
}
}
Looks like they all involve some form of control flow before the pattern.
Thanks, will take a look.
Seems like we need to add similar loop dependence tracking to
int[] arr = { -1 };
ref int r = ref arr[0];
int val = -1;
for (int i = 0; i < 2; i++)
To test loop-dependent VN, these tests all depend on these loops remaining as loops. But Checked JIT loop unrolling under stress will probably fully unroll them. And what if we change the loop unrolling heuristics? Maybe the upper bound should be made an argument.
See #56184
@@ -2136,7 +2136,7 @@ ValueNum ValueNumStore::VNForFunc(
     assert(arg0VN == VNNormalValue(arg0VN));
     assert(arg1VN == VNNormalValue(arg1VN));
     assert(arg2VN == VNNormalValue(arg2VN));
-    assert(arg3VN == VNNormalValue(arg3VN));
+    assert((func == VNF_MapStore) || (arg3VN == VNNormalValue(arg3VN)));
Comment above: "// Note: Currently the only four operand func is the VNF_PtrToArrElem operation" is no longer true
Thanks, fixed in #56184.
Specify `Overwrite` when setting loop dependence map entries, as we may refine the initial result. Fixes #56174. Extract loop dependence of `VNF_PhiMemoryDef`. Fixes new case noted in #55936, and 13/16 or so other cases Jakob sent me privately. Also update a comment and fix tests to work better with jitstress per other notes on that PR.
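Why an explicit `Overwrite` matters when an entry may be refined can be illustrated with a small map wrapper. This is a sketch under assumptions: `StrictMap`, `SetKind`, `Set`, and `Lookup` are hypothetical names chosen for the example, and the only behavior modeled is that setting an existing key is treated as an error unless the caller asks to overwrite.

```cpp
#include <cassert>
#include <stdexcept>
#include <unordered_map>

// Hypothetical names for illustration only.
enum class SetKind
{
    None,
    Overwrite
};

// A map where re-setting an existing key is an error unless the caller
// explicitly requests an overwrite.
template <typename K, typename V>
class StrictMap
{
    std::unordered_map<K, V> m_map;

public:
    void Set(const K& key, const V& value, SetKind kind = SetKind::None)
    {
        auto it = m_map.find(key);
        if (it == m_map.end())
        {
            m_map.emplace(key, value);
            return;
        }
        if (kind != SetKind::Overwrite)
        {
            // The first recorded result would otherwise be replaced silently.
            throw std::logic_error("key already present");
        }
        it->second = value; // refine the earlier entry
    }

    bool Lookup(const K& key, V* value) const
    {
        auto it = m_map.find(key);
        if (it == m_map.end())
        {
            return false;
        }
        *value = it->second;
        return true;
    }
};
```

Because a tree's dependence can be recorded more than once and later tightened, the second `Set` for the same tree has to pass `SetKind::Overwrite`; with the default it would be rejected.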
…56436) If a loop is removed (because of unrolling) then the loop dependence tracking introduced in #55936 and #56184 may not properly update. So when a loop is removed, walk up the chain of parent loops looking for one that is not removed, and record the dependence on that parent. Addresses last part of #54118.
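The parent-chain walk described here can be sketched as below. The `LoopDesc` layout, the `removed` flag, and the `NearestLiveLoop` helper are assumptions made for the example, not the JIT's actual loop table.

```cpp
#include <cassert>
#include <vector>

// Hypothetical loop descriptor for illustration only.
constexpr int kNoLoop = -1;

struct LoopDesc
{
    int  parent;  // index of the enclosing loop, or kNoLoop
    bool removed; // true once the loop has been removed, e.g. by full unrolling
};

// Starting at `loop`, walk up the chain of parent loops and return the first
// one that has not been removed, or kNoLoop if every loop on the chain is gone.
int NearestLiveLoop(const std::vector<LoopDesc>& loops, int loop)
{
    while ((loop != kNoLoop) && loops[loop].removed)
    {
        loop = loops[loop].parent;
    }
    return loop;
}
```

A dependence that would have been recorded against a removed loop is recorded against the result of `NearestLiveLoop` instead, so a later hoisting query sees a loop that still exists.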