-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements for null check folding. #1735
Conversation
Framework diffs:
|
Benchmark diffs:
|
I measured throughput with pin on crossgen of System.Private.Corelib. It shows 0.02% regression, which is close to noise level of the measurements. |
Framework regressions are caused by different CSE and/or register allocation decisions after the morph add optimization and/or null check removal. In some cases the number of instructions stays the same but the code size increases because, e.g., we generate |
@dotnet/jit-contrib PTAL |
Example of an improvement diff: -cmp dword ptr [rsi], esi
-lea rdx, bword ptr [rsi+216]
-add rdx, 16
+lea rdx, bword ptr [rsi+232]
mov ebx, dword ptr [rdx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall.
There are a few places (box opts is one, there may be others) where we emit nullchecks as unconsumed indirs. I wonder if it Is worth looking for these -- or trying once more to fix up box opts to use GT_NULLCHECK.
@@ -465,7 +465,7 @@ struct BasicBlock : private LIR::Range | |||
|
|||
#define BBF_COMPACT_UPD \ | |||
(BBF_CHANGED | BBF_GC_SAFE_POINT | BBF_HAS_JMP | BBF_NEEDS_GCPOLL | BBF_HAS_IDX_LEN | BBF_BACKWARD_JUMP | \ | |||
BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ) | |||
BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ | BBF_HAS_NULLCHECK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious how you found these missing propagation bits... did you experiment with temporarily making optEarlyPropFor...
always return true? (if not, may be worth a try).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I just saw some null checks not getting removed in some diffs and tracked that down to the missing flags. Good idea to try with optDoEarlyPropFor...
returning always true. Will try tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added missing OMF_HAS_NULLCHECK
and BBF_HAS_NULLCHECK
in a couple of places but they didn't result in any new diffs. Changing optDoEarlyPropFor...
to always return true does result in diffs but they don't seem to be related to nullchecks. I will follow up outside of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently it was me who was forgetting to add these....
|
||
GenTree* Compiler::optFindNullCheckToFold(GenTree* tree, LocalNumberToNullCheckTreeMap* nullCheckMap) | ||
{ | ||
assert(tree->OperIsIndirOrArrLength()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if this would read better if it was converted to early return style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
{ | ||
// Check if we have the pattern above and find the nullcheck node if we do. | ||
offsetValue += addr->gtGetOp2()->AsIntConCommon()->IconValue(); | ||
addr = addr->gtGetOp1(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe early exit here for large offsets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided not to do that: large offsets are rare and it's better to check in the end when we have the full offset.
src/coreclr/src/jit/earlyprop.cpp
Outdated
// | ||
// Return Value: | ||
// True if GT_NULLCHECK can be folded into a node that is after tree is execution order, | ||
// True if nullcheck may be folded into a node that is after tree is execution order, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo "tree is execution"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
// Assignments to user locals are disallowed because they may be live in the handler. | ||
result = (lhs->OperGet() == GT_LCL_VAR) && lvaTable[lhs->AsLclVarCommon()->GetLclNum()].lvIsTemp && | ||
((tree->gtGetOp2()->gtFlags & GTF_ASG) == 0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we know which locals are live into handlers by this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I pushed a change to optCanMoveNullCheckPastTree that checks for lvLiveInOutOfHndlr and also fixes some issues with checking assignments. With this change I get additional improvements in frameworks with no new regressions:
Total bytes of diff: -7560 (-0.02% of base)
1538 total methods with Code Size differences (1526 improved, 12 regressed), 242942 unchanged.
and in benchmarks:
Total bytes of diff: -14 (-0.00% of base)
5 total methods with Code Size differences (5 improved, 0 regressed), 1888 unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since lvLiveInOutOfHndlr
is debug-only, I switched to checking lvVolatileHint
, which, despite the obscure name, checks exactly what's needed: vars live in handlers.
e19b4fd
to
d370840
Compare
The following generates a redundant null check private static object? JIT_ChkCastClassSpecial(void* toTypeHnd, object obj)
{
MethodTable* mt = RuntimeHelpers.GetMethodTable(obj); Just curious will this change help with that? (I think it might, but not completely sure). |
@VSadov can you paste the disassembly with the redundant null check you see in your example? |
|
Yes, I verified that the nullcheck in RuntimeHelpers:GetMethodTable(Object) is eliminated with these changes so it should be eliminated in the caller that inlines it as well. |
d370840
to
38e3c8b
Compare
I tried changing indirs to nullchecks in box opts and got some diffs that include regressions. There is some special logic that applies to indirs but doesn't apply to nullchecks. I will follow up outside of this PR. |
@AndyAyersMS I pushed several commits that address your feedback. PTAL. Current framework diff numbers:
Benchmark diffs:
|
38e3c8b
to
bc88444
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. This looks good.
@@ -465,7 +465,7 @@ struct BasicBlock : private LIR::Range | |||
|
|||
#define BBF_COMPACT_UPD \ | |||
(BBF_CHANGED | BBF_GC_SAFE_POINT | BBF_HAS_JMP | BBF_NEEDS_GCPOLL | BBF_HAS_IDX_LEN | BBF_BACKWARD_JUMP | \ | |||
BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ) | |||
BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ | BBF_HAS_NULLCHECK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently it was me who was forgetting to add these....
d7cf712
to
3f5b5f7
Compare
#1741 added another place in importer where we create a |
optFoldNullChecks attempts to remove GT_NULLCHECK nodes that are post-dominated by indirections on the same variable. These changes implement a number of improvements. 1. Recognize more patterns. Before these changes only the following pattern was recognized: x = comma(nullcheck(y), add(y, const1)) followed by indir(add(x, const2)) where const1 + const2 is sufficiently small. With these changes the following patterns are recognized: nullcheck(x) or x = comma(nullcheck(y), add(y, const1)) followed by indir(x) or indir(add(x, const2)) where const1 + const2 is sufficiently small. 2. Indirections now include GT_ARR_LENGTH nodes. 3. Morph has an optimization ((x+icon1)+icon2) => (x+(icon1+icon2)) These changes generalize it to handle commas: ((comma(y, x+icon1)+icon2) => comma(y, x+(icon1+icon2)) That exposes more trees to null check folding. 4. Fix a bug in flow transformations that could lose BBF_HAS_NULLCHECK flag on some basic blocks, which led to missing opportunities for null check folding. 5. Make safety checks in optCanMoveNullCheckPastTree (for trees between the nullcheck and the indirection) both more correct and less conservative. For example, we were not allowing any assignments if we were inside try; however, assignments to compiler temps are safe since they won't be visible in handlers. 5. Increase the maximum number of trees we check between GT_NULLCHECK and the indirection from 25 to 50. 7. Refactor the code and move pattern recognition and safety checks to helper methods. 8. Add missing BBF_HAS_NULLCHECK and OMF_HAS_NULLCHECK when we create GT_NULLCHECK nodes.
3f5b5f7
to
e5cfdd3
Compare
nullCheckTree->gtFlags |= GTF_ORDER_SIDEEFF; | ||
nullCheckTree->gtFlags |= GTF_IND_NONFAULTING; | ||
|
||
if (nullCheckParent != nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a change that added this if
to deal with cases when the GT_NULLCHECK
is the root node of the statement.
The only failing tests are eventpipe tests that are failing in other PRs as well. #1794 is attempting to fix them. The crossgen-comparison arm job was cancelled after an hour of waiting. |
After this change I see the following:
The redundant null check is gone. Thanks!! |
Thies file was added accidentally as part of dotnet#1735
Thies file was added accidentally as part of #1735
optFoldNullChecks attempts to remove GT_NULLCHECK nodes that are
post-dominated by indirections on the same variable. These changes
implement a number of improvements.
Recognize more patterns.
Before these changes only the following pattern was recognized:
x = comma(nullcheck(y), add(y, const1))
followed by
indir(add(x, const2))
where
const1 + const2
is sufficiently small.With these changes the following patterns are recognized:
nullcheck(x)
or
x = comma(nullcheck(y), add(y, const1))
followed by
indir(x)
or
indir(add(x, const2))
where
const1 + const2
is sufficiently small.Indirections now include GT_ARR_LENGTH nodes.
Morph has an optimization
((x+icon1)+icon2)
=>(x+(icon1+icon2))
These changes generalize it to handle commas:
((comma(y, x+icon1)+icon2)
=>comma(y, x+(icon1+icon2))
That exposes more trees to null check folding.
Fix a bug in flow transformations that could lose
BBF_HAS_NULLCHECK
flagon some basic blocks, which led to missing opportunities for null check folding.
Make safety checks in
optCanMoveNullCheckPastTree
(for trees between the
GT_NULLCHECK
and the indirection) both more correctand less conservative. For example, we were not allowing any assignments
if we were inside try; however, assignments to compiler temps are safe since
they won't be visible in handlers.
Increase the maximum number of trees we check between
GT_NULLCHECK
andthe indirection from 25 to 50.
Refactor the code and move pattern recognition and safety checks to
helper methods.
This addresses all relevant examples from https://github.com/dotnet/coreclr/issues/23903 .