Examples where CSE does not handle trees marked GTF_MAKE_CSE
#92170
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
FYI @dotnet/jit-contrib
Will defer working on this until after .NET 9, as we hope the ML heuristics effort can capture this.
#97042 (comment) has another example. IMO we should look at improving hoisting here to stop relying on CSE (and, in my opinion, invariance through VNs).
Hoisting will mark any trees that it hoists with `GTF_MAKE_CSE`. Oddly, CSE never looks for this flag. I was curious how many of these cases end up not getting hoisted because CSE does not play along.
In asp.net I see around 1850 cases total. Here's some analysis.
Constants (~1400)
These never even become CSE candidates. Hoisting invokes `optIsCSEcandidate` to screen things, but this generally allows all constants to be considered hoistable. CSE does additional checks and disables constant hoisting except on arm64. We should either stop hoisting these constants or make them candidates.
Hoisted tree is part of a bigger CSE
Here the tree [000742] that inspired hoisting is part of a bigger CSE, and the subtrees are never considered as candidates.
There is perhaps a tension deciding which is better, but it seems like we can actually do both: CSE the subtree,
then CSE the add sitting on top.
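
As a source-level illustration of "doing both", here is a minimal Python sketch. It is not JIT code: the function names and the expression `a * b + c` are made up for this example. `a * b` stands in for the hoisted invariant subtree, and the add on top of it stands in for the bigger CSE.

```python
def original(a, b, cs):
    """Loop body evaluates (a * b) + c twice per iteration."""
    total = 0
    for c in cs:
        total += (a * b + c) * (a * b + c)
    return total


def cse_both(a, b, cs):
    """Hoist the invariant subtree, then CSE the add sitting on top of it."""
    ab = a * b          # invariant subtree, computed once in the "preheader"
    total = 0
    for c in cs:
        s = ab + c      # the bigger expression, computed once per iteration
        total += s * s
    return total
```

Both functions compute the same value; the second one evaluates the multiply once per call and the add once per iteration.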
ValueNum is constant (~25)
The value number for the tree is constant, and CSE defers to Assertion Prop.
This may be pragmatic, as Assertion Prop has the logic to materialize the constant. But it may mean "expensive" constructed constants might not get hoisted or CSEd, so it seems like some rethinking here is warranted.
Hoist of a side-effecting tree (~260)
Here presumably we don't need to mark the hoisted tree with `GTF_MAKE_CSE`, as there is nothing CSE can do, and the preheader tree is not going to be removed. However, if the tree also had interesting and expensive subtrees, we might want to mark those to make sure we CSE them.
The current JIT code marks the entire subtree; I changed that for this experiment. Perhaps instead we could mark all subtrees that qualify under `optIsCSEcandidate`, or the "side effect free" leader subtrees.
Failed the CSE profitability heuristic
This may be reasonable. The use and def are in blocks with relative weight 0.84, so we're weighing the benefit of the
CSE versus the apparent cost of burning a callee save (since this is a live-across-call CSE).
On the other hand, if the loop iterates a bit more often than PGO data indicates, we will wish we had done the CSE.
This is an example with asymmetric upside/downside and we likely should be more aggressive in such cases.
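
To make the asymmetry concrete, here is a toy weighing of benefit against cost. This is not the JIT's actual profitability heuristic: the function, the per-use saving, and the callee-save cost are all hypothetical, and only the 0.84 block weight comes from the case above.

```python
def cse_net_benefit(block_weight, num_uses, saving_per_use, callee_save_cost):
    """Toy model: weighted savings at the uses minus a fixed register cost."""
    return block_weight * num_uses * saving_per_use - callee_save_cost


# Hypothetical costs: one use saving 2 units, a callee save costing 2 units.
at_pgo_weight = cse_net_benefit(0.84, 1, 2.0, 2.0)   # slightly negative
if_loop_hotter = cse_net_benefit(8.0, 1, 2.0, 2.0)   # clearly positive
```

Under these assumed numbers, the loss from doing the CSE when the PGO weight is accurate is bounded by the fixed cost, while the gain grows with the real iteration count.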
Other
I only looked at 10 or so methods, so there might be other explanations.
category:cq
theme:cse
skill-level:intermediate
cost:small
impact:small