Branch folding #5851
…s and remove references to not taken branches
…rameters to BlockCall args if there is a single pred and there is only one BlockCall in that inst that matches the block
This now cleans up the CFG and domtree as we find blocks that have become unreachable. Does not handle loops yet. This needed mutable access to the domtree inside egraph, which means AliasAnalysis could no longer hold a reference to the domtree. Changed AliasAnalysis to take the domtree as an argument in the functions that need it.
… it only has one pred
…ected to be unreachable (loops)
This function contains both BrIf and BrTable with a loop and optimizes it down to a constant return. function u0:0() -> i32 system_v { block1: block2(v3: i32): block3: block4: block5(v7: i32): block6: block7: block8: block9(v11: i32): [cranelift_codegen::isa::x64] disassembly: |
This is a fantastic illustration of how much better Cranelift could do with all these optimizations combined. Thank you for putting it together! The example you've given in the comment above is compelling: I love seeing all that code disappear.

I'm having trouble reviewing this all at once, though. I'm hoping we can split it into a series of smaller PRs. Individually they won't be nearly as effective, of course, but we need to be able to reason about the correctness of each change individually.

What I'd like to see first is just brif/br_table folding. I'd like to leave out the changes to block parameters and the changes for domtree reconstruction. Maybe that just means putting your first two commits in a new PR? I suspect that piece by itself will be uncontroversial and easy to merge, while already being a big improvement.

We'll need one more thing before merging it: some new small regression tests in

Once that's merged, let's come back to the rest of the changes in this PR!
I've written up a bunch of thoughts about our general requirements for branch folding in a pair of issues. These two issues describe essentially independent tasks, and both are complex enough that we need to solve them separately so we can have a chance at reviewing each of them. On top of that, a third independent issue may make the other two easier. So probably the best order to tackle these is:
The biggest challenge for all of this work, and for this PR as it stands now, is that we want the egraph pass to visit each basic block only once. In fact some fundamental invariants of our egraph implementation currently rely on this. Someday I want to relax that restriction and have the option of running equality saturation to a fixpoint, but we aren't there today. So we miss some optimization opportunities on loops, but we still find a lot of optimizations, and this restriction keeps the optimization pass fast.
I have implemented branch folding for constant-valued conditionals inside the egraph midend. This covers BrIf and BrTable: each is converted to a Jump instruction, the edges to the branches not taken are removed from the ControlFlowGraph, the DominatorTree is updated, and any code that becomes unreachable is detected and removed as well.
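To make the folding step concrete, here is a minimal sketch on a toy IR. It is not the code in this PR and does not use Cranelift's real types: `Operand`, `Terminator`, and `fold_terminator` are hypothetical names, blocks are plain `u32` ids, and block-call arguments are left out. It assumes BrIf takes the first target on a nonzero condition and that an out-of-range BrTable index goes to the default block.

```rust
/// Hypothetical stand-in for a branch condition or jump-table index.
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Const(i64),
    Unknown,
}

/// Toy block terminators; blocks are identified by plain integers.
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Jump(u32),
    BrIf { cond: Operand, then_block: u32, else_block: u32 },
    BrTable { index: Operand, table: Vec<u32>, default: u32 },
}

/// Folds a terminator whose operand is a known constant into a `Jump`.
/// Returns the new terminator plus the sorted, deduplicated list of
/// successors that are no longer targeted (their CFG edges can be removed).
fn fold_terminator(term: &Terminator) -> (Terminator, Vec<u32>) {
    match term {
        Terminator::BrIf { cond: Operand::Const(c), then_block, else_block } => {
            let (taken, dropped) = if *c != 0 {
                (*then_block, *else_block)
            } else {
                (*else_block, *then_block)
            };
            // If both arms target the same block, no edge disappears.
            let dead = if taken == dropped { Vec::new() } else { vec![dropped] };
            (Terminator::Jump(taken), dead)
        }
        Terminator::BrTable { index: Operand::Const(i), table, default } => {
            // Negative or out-of-range indices fall through to the default.
            let taken = if *i >= 0 && (*i as usize) < table.len() {
                table[*i as usize]
            } else {
                *default
            };
            let mut dead: Vec<u32> = table
                .iter()
                .copied()
                .chain(std::iter::once(*default))
                .filter(|b| *b != taken)
                .collect();
            dead.sort_unstable();
            dead.dedup();
            (Terminator::Jump(taken), dead)
        }
        // Non-constant operands and plain jumps are left untouched.
        other => (other.clone(), Vec::new()),
    }
}
```

The returned list of dropped successors is what a caller would use to remove edges from the control flow graph before checking for newly unreachable blocks.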
This required mutable access to the DominatorTree in egraph, so AliasAnalysis can no longer hold a reference to it; instead, the DominatorTree is passed to the AliasAnalysis functions that need it. As a side benefit, this removed the lifetimes that were sprinkled around just to keep a DominatorTree reference in AliasAnalysis.
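The ownership change can be illustrated with a small sketch. The types below are toy stand-ins, not Cranelift's actual `DominatorTree` or `AliasAnalysis`; the field and method names (`idom`, `last_store_block`, `store_reaches`, `run_pass`) are hypothetical. The point is only that an analysis struct with no stored `&DomTree` lets the pass mutate the tree between queries.

```rust
/// Toy dominator tree: `idom[b]` is the immediate dominator of block `b`,
/// or `None` for a root (or a block that has been detached).
struct DomTree {
    idom: Vec<Option<usize>>,
}

impl DomTree {
    /// Returns true if block `a` dominates block `b`.
    fn dominates(&self, a: usize, mut b: usize) -> bool {
        loop {
            if a == b {
                return true;
            }
            match self.idom[b] {
                Some(parent) => b = parent,
                None => return false,
            }
        }
    }
}

/// The analysis keeps no `&'a DomTree` field, so it needs no lifetime
/// parameter and does not keep the tree borrowed between calls.
struct AliasAnalysis {
    last_store_block: Option<usize>,
}

impl AliasAnalysis {
    /// The tree is borrowed only for the duration of this call.
    fn store_reaches(&self, domtree: &DomTree, block: usize) -> bool {
        self.last_store_block
            .map_or(false, |store| domtree.dominates(store, block))
    }
}

/// A pass that queries the analysis, mutates the tree, and queries again.
/// This would not borrow-check if `AliasAnalysis` stored a `&DomTree`.
fn run_pass(domtree: &mut DomTree, aa: &AliasAnalysis) -> (bool, bool) {
    let before = aa.store_reaches(domtree, 2);
    domtree.idom[2] = None; // e.g. block 2 was detached by branch folding
    let after = aa.store_reaches(domtree, 2);
    (before, after)
}
```

With a tree `0 -> 1 -> 2` and a store in block 1, `run_pass` reports that the store reaches block 2 before the mutation and not after it.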
A block is considered unreachable if it no longer has any predecessors, or if it has exactly one predecessor and is the head of a loop.
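A rough sketch of that rule on a toy CFG follows. The `Cfg` type, its fields, and the precomputed `loop_headers` set are hypothetical and are not Cranelift's `ControlFlowGraph` API. Two things are assumptions of the sketch rather than statements from this PR: the entry block is always treated as reachable, and a loop header left with a single predecessor is assumed to be kept alive only by its own back edge.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified CFG: blocks are plain integer ids.
struct Cfg {
    entry: u32,
    /// Predecessor lists, keyed by block.
    preds: HashMap<u32, Vec<u32>>,
    /// Blocks known to be loop headers (assumed to be precomputed).
    loop_headers: HashSet<u32>,
}

impl Cfg {
    /// Removes the edge `from -> to`, e.g. after a branch was folded.
    fn remove_edge(&mut self, from: u32, to: u32) {
        if let Some(preds) = self.preds.get_mut(&to) {
            preds.retain(|&b| b != from);
        }
    }

    /// Applies the rule: no predecessors, or exactly one predecessor
    /// on a loop header.
    fn is_unreachable(&self, block: u32) -> bool {
        if block == self.entry {
            // Assumption: the entry block has no predecessors but is live.
            return false;
        }
        let count = self.preds.get(&block).map_or(0, |p| p.len());
        count == 0 || (count == 1 && self.loop_headers.contains(&block))
    }
}
```

For example, if block 4 is a loop header with predecessors 1 (loop entry) and 5 (back edge), removing the edge `1 -> 4` leaves only the back edge, and `is_unreachable(4)` becomes true.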
After folding the branch and removing the predecessor/successor edges to dead code, each block is checked as it is processed: if the block has parameters and exactly one predecessor, the parameters are detached from the block and aliased to the arguments passed in that predecessor's BlockCall.
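The parameter rewrite can be sketched as follows, again on a toy model rather than Cranelift's data flow graph. `Value`, `Block`, `alias_params_to_args`, and `resolve` are hypothetical names, each incoming BlockCall is represented only by its argument list, and the alias map is assumed to be acyclic.

```rust
use std::collections::HashMap;

/// Toy SSA value id.
type Value = u32;

/// Toy block that only tracks its parameters.
struct Block {
    params: Vec<Value>,
}

/// If exactly one BlockCall targets the block, detach its parameters and
/// alias each one to the matching argument. Returns true if a rewrite
/// happened. `incoming_calls` holds the argument list of every BlockCall
/// that targets this block.
fn alias_params_to_args(
    block: &mut Block,
    incoming_calls: &[Vec<Value>],
    aliases: &mut HashMap<Value, Value>,
) -> bool {
    if incoming_calls.len() != 1 || block.params.is_empty() {
        return false;
    }
    let args = &incoming_calls[0];
    if args.len() != block.params.len() {
        return false; // malformed call; leave the block untouched
    }
    // `drain` detaches the parameters from the block as they are aliased.
    for (param, arg) in block.params.drain(..).zip(args.iter()) {
        aliases.insert(param, *arg);
    }
    true
}

/// Follows alias links until reaching a value that is not an alias.
fn resolve(aliases: &HashMap<Value, Value>, mut value: Value) -> Value {
    while let Some(&next) = aliases.get(&value) {
        value = next;
    }
    value
}
```

After the rewrite, any use of a former block parameter resolves to the argument the single predecessor passed, so a constant argument becomes visible to later folding in that block.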
I have chatted briefly with @cfallin while implementing this, and he suggested I create a PR to let people review the changes.