MIR-OPT: Pass to deduplicate blocks #77551
Conversation
r? @oli-obk (rust_highfive has picked a reviewer for you, use r? to override)
Force-pushed from bca4c75 to 6d10adc
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 6d10adca99e598a96de8dc438de2b2bbfe974fc7 with merge e8bab90e8bb7eaee4e142e99267ec4dacdfd3985...
☀️ Try build successful - checks-actions, checks-azure
Queued e8bab90e8bb7eaee4e142e99267ec4dacdfd3985 with parent 4ccf5f7, future comparison URL.
Finished benchmarking try commit (e8bab90e8bb7eaee4e142e99267ec4dacdfd3985): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below. Importantly, though, if the results of this run are non-neutral, do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
Performance does not look good. I'll set up a profiler in the coming days to see if it can be improved.
fn rvalue_eq(lhs: &Rvalue<'tcx>, rhs: &Rvalue<'tcx>) -> bool {
    let res = match (lhs, rhs) {
        (
            Rvalue::Use(Operand::Constant(box Constant { user_ty: _, literal, span: _ })),
The only reason for not using `==` is to avoid comparing spans. See #77549 (comment)
Should we have a `compare_without_user_info`? Not comparing spans is an issue that most optimization passes would like to solve, I guess.
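For readers skimming the thread, here is a minimal standalone sketch of what such a span-insensitive comparison looks like. The types below (`Span`, `Literal`, `Constant`, `Rvalue`) are hypothetical simplified stand-ins, not the real `rustc_middle::mir` types; the only point is that the comparison deliberately skips `span` and `user_ty` and compares `literal`:

```rust
// Simplified stand-ins for illustration only (not the rustc API).
#[derive(Debug)]
struct Span(u32, u32);

#[derive(Debug, PartialEq)]
enum Literal {
    Int(i64),
}

#[derive(Debug)]
struct Constant {
    span: Span,           // deliberately ignored by the comparison below
    user_ty: Option<u32>, // deliberately ignored as well
    literal: Literal,
}

#[derive(Debug)]
enum Rvalue {
    Use(Constant),
}

/// Semantic equality that skips `span` and `user_ty`, so two constants
/// written at different source locations still compare equal.
fn rvalue_eq(lhs: &Rvalue, rhs: &Rvalue) -> bool {
    match (lhs, rhs) {
        (Rvalue::Use(a), Rvalue::Use(b)) => a.literal == b.literal,
    }
}

fn main() {
    let a = Rvalue::Use(Constant { span: Span(0, 4), user_ty: None, literal: Literal::Int(1) });
    let b = Rvalue::Use(Constant { span: Span(9, 13), user_ty: None, literal: Literal::Int(1) });
    assert!(rvalue_eq(&a, &b)); // equal despite different spans
}
```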
This is a ridiculous footgun. I'll open an issue to discuss this.
Performance should be quite a bit better with the latest commit. Unnecessary tuple combinations are still being created though. I'll see if this can be removed.
Unnecessary tuple combinations should be gone now. Can I get a perf run again?
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 44b87ce06d5582a00b5b21688834e8d03f7b5600 with merge 50438f458aff5610389abb267c5b5b89fbb68b04...
☀️ Try build successful - checks-actions, checks-azure
Queued 50438f458aff5610389abb267c5b5b89fbb68b04 with parent a1dfd24, future comparison URL.
Finished benchmarking try commit (50438f458aff5610389abb267c5b5b89fbb68b04): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below. Importantly, though, if the results of this run are non-neutral, do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
Based on the perf results, there is absolutely still room for improvement. I'll see what can be done after some profiling. The regressions in optimized_mir make sense, as we are doing more work (and maybe too much), but it's not intuitive to me why the LLVM passes seem to regress. Intuitively, when we hand fewer basic blocks to LLVM, it should have less work to do. Any ideas? I guess we can verify that we are indeed handing simpler IR to LLVM using cargo-llvm-lines. If it turns out we are not, the problem could be that deduplicating blocks in MIR confuses some of the existing MIR opt passes.
You can basically ignore these LLVM regressions, as they are very small and only during incremental compilation afaict. They happen because the number of codegen units changes when a function gets optimized too well in MIR. One thing that I found curious is that the CTFE stress tests are regressing during MIR interpretation. Shouldn't we get less code and thus do less work? About the
debug!("tuple_iter: {:?}", tuple_iter);

let statementkinds1 = body.basic_blocks()[item1].statements.iter().map(|x| &x.kind);
let statementkinds2 = body.basic_blocks()[item2].statements.iter().map(|x| &x.kind);
Perhaps you can gain some performance by moving these two lines out of the while loop, so we're not evaluating them many times?
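A rough sketch of that suggestion, using hypothetical simplified types rather than the real `Body`/`BasicBlockData` API: collect each candidate block's statement kinds once, before the pairwise loop, and compare the precollected slices inside it.

```rust
// Simplified stand-in types for illustration only.
#[derive(Clone, PartialEq, Debug)]
enum StatementKind {
    Assign(u32, i64),
    Nop,
}

struct BasicBlock {
    statements: Vec<StatementKind>,
}

// `group` holds the indices of blocks that ended up in the same
// (statement count, terminator kind) group.
fn find_duplicate_pairs(blocks: &[BasicBlock], group: &[usize]) -> Vec<(usize, usize)> {
    // Hoisted out of the pairwise loop: one slice of statement kinds per block.
    let kinds: Vec<&[StatementKind]> =
        group.iter().map(|&b| blocks[b].statements.as_slice()).collect();

    let mut duplicates = Vec::new();
    for i in 0..group.len() {
        for j in (i + 1)..group.len() {
            // Slice comparison instead of rebuilding iterators on every iteration.
            if kinds[i] == kinds[j] {
                duplicates.push((group[i], group[j]));
            }
        }
    }
    duplicates
}

fn main() {
    let blocks = vec![
        BasicBlock { statements: vec![StatementKind::Assign(0, 1), StatementKind::Nop] },
        BasicBlock { statements: vec![StatementKind::Assign(0, 1), StatementKind::Nop] },
        BasicBlock { statements: vec![StatementKind::Assign(1, 2)] },
    ];
    println!("{:?}", find_duplicate_pairs(&blocks, &[0, 1, 2])); // [(0, 1)]
}
```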
//    - This is needed for itertools::group_by which only assigns consecutive elements to the same group
// 2. Group by (length of statements, TerminatorKind)
// 3. Compare StatementKinds pairwise inside the group
//    - This is technically O(n²), but `n` should be small in most cases
If the reason why the `unicode_normalization` benchmark is slow is that `n` is actually large, I wonder if we know why that is? Maybe a large table of constants or something?
let mut tuple_iter = TupleCombinationsWithSkip::new(group);
debug!("tuple_iter: {:?}", tuple_iter);
while let Some((item1, item2)) = tuple_iter.next() {
Surely the most common case is `(len statements, terminator kind)` groups of size one, right? Could it be beneficial to filter them out early, since there obviously can't be any duplicates?
Yep, I have added filtering for group size > 1 in the latest commits.
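To make the scheme in the quoted comments concrete (sort so equal keys are consecutive, group by (statement count, terminator kind), skip groups of size one, then compare pairwise), here is a standalone sketch with hypothetical simplified types. The real pass works on MIR `BasicBlockData` and uses `itertools::group_by`; this sketch does the consecutive grouping by hand to stay dependency-free:

```rust
// Simplified stand-in types for illustration only.
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum TerminatorKind {
    Return,
    Goto(usize),
}

struct BasicBlock {
    statements: Vec<String>, // stand-in for StatementKind
    terminator: TerminatorKind,
}

fn duplicate_pairs(blocks: &[BasicBlock]) -> Vec<(usize, usize)> {
    // 1. Sort block indices by (statement count, terminator kind) so equal
    //    keys are consecutive (the reason the pass sorts before group_by).
    let mut indices: Vec<usize> = (0..blocks.len()).collect();
    let key = |i: usize| (blocks[i].statements.len(), blocks[i].terminator.clone());
    indices.sort_by_key(|&i| key(i));

    let mut pairs = Vec::new();
    let mut start = 0;
    while start < indices.len() {
        // 2. Find the run of indices sharing the same key.
        let mut end = start + 1;
        while end < indices.len() && key(indices[end]) == key(indices[start]) {
            end += 1;
        }
        let group = &indices[start..end];
        // 3. Skip singleton groups: they cannot contain duplicates.
        if group.len() > 1 {
            // 4. O(n^2) pairwise comparison inside the (usually small) group.
            for a in 0..group.len() {
                for b in (a + 1)..group.len() {
                    if blocks[group[a]].statements == blocks[group[b]].statements {
                        pairs.push((group[a], group[b]));
                    }
                }
            }
        }
        start = end;
    }
    pairs
}

fn main() {
    let blocks = vec![
        BasicBlock { statements: vec!["_1 = const 1".into()], terminator: TerminatorKind::Return },
        BasicBlock { statements: vec!["_1 = const 1".into()], terminator: TerminatorKind::Return },
        BasicBlock { statements: vec!["_2 = const 2".into()], terminator: TerminatorKind::Goto(0) },
    ];
    println!("{:?}", duplicate_pairs(&blocks)); // [(0, 1)]
}
```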
The pass should be a lot faster with the latest changes. Can I get another perf run?
We discussed it at the compiler team meeting. A few points came up:
Yeah, I agree.
This would be interesting to check, yeah. The runtime benchmarks would have to be cherry-picked, though, to actually have duplicated blocks. The rustc-perf Webrender benchmark triggered the deduplication (it removed ~2% of statements last I tried), but I don't see Webrender having any benchmarks that can easily be run. Any ideas?
Ignoring spans and debug info would be a great improvement to finding duplicates for sure. I'm not sure how to do deduplication without throwing some span away. How would the pass know which of multiple spans to preserve?
Until clear compile-time (or runtime) improvements are seen, yeah, it should. I guess the next step would be to figure out how to ignore spans while not throwing information away. What is a Span used for if a local is not used for debuginfo? Constants, for instance, seem to get different spans quite often. See https://godbolt.org/z/YK6j8W .
I had a similar issue with MIR inlining. All the spans were pointing into the inlined MIR body, and all I could do was rewrite them to point at the callsite where it was inlined. Neither solution was good, so I implemented another expansion variant. Look at the ui test diff (all the way at the bottom, if the GitHub link doesn't work) to see how such a span could be rendered. I think in this situation, since there is no hierarchy of spans, I would create a span that points to whatever the two merged spans have in common, and if they have nothing in common, make an arbitrary choice for the root span. Then the two spans can be put into an
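A conceptual sketch of that merging idea with a toy byte-range span (the real `Span`/expansion machinery in rustc is considerably more involved, so treat this purely as an illustration of "keep the common part if there is one, otherwise pick arbitrarily, and remember both originals"):

```rust
// Toy span: a byte range in some source file. Not the rustc Span type.
#[derive(Clone, Copy, Debug)]
struct Span {
    lo: u32,
    hi: u32,
}

#[derive(Debug)]
struct MergedSpan {
    root: Span,           // what diagnostics would point at by default
    originals: [Span; 2], // the spans of the two deduplicated blocks
}

fn merge_spans(a: Span, b: Span) -> MergedSpan {
    let lo = a.lo.max(b.lo);
    let hi = a.hi.min(b.hi);
    let root = if lo < hi {
        Span { lo, hi } // the part the two spans have in common
    } else {
        a // no overlap: make an arbitrary choice for the root
    };
    MergedSpan { root, originals: [a, b] }
}

fn main() {
    let overlapping = merge_spans(Span { lo: 10, hi: 30 }, Span { lo: 20, hi: 40 });
    let disjoint = merge_spans(Span { lo: 0, hi: 5 }, Span { lo: 50, hi: 60 });
    println!("{:?}", overlapping); // root covers 20..30
    println!("{:?}", disjoint);    // root is 0..5 (arbitrary choice)
}
```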
So, for the quickest way forward, let's put it behind mir-opt-level 3 and merge it, then figure out the span and source_info handling (which we should probably do for more optimizations anyway).
Ping from triage: @simonvandel can you post your status on this PR and resolve the merge conflicts? Thank you |
Force-pushed from 97a4b4b to 149badd
Force-pushed from 149badd to bd989b9
Force-pushed from bd989b9 to 2d1e0ad
@oli-obk rebased on master, and gated the deduplication behind mir-opt-level >= 3
@bors r+ rollup=never
📌 Commit 2d1e0ad has been approved by |
☀️ Test successful - checks-actions
This pass finds basic blocks that are completely equal, and replaces all uses with just one of them.
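For context, a standalone sketch of the overall shape of such a pass, with a hypothetical simplified CFG rather than the real MIR types: identical blocks are mapped to one representative, and every edge that targeted a duplicate is redirected to it (the now-unreachable duplicates would then be cleaned up by a later SimplifyCfg-style pass):

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Simplified stand-in for a basic block; in real MIR the successor
// indices live in the terminator rather than in a plain Vec.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct Block {
    statements: Vec<String>,
    targets: Vec<usize>, // successor block indices
}

fn deduplicate(blocks: &mut [Block]) {
    // First occurrence of each distinct block body becomes the representative.
    let mut canonical: HashMap<Block, usize> = HashMap::new();
    // replacement[i] = index of the block to use instead of block i.
    let mut replacement: Vec<usize> = (0..blocks.len()).collect();

    for idx in 0..blocks.len() {
        match canonical.entry(blocks[idx].clone()) {
            Entry::Occupied(entry) => replacement[idx] = *entry.get(),
            Entry::Vacant(entry) => {
                entry.insert(idx);
            }
        }
    }

    // Redirect every edge that pointed at a duplicate to the representative.
    for block in blocks.iter_mut() {
        for target in &mut block.targets {
            *target = replacement[*target];
        }
    }
}

fn main() {
    let mut blocks = vec![
        Block { statements: vec!["_0 = const 1".into()], targets: vec![] },
        Block { statements: vec!["_0 = const 1".into()], targets: vec![] }, // duplicate of block 0
        Block { statements: vec!["switch".into()], targets: vec![0, 1] },
    ];
    deduplicate(&mut blocks);
    assert_eq!(blocks[2].targets, vec![0, 0]); // both edges now point at block 0
    println!("{:?}", blocks[2].targets);
}
```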