[WIP] Test performance of running MIR inliner on inline(always) function calls when mir-opt-level=1 #110560

Closed

vlad20012
Member

It seems #105278 is stalled, so I'd like to perform several performance tests with different MIR inliner setups.

Let's start with just reading a callee MIR body without actually inlining it (in both incremental and non-incremental configurations).

@rustbot
Collaborator

rustbot commented Apr 19, 2023

r? @oli-obk

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 19, 2023
@rustbot
Collaborator

rustbot commented Apr 19, 2023

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@WaffleLapkin
Member

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 19, 2023
@bors
Contributor

bors commented Apr 19, 2023

⌛ Trying commit 796cafe with merge b47b7746514197f42ef70d9744fcbbea0256a508...

@bors
Contributor

bors commented Apr 19, 2023

☀️ Try build successful - checks-actions
Build commit: b47b7746514197f42ef70d9744fcbbea0256a508 (b47b7746514197f42ef70d9744fcbbea0256a508)

@rust-timer
Collaborator

Finished benchmarking commit (b47b7746514197f42ef70d9744fcbbea0256a508): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.9% | [0.2%, 45.1%] | 58 |
| Regressions ❌ (secondary) | 2.0% | [0.4%, 5.6%] | 9 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.9% | [0.2%, 45.1%] | 58 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.4% | [0.7%, 12.8%] | 12 |
| Regressions ❌ (secondary) | 4.4% | [2.5%, 5.8%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.9% | [-3.9%, -3.9%] | 1 |
| All ❌✅ (primary) | 5.4% | [0.7%, 12.8%] | 12 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 8.1% | [1.0%, 57.8%] | 27 |
| Regressions ❌ (secondary) | 5.2% | [3.1%, 6.8%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 8.1% | [1.0%, 57.8%] | 27 |

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Apr 20, 2023
@vlad20012 vlad20012 marked this pull request as draft April 20, 2023 08:32
@vlad20012
Member Author

vlad20012 commented Apr 20, 2023

⬆️ In that experiment, I enabled the Inlining pass for debug builds, but aborted every inlining attempt (in debug builds) right after the check_mir_body invocation, so this actually measures the infrastructure costs of inlining (i.e. calculating the call graph, reading MIR bodies, etc.) without doing the inlining itself. Note that this experiment doesn't distinguish between #[inline(always)] functions, #[inline] functions, and functions without an #[inline] attribute.

debug incr-patched - up to 45% regression, 7.71% mean
debug full/incr-full - up to 5% regression, 2% mean
debug incr-unchanged - up to 3% regression, 1.5% mean

Not so bad for such a stressful experiment!


Let's now repeat the experiment, but reject inline candidates without an #[inline(always)] attribute early.
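The difference between the two setups can be pictured with a small toy model. This is only a sketch, not rustc code: `InlineHint`, `Callee`, `bodies_read` and `sample_callees` are hypothetical names invented here, and "reading a body" is reduced to counting candidates. It shows how filtering on the attribute before touching the callee shrinks the amount of work.

```rust
/// Inline attribute on a callee (toy model, not a rustc type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InlineHint {
    /// No `#[inline]` attribute at all.
    NoAttr,
    /// `#[inline]`
    Hint,
    /// `#[inline(always)]`
    Always,
}

/// A call target as seen by the toy pass (hypothetical).
#[derive(Clone, Copy, Debug)]
pub struct Callee {
    pub name: &'static str,
    pub hint: InlineHint,
}

/// Counts how many callee bodies the toy pass would have to load.
/// With `always_only == false` every callee is a candidate (first experiment);
/// with `always_only == true` everything else is rejected before its body is touched.
pub fn bodies_read(callees: &[Callee], always_only: bool) -> usize {
    callees
        .iter()
        .filter(|c| !always_only || c.hint == InlineHint::Always)
        .count()
}

/// Small made-up call list used only for illustration.
pub fn sample_callees() -> Vec<Callee> {
    vec![
        Callee { name: "helper", hint: InlineHint::NoAttr },
        Callee { name: "getter", hint: InlineHint::Hint },
        Callee { name: "unwrap_fast", hint: InlineHint::Always },
    ]
}
```

In this model the first setup loads all three bodies and the second loads only one; the real cost difference in the compiler depends on the benchmark and is what the perf run measures.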

@vlad20012 vlad20012 force-pushed the mir-opt-1-inline-always-experiment branch from db67429 to 796cafe Compare April 20, 2023 11:25
@oli-obk
Contributor

oli-obk commented Apr 20, 2023

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 20, 2023
@bors
Contributor

bors commented Apr 20, 2023

⌛ Trying commit e65665f with merge f283ebe5c544c33161c94d8b60b2423af93b8148...

@bors
Contributor

bors commented Apr 20, 2023

☀️ Try build successful - checks-actions
Build commit: f283ebe5c544c33161c94d8b60b2423af93b8148 (f283ebe5c544c33161c94d8b60b2423af93b8148)

@rust-timer
Collaborator

Finished benchmarking commit (f283ebe5c544c33161c94d8b60b2423af93b8148): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.5%, 4.4%] | 15 |
| Regressions ❌ (secondary) | 1.5% | [0.3%, 2.6%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.5% | [-2.5%, -2.5%] | 1 |
| All ❌✅ (primary) | 1.1% | [0.5%, 4.4%] | 15 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.9% | [1.9%, 1.9%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.6% | [-2.3%, -0.4%] | 5 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -1.0% | [-2.3%, 1.9%] | 6 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.3% | [1.1%, 5.1%] | 8 |
| Regressions ❌ (secondary) | 3.5% | [3.4%, 3.5%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.3% | [1.1%, 5.1%] | 8 |

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 20, 2023
@vlad20012
Member Author

vlad20012 commented Apr 20, 2023

⬆️ In that experiment, I enabled the Inlining pass for debug builds, but considered for inlining only functions marked as #[inline(always)] and aborted every inlining attempt (in debug builds) right after the check_mir_body invocation, so this actually measures the infrastructure costs of inlining (i.e. calculating the call graph, reading MIR bodies, etc.) without doing the inlining itself.
Note that it differs from the previous experiment in that it considers only #[inline(always)] functions (while the previous experiment considered all functions).

debug incr-patched - up to 4.4% regression, 0.5% mean
debug full/incr-full - up to 2% regression, 0.4% mean
debug incr-unchanged - up to 1% regression, 0.3% mean

Wow, much better! These numbers are very inspiring.


Let's now restrict the inlining consideration to non-local functions only (i.e. to functions from other crates). If I understand it correctly, in this case we will not call the mir_callgraph_reachable query and hence will skip the call graph calculation. This should improve the numbers a bit more. Note that we still don't perform the inlining itself!
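The reasoning above can be sketched as a toy model. It is not the compiler's implementation: `Callee`, `Costs`, `consider` and `run` are hypothetical names, and the assumption that the call-graph reachability check is needed only for local callees is taken from the comment above, not verified here. The sketch shows how rejecting local callees up front means the expensive check is never reached.

```rust
/// Call target in the toy model (hypothetical, not a rustc type).
#[derive(Clone, Copy, Debug)]
pub struct Callee {
    pub is_local: bool,
    pub inline_always: bool,
}

/// Tallies of the work the toy pass performed.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Costs {
    pub callgraph_queries: usize,
    pub bodies_read: usize,
}

/// Decides whether a callee is considered and records the cost of doing so.
pub fn consider(callee: Callee, non_local_only: bool, costs: &mut Costs) {
    // Cheap attribute filter first.
    if !callee.inline_always {
        return;
    }
    if callee.is_local {
        // New rule: local callees are rejected before any expensive work.
        if non_local_only {
            return;
        }
        // Stand-in for the call-graph reachability check (assumed local-only).
        costs.callgraph_queries += 1;
    }
    // Stand-in for loading and checking the callee body.
    costs.bodies_read += 1;
}

/// Runs the toy pass over a list of callees and returns the accumulated costs.
pub fn run(callees: &[Callee], non_local_only: bool) -> Costs {
    let mut costs = Costs::default();
    for &callee in callees {
        consider(callee, non_local_only, &mut costs);
    }
    costs
}
```

With `non_local_only` set, `callgraph_queries` stays at zero for any input in this model, which is the saving the experiment is meant to measure.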

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 20, 2023
@bors
Contributor

bors commented Apr 20, 2023

⌛ Trying commit cd69787 with merge e5fd7c91e36b8c7f682f5c0fe631164c5d15e628...

@bors
Contributor

bors commented Apr 20, 2023

☀️ Try build successful - checks-actions
Build commit: e5fd7c91e36b8c7f682f5c0fe631164c5d15e628 (e5fd7c91e36b8c7f682f5c0fe631164c5d15e628)

@rust-timer
Collaborator

Finished benchmarking commit (e5fd7c91e36b8c7f682f5c0fe631164c5d15e628): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.4%, 1.2%] | 9 |
| Regressions ❌ (secondary) | 1.6% | [0.4%, 2.7%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [0.4%, 1.2%] | 9 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.7% | [-3.4%, -2.0%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.3% | [1.0%, 1.6%] | 3 |
| Regressions ❌ (secondary) | 2.9% | [2.9%, 2.9%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.3% | [1.0%, 1.6%] | 3 |

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 20, 2023
@vlad20012
Member Author

vlad20012 commented Apr 21, 2023

⬆️ In that experiment, I enabled the Inlining pass for debug builds, but considered for inlining only non-local functions marked as #[inline(always)] and aborted every inlining attempt (in debug builds) right after the check_mir_body invocation, so this actually measures the infrastructure costs of inlining (i.e. reading MIR bodies, etc.) without doing the inlining itself.
Note that it differs from the previous experiment in that it considers non-local functions only (i.e. functions from other crates).

debug incr-patched - up to 0.5% regression, 0.1% mean
debug full/incr-full - up to 3% regression, 0.4% mean
debug incr-unchanged - up to 0.5% regression, 0.1% mean

It looks like there are very few regressions now! The most promising part is that there is now almost no regression in the incr-patched/incr-unchanged scenarios, so it seems possible to consider enabling inlining even in incremental configurations.


Let's do some real inlining now! In the next experiment, I'm enabling inlining for debug builds (mir-opt-level=1) with all the previous restricting rules.
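For orientation, here is a minimal sketch of the kind of call these rules select. All names are made up for illustration, and the module `dep` only stands in for a separate dependency crate (a module in the same file is of course local in a real build). Under the restrictions described above, only the call to the `#[inline(always)]` function would be a candidate.

```rust
/// Stand-in for a dependency crate; in a real build this would be a separate crate.
pub mod dep {
    /// `#[inline(always)]`: the attribute the experiment filters on.
    #[inline(always)]
    pub fn clamp_to_byte(x: u32) -> u8 {
        if x > 255 { 255 } else { x as u8 }
    }

    /// Plain `#[inline]`: not a candidate under the experiment's rules.
    #[inline]
    pub fn double(x: u32) -> u32 {
        x.wrapping_mul(2)
    }
}

/// Caller in the "current" crate; its MIR is where the callee body would be inlined.
pub fn brightness(raw: u32) -> u8 {
    dep::clamp_to_byte(dep::double(raw))
}
```

Inlining never changes what `brightness` returns; it only affects compile time and the generated code, which is why the experiment is judged by perf runs and not by program output.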

@vlad20012 vlad20012 force-pushed the mir-opt-1-inline-always-experiment branch from 2db86ac to 43695a4 Compare April 21, 2023 09:12
@vlad20012 vlad20012 force-pushed the mir-opt-1-inline-always-experiment branch from 4078064 to 40cf2cb Compare April 21, 2023 12:48
@WaffleLapkin
Member

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 21, 2023
@WaffleLapkin WaffleLapkin reopened this Apr 21, 2023
@WaffleLapkin
Member

@bors try

@bors
Contributor

bors commented Apr 21, 2023

⌛ Trying commit 40cf2cb with merge 5142ecd1025428756f8bad85db21c70248982db7...

@bors
Contributor

bors commented Apr 21, 2023

☀️ Try build successful - checks-actions
Build commit: 5142ecd1025428756f8bad85db21c70248982db7 (5142ecd1025428756f8bad85db21c70248982db7)

@rust-timer
Collaborator

Finished benchmarking commit (5142ecd1025428756f8bad85db21c70248982db7): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.2%, 3.0%] | 21 |
| Regressions ❌ (secondary) | 1.2% | [0.4%, 2.6%] | 8 |
| Improvements ✅ (primary) | -0.7% | [-1.0%, -0.3%] | 5 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.6% | [-1.0%, 3.0%] | 26 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.4% | [0.1%, 2.4%] | 10 |
| Regressions ❌ (secondary) | 2.4% | [2.4%, 2.4%] | 1 |
| Improvements ✅ (primary) | -2.1% | [-2.1%, -2.1%] | 1 |
| Improvements ✅ (secondary) | -2.9% | [-2.9%, -2.9%] | 1 |
| All ❌✅ (primary) | 1.0% | [-2.1%, 2.4%] | 11 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.8% | [1.1%, 2.6%] | 8 |
| Regressions ❌ (secondary) | 2.6% | [2.6%, 2.6%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -6.9% | [-7.1%, -6.7%] | 3 |
| All ❌✅ (primary) | 1.8% | [1.1%, 2.6%] | 8 |

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 21, 2023
@vlad20012
Member Author

vlad20012 commented May 20, 2023

⬆️ In that experiment, I enabled the Inlining pass for debug builds (mir-opt-level=1), but considered for inlining only non-local functions (i.e. functions from other crates) marked as #[inline(always)].
Note that it differs from the previous experiment in that it actually performs the inlining.

debug full/incr-full: 1.3% regression in serde and hyper. The most regressed query is optimized_mir. Also, there is a performance win: -1% in webrender! The most affected query is LLVM_module_codegen_emit_obj.

debug incr-unchanged: 3% regression in hyper. The most regressed queries are generate_crate_metadata, optimized_mir and encode_query_results_for.

debug incr-patched: 2.6% regression in hyper. The most regressed queries are generate_crate_metadata, LLVM_module_codegen_emit_obj, encode_query_results_for, mir_for_ctfe and optimized_mir.

These results show that the case is not hopeless, and enabling inlining of #[inline(always)] functions is achievable, perhaps with more restrictions. But at the moment I want to try enabling inlining with similar restrictions in incremental optimized builds. I'll try it in a separate PR.

@vlad20012 vlad20012 closed this May 20, 2023