Use dynamic dispatch for queries #108638

Zoxc · 2023-03-02T04:18:47Z

This replaces most concrete query values V with MaybeUninit<[u8; { size_of::<V>() }]>, reducing the amount of code instantiated by queries. The compile time of rustc_query_impl is reduced by 27%. It is an alternative to #107937, which uses unstable const generics, whereas this PR uses an EraseType trait that maps query values to their erased variants.

This is achieved by introducing an Erased type which performs sanity checks under cfg(debug_assertions). The query caches get instantiated with these erased types, leaving the code in rustc_query_system unaware of the erasure. rustc_query_system is changed to use instances of QueryConfig so that rustc_query_impl can pass in DynamicConfig, which holds a pointer to a virtual table.
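The erasure idea described above can be sketched in a few lines. This is a minimal illustration, not the code from this PR: the trait shape, the `impl_erase!` macro, and the `erase`/`restore`/`first` helpers are assumptions made for the example. It shows how a trait can map each value type to a same-sized byte array, so that code generic over the erased type is shared by all values of that size.

```rust
use std::mem::{size_of, transmute_copy, MaybeUninit};

/// Hypothetical stand-in for an `EraseType`-style trait.
///
/// # Safety
/// `Result` must have exactly the same size as `Self`.
pub unsafe trait EraseType: Copy {
    type Result: Copy;
}

/// Opaque storage. `MaybeUninit` is used because padding bytes of the
/// original value may be uninitialized.
#[derive(Copy, Clone)]
pub struct Erased<T: Copy> {
    data: MaybeUninit<T>,
}

pub type Erase<T> = Erased<<T as EraseType>::Result>;

/// Implements the trait for concrete types, where `size_of` can be
/// evaluated without unstable const generics.
macro_rules! impl_erase {
    ($($ty:ty),* $(,)?) => {
        $(unsafe impl EraseType for $ty {
            type Result = [u8; size_of::<$ty>()];
        })*
    };
}
impl_erase!(u32, u64, f64, Option<u32>);

pub fn erase<T: EraseType>(src: T) -> Erase<T> {
    // Sanity check that is only compiled into debug builds.
    debug_assert_eq!(size_of::<T>(), size_of::<T::Result>());
    // SAFETY: the sizes match per the trait contract.
    Erased { data: unsafe { transmute_copy(&src) } }
}

/// # Safety
/// `value` must have been produced by `erase::<T>` with the same `T`.
pub unsafe fn restore<T: EraseType>(value: Erase<T>) -> T {
    // SAFETY: guaranteed by the caller; `transmute_copy` copes with the
    // lower alignment of the byte buffer.
    unsafe { transmute_copy(&value.data) }
}

/// Generic only over the erased type: `u64` and `f64` both erase to
/// `Erased<[u8; 8]>`, so both share one instantiation of this function.
pub fn first<E: Copy>(cache: &[E]) -> Option<E> {
    cache.first().copied()
}

/// Compiles only because the two erased types are the same type.
pub fn same_erased_type(e: Erase<u64>) -> Erase<f64> {
    e
}

pub fn roundtrip_u64_via_shared_code(v: u64) -> Option<u64> {
    // SAFETY: the value was erased from a `u64` on this very line.
    first(&[erase(v)]).map(|e| unsafe { restore::<u64>(e) })
}
```

In this sketch `restore` is marked unsafe because nothing in the type system stops a caller from restoring with a different type of the same size; the debug-only size check is the only runtime guard.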

Benchmark	Before	After	Change
🟣 clap:check	1.7055s	1.6949s	-0.62%
🟣 hyper:check	0.2547s	0.2528s	-0.73%
🟣 regex:check	0.9590s	0.9553s	-0.39%
🟣 syn:check	1.5457s	1.5440s	-0.11%
🟣 syntex_syntax:check	5.9092s	5.9009s	-0.14%
Total	10.3741s	10.3479s	-0.25%
Summary	1.0000s	0.9960s	-0.40%

Benchmark	Before	After	Change
🟣 clap:check:initial	2.0605s	2.0575s	-0.15%
🟣 hyper:check:initial	0.3218s	0.3216s	-0.07%
🟣 regex:check:initial	1.1848s	1.1839s	-0.07%
🟣 syn:check:initial	1.9409s	1.9376s	-0.17%
🟣 syntex_syntax:check:initial	7.3105s	7.2928s	-0.24%
Total	12.8185s	12.7935s	-0.20%
Summary	1.0000s	0.9986s	-0.14%

Benchmark	Before	After	Change
🟣 clap:check:unchanged	0.4606s	0.4617s	0.24%
🟣 hyper:check:unchanged	0.1335s	0.1336s	0.08%
🟣 regex:check:unchanged	0.3324s	0.3346s	0.65%
🟣 syn:check:unchanged	0.6268s	0.6307s	0.64%
🟣 syntex_syntax:check:unchanged	1.8248s	1.8508s	💔 1.43%
Total	3.3779s	3.4113s	0.99%
Summary	1.0000s	1.0061s	0.61%

It's based on #108167.

r? @cjgillot

rustbot · 2023-03-02T04:18:57Z

These commits modify the Cargo.lock file. Random changes to Cargo.lock can be introduced when switching branches and rebasing PRs.
This was probably unintentional and should be reverted before this PR is merged.

If this was intentional then you can ignore this comment.

Zoxc · 2023-03-02T06:01:37Z

This could use a perf run.

Noratrieb · 2023-03-02T07:03:29Z

@bors try @rust-timer queue

bors · 2023-03-02T07:03:38Z

⌛ Trying commit 4b8bcb6bb27ac0f3fabaa33bedfd3921487100ea with merge 97d7f37551bbf61e755d3ee780318d260f4dfb64...

bors · 2023-03-02T09:14:32Z

☀️ Try build successful - checks-actions
Build commit: 97d7f37551bbf61e755d3ee780318d260f4dfb64

rust-timer · 2023-03-02T11:21:35Z

Finished benchmarking commit (97d7f37551bbf61e755d3ee780318d260f4dfb64): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.1%	[0.3%, 2.0%]	108
Regressions ❌ (secondary)	0.9%	[0.2%, 2.0%]	69
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	1
All ❌✅ (primary)	1.1%	[0.3%, 2.0%]	108

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.0%	[-1.5%, -0.6%]	13
Improvements ✅ (secondary)	-3.0%	[-6.2%, -1.0%]	12
All ❌✅ (primary)	-1.0%	[-1.5%, -0.6%]	13

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.3%	[1.2%, 1.4%]	5
Regressions ❌ (secondary)	2.1%	[2.0%, 2.1%]	2
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-1.2%	[-1.7%, -0.6%]	2
All ❌✅ (primary)	0.7%	[-2.2%, 1.4%]	6

Zoxc · 2023-03-02T11:41:25Z

bitmaps seems to be the largest instruction-count regression; the runtime regression is a bit lower, though:

Benchmark	Before	After	Change
🟣 bitmaps:check:unchanged	0.3744s	0.3773s	0.78%
Total	0.3744s	0.3773s	0.78%
Summary	1.0000s	1.0078s	0.78%

bors · 2023-03-07T19:08:54Z

☔ The latest upstream changes (presumably #108863) made this pull request unmergeable. Please resolve the merge conflicts.

cjgillot

This PR is very complex. It will take me a few more passes to fully digest it.
Thank you for the work (and your future patience).

The bootstrap gains are nice, but the 2% regression is a bit much.
Do you have ideas to mitigate it?

cjgillot · 2023-03-07T19:24:44Z

compiler/rustc_query_impl/src/lib.rs

+}
+
+impl<'tcx, C: QueryCache, Anon: Bool, DepthLimit: Bool, Feedable: Bool> QueryConfig<QueryCtxt<'tcx>>
+    for DynamicConfig<'tcx, C, Anon, DepthLimit, Feedable>


IIUC, we now have 2 (Anon) x 2 (DepthLimit) x 2 (Feedable) x 3 (types of cache) x (number of value sizes) instances of the functions in rustc_query_system.
Is there much to be gained by having anon/depth_limit/feedable as static parameters instead of plain booleans?

Anon, DepthLimit and Feedable have few instances, so the cost of specializing for them is quite low at the moment. Fully erased, we reduce things to 94 instances, so a few additional ones aren't impactful.
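To make the trade-off in this exchange concrete, here is a small sketch contrasting type-level flags with plain runtime booleans. The `Bool`, `True`, `False`, `Flags`, `plan_static`, `plan_dynamic` and `max_instances` names are invented for the example (only a `Bool` bound appears in the diff above), and the "plan" functions merely stand in for the real query execution code.

```rust
/// Type-level boolean: the value is known at compile time.
pub trait Bool {
    const VALUE: bool;
}
pub struct True;
pub struct False;
impl Bool for True {
    const VALUE: bool = true;
}
impl Bool for False {
    const VALUE: bool = false;
}

/// Static flags: one copy of this function is generated per flag
/// combination that is actually used, and each `if` can be folded away.
pub fn plan_static<Anon: Bool, DepthLimit: Bool, Feedable: Bool>() -> Vec<&'static str> {
    let mut steps = vec!["lookup"];
    if Anon::VALUE {
        steps.push("anon-node");
    }
    if DepthLimit::VALUE {
        steps.push("depth-check");
    }
    if Feedable::VALUE {
        steps.push("feed-check");
    }
    steps
}

/// Plain booleans: a single copy of the function, with branches at runtime.
#[derive(Copy, Clone)]
pub struct Flags {
    pub anon: bool,
    pub depth_limit: bool,
    pub feedable: bool,
}

pub fn plan_dynamic(flags: Flags) -> Vec<&'static str> {
    let mut steps = vec!["lookup"];
    if flags.anon {
        steps.push("anon-node");
    }
    if flags.depth_limit {
        steps.push("depth-check");
    }
    if flags.feedable {
        steps.push("feed-check");
    }
    steps
}

/// Upper bound from the review comment: 2 x 2 x 2 x cache kinds x value sizes.
pub const fn max_instances(cache_kinds: usize, value_sizes: usize) -> usize {
    2 * 2 * 2 * cache_kinds * value_sizes
}
```

The product is only an upper bound: combinations that no query uses are never instantiated, which is consistent with the reply that the flags add few instances in practice.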

cjgillot · 2023-03-07T19:33:48Z

compiler/rustc_query_impl/src/plumbing.rs

                mode: QueryMode,
-            ) -> Option<query_values::$name<'tcx>> {
+            ) -> Option<Erase<query_values::$name<'tcx>>> {


We could use a query_erased::$name alias for the return type.
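The suggested alias could look roughly like the following. This is a simplified sketch under assumptions: the `define_queries!` macro, the two example queries and the reduced `EraseType`/`Erase` definitions are invented here, and lifetimes such as `'tcx` are left out.

```rust
use std::mem::size_of;

/// Reduced stand-in for the erasure trait: maps a value type to a byte array.
pub trait EraseType {
    type Result;
}
pub type Erase<T> = <T as EraseType>::Result;

impl EraseType for u32 {
    type Result = [u8; size_of::<u32>()];
}
impl EraseType for bool {
    type Result = [u8; size_of::<bool>()];
}

/// Hypothetical macro that generates both alias modules from one query list.
macro_rules! define_queries {
    ($($name:ident -> $value:ty,)*) => {
        #[allow(non_camel_case_types)]
        pub mod query_values {
            $(pub type $name = $value;)*
        }
        #[allow(non_camel_case_types)]
        pub mod query_erased {
            use super::{query_values, Erase};
            // One erased alias per query, so call sites need not spell
            // out `Erase<query_values::$name>` themselves.
            $(pub type $name = Erase<query_values::$name>;)*
        }
    };
}

define_queries! {
    item_count -> u32,
    is_public -> bool,
}

/// A signature using the alias instead of `Option<Erase<query_values::item_count>>`.
pub fn get_item_count_erased() -> Option<query_erased::item_count> {
    Some(7u32.to_ne_bytes())
}
```

The alias changes nothing at the type level; it only shortens the generated signatures.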

Zoxc · 2023-03-09T05:55:06Z

There are 2 additional optimizations that can be done: re-merging query state and query cache, and making compute call providers directly by having the provider side do the erasure. I suspect these would be insufficient to offset the incremental loss, though. There might also be some inlining differences causing part of the incremental regressions. I've not yet looked at the optimized code output.

rust-timer · 2023-04-30T14:27:37Z

Finished benchmarking commit (c5ee29edea029fc90cd1b1fcf1c9701733aef5aa): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.3%, 1.0%]	92
Regressions ❌ (secondary)	0.8%	[0.2%, 2.8%]	47
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.3%, -0.2%]	3
All ❌✅ (primary)	0.7%	[0.3%, 1.0%]	92

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.9%	[2.9%, 2.9%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.9%	[-2.9%, -2.9%]	1
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

apiraino · 2023-05-03T16:35:44Z

Hello, checking progress. I can probably switch this to waiting on author to comment on the latest perf run. @Zoxc feel free to request a review with @rustbot ready, thanks!

@rustbot author

Zoxc · 2023-05-12T20:31:32Z

I think the current state is decent enough. I did some more local testing and the incremental regressions seem to be about 0.3% on average (perf showing 0.2% for primary benchmarks on incr-unchanged + incr-patched).

The improvement in the performance to compile time ratio is good. For builds with 1 CGU it's a 15% reduction in the compile time of the compiler with 8 cores. This means that switching to building the compiler with 1 CGU would give us extra performance while building faster than with 16 CGUs with 8 cores (which is the highest CI uses).

@rustbot ready

cjgillot · 2023-05-14T13:24:00Z

I agree, the perf effect is only visible on incr-unchanged cases, and ~1% on perf suite. Meanwhile, we get a ~20s gain on bootstrap.
@bors r+

bors · 2023-05-14T13:24:02Z

📌 Commit 7aab1dd has been approved by cjgillot

It is now in the queue for this repository.

bors · 2023-05-14T13:47:04Z

⌛ Testing commit 7aab1dd with merge 8e8116c...

klensy · 2023-05-14T14:26:10Z

compiler/rustc_query_impl/Cargo.toml

+memoffset = { version = "0.6.0", features = ["unstable_const"] }
+field-offset = "0.3.5"


This creates 3 versions of memoffset (0.6, 0.7, 0.8). Is there a particular reason to use version 0.6 instead of 0.7 or 0.8?

bors · 2023-05-14T16:16:09Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 8e8116c to master...

rust-timer · 2023-05-14T18:04:41Z

Finished benchmarking commit (8e8116c): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.3%, 1.0%]	74
Regressions ❌ (secondary)	0.7%	[0.1%, 2.2%]	61
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.6%	[0.3%, 1.0%]	74

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-2.3%	[-2.3%, -2.3%]	1
Improvements ✅ (secondary)	-2.2%	[-3.3%, -1.0%]	9
All ❌✅ (primary)	-2.3%	[-2.3%, -2.3%]	1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 659.942s -> 641.664s (-2.77%)

klensy · 2023-05-27T10:50:08Z

compiler/rustc_query_impl/src/plumbing.rs

+        pub fn dynamic_queries<'tcx>() -> DynamicQueries<'tcx> {
+            DynamicQueries {
+                $(
+                    $name: dynamic_query::$name(),
+                )*
            }
-        })*
+        }


And now this is a top-4 function by size, 56 KB long.

That doesn't sound right? What's the symbol name and content?

_RNvCscI0NJ2NbQrx_16rustc_query_impl15dynamic_queries() | 55826. I guess it's the inlined dynamic_query calls? I'll try placing inline(never) on it.

Hm.. I guess they all could end up getting inlined. It should be a single basic block though, so it might not affect compile time much.
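As a rough illustration of what is being discussed, the sketch below builds a struct of per-query tables through small constructor functions and marks those constructors `#[inline(never)]`. Everything here is hypothetical (the two toy queries, the struct fields, and placing the attribute on the per-query constructors rather than on `dynamic_queries` itself); it is one reading of the suggestion, not the change that was made.

```rust
/// Toy per-query table: a name plus a function pointer, standing in for
/// the virtual table that a dynamic query config points to.
pub struct DynamicQuery {
    pub name: &'static str,
    pub compute: fn(u32) -> u32,
}

pub struct DynamicQueries {
    pub double: DynamicQuery,
    pub square: DynamicQuery,
}

fn compute_double(x: u32) -> u32 {
    x * 2
}

fn compute_square(x: u32) -> u32 {
    x * x
}

// `#[inline(never)]` asks the compiler to keep each constructor as a
// separate function instead of folding its body into the caller.
#[inline(never)]
fn dynamic_query_double() -> DynamicQuery {
    DynamicQuery { name: "double", compute: compute_double }
}

#[inline(never)]
fn dynamic_query_square() -> DynamicQuery {
    DynamicQuery { name: "square", compute: compute_square }
}

/// With the constructors kept out of line, this stays a short sequence
/// of calls, however large the individual tables are.
pub fn dynamic_queries() -> DynamicQueries {
    DynamicQueries {
        double: dynamic_query_double(),
        square: dynamic_query_square(),
    }
}
```

Note that `#[inline(never)]` is a hint to the compiler, so the resulting symbol sizes would still need to be measured.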

deps: bump crates

Updates few deps:

drops a lot of cxx* crates:
```console
$ cargo update -p iana-time-zone-haiku
    Updating crates.io index
    Updating cc v1.0.77 -> v1.0.79
    Removing codespan-reporting v0.11.1
    Removing cxx v1.0.94
    Removing cxx-build v1.0.94
    Removing cxxbridge-flags v1.0.94
    Removing cxxbridge-macro v1.0.94
    Updating iana-time-zone-haiku v0.1.1 -> v0.1.2
    Removing link-cplusplus v1.0.8
    Removing scratch v1.0.5
```
cc: https://github.com/rust-lang/cc-rs/releases/tag/1.0.78, https://github.com/rust-lang/cc-rs/releases/tag/1.0.79
iana-time-zone-haiku: https://github.com/strawlab/iana-time-zone/releases/tag/haiku%2Fv0.1.2

fixed crossbeam-rs/crossbeam#972 (similar fixed in rust repo rust-lang/rust#110089)
```console
$ cargo update -p crossbeam-channel
    Updating crates.io index
    Updating crossbeam-channel v0.5.6 -> v0.5.8
```
https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-channel/CHANGELOG.md#version-058

dedupes memoffset versions:
```console
$ cargo update -p crossbeam-epoch
    Updating crates.io index
    Updating crossbeam-epoch v0.9.13 -> v0.9.14
    Removing memoffset v0.7.1
```
https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-epoch/CHANGELOG.md#version-0914
Gilnaa/memoffset@v0.6.5...v0.8.0
rust-lang/rust#108638 (comment)

dedupes bstr versions
```console
$ cargo update -p ignore -p opener
    Updating crates.io index
    Removing bstr v0.2.17
    Updating globset v0.4.9 -> v0.4.10
    Updating ignore v0.4.18 -> v0.4.20
    Updating opener v0.5.0 -> v0.5.2
```
globset: BurntSushi/ripgrep@ac8fecb
ignore: https://github.com/BurntSushi/ripgrep/commits/master/crates/ignore hard to track, but drop dep on crossbeam-utils (BurntSushi/ripgrep@e95254a), don't stat git if require_git is false (BurntSushi/ripgrep@009dda1) and added bunch of formats to ignore list
opener: Seeker14491/opener@v0.5.0...v0.5.2 nothing interesting

rustbot assigned cjgillot Mar 2, 2023

Zoxc force-pushed the erase-query-values-map branch from 79f8763 to 4b8bcb6 Compare March 2, 2023 04:37

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 2, 2023

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 2, 2023

WaffleLapkin reviewed Mar 2, 2023

Zoxc mentioned this pull request Mar 3, 2023

Create a query cache for DefId #108649

Closed

Zoxc force-pushed the erase-query-values-map branch 2 times, most recently from 061d6ef to 544955a Compare March 4, 2023 14:38

rustbot added the A-testsuite Area: The testsuite used to check the correctness of rustc label Mar 4, 2023

cjgillot reviewed Mar 7, 2023

cjgillot mentioned this pull request Mar 7, 2023

Erase query result types using TAIT #108881

Closed

This was referenced Mar 9, 2023

Move query accessor code into functions #107802

Closed

Tweak query accessors #108478

Closed

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 30, 2023

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 12, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 14, 2023

klensy reviewed May 14, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label May 14, 2023

bors merged commit 8e8116c into rust-lang:master May 14, 2023

rustbot added this to the 1.71.0 milestone May 14, 2023

bors mentioned this pull request May 14, 2023

Shorten backtraces for queries in ICEs #108938

Merged

Zoxc mentioned this pull request May 14, 2023

Specialize query execution for incremental and non-incremental #108062

Merged

Zoxc deleted the erase-query-values-map branch May 15, 2023 17:42

Mark-Simulacrum added the perf-regression-triaged The performance regression has been triaged. label May 16, 2023

jyn514 mentioned this pull request May 24, 2023

The rustc_query_impl crate is too big, which hurts compile times for the compiler itself #65031

Open

klensy mentioned this pull request May 26, 2023

deps: bump crates #111989

Merged

klensy reviewed May 27, 2023
