
Experiment: Only track fingerprints for queries with reconstructible dep-nodes. #118667


michaelwoerister
Member

This is an experiment to collect performance data about alternative ways to adapt #109050. The PR makes the following change:

All queries with keys that are not reconstructible from their corresponding DepNode are now treated similarly to anonymous queries. That is, we don't compute a DepNode or result fingerprint for them.

This has some implications:

  • We save time because query keys and results don't have to be hashed.
  • We can save space by storing less data for these nodes in the on-disk dep-graph. (Not implemented in this PR, as I ran out of time. Maybe this would be a quick fix for @saethlin though?)
  • We don't have to worry about hash collisions for DepNode in these cases (although we still have to worry about hash collisions for result fingerprints, which might involve all the same HashStable impls).
  • Same as with anonymous queries, the graph can grow additional nodes and edges in some situations because existing graph parts might be promoted while new parts are allocated for the same query if it is re-executed. I don't know how much this happens in practice.
  • We cannot cache query results for queries with complex keys.

Given that the last point affects some heavy queries, I have my doubts that this strategy is a win. But let's run it through perf at least once.
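To make "reconstructible" concrete, here is a minimal toy sketch, not rustc's actual data structures. It assumes a key counts as reconstructible when some table maps the fingerprint stored in a dep-node back to the key, as is possible for definition-like keys, whereas a complex key is only reachable through a one-way hash. `ToyDefId`, `ToyComplexKey`, `DefPathTable`, and `fingerprint_of` are all hypothetical names, and `DefaultHasher` stands in for a real stable hasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a stable fingerprint (assumption: 64 bits is enough here).
type Fingerprint = u64;

/// Hash any value into a toy fingerprint. `DefaultHasher` is only a stand-in;
/// its output is not guaranteed to be stable across Rust releases.
fn fingerprint_of<T: Hash>(value: &T) -> Fingerprint {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical simple key: a definition identified by its path.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ToyDefId(String);

/// Hypothetical complex key: no table maps its hash back to the value.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ToyComplexKey {
    def: ToyDefId,
    generic_args: Vec<String>,
}

/// Plays the role of a "fingerprint -> definition" table, which is what
/// makes the simple key reconstructible in this sketch.
struct DefPathTable {
    by_hash: HashMap<Fingerprint, ToyDefId>,
}

impl DefPathTable {
    fn new(defs: &[ToyDefId]) -> Self {
        let by_hash = defs.iter().map(|d| (fingerprint_of(d), d.clone())).collect();
        DefPathTable { by_hash }
    }

    /// Try to recover a key from the fingerprint stored in a dep-node.
    fn reconstruct(&self, fp: Fingerprint) -> Option<ToyDefId> {
        self.by_hash.get(&fp).cloned()
    }
}

fn main() {
    let def = ToyDefId("my_crate::foo".to_string());
    let table = DefPathTable::new(&[def.clone()]);

    // Simple key: the fingerprint leads back to the key.
    let simple_fp = fingerprint_of(&def);
    assert_eq!(table.reconstruct(simple_fp), Some(def.clone()));

    // Complex key: the fingerprint is a one-way hash, nothing to look up.
    let complex = ToyComplexKey { def, generic_args: vec!["u32".to_string()] };
    let complex_fp = fingerprint_of(&complex);
    assert_eq!(table.reconstruct(complex_fp), None);
}
```

In this picture, the PR stops computing fingerprints for the second kind of key altogether instead of storing a hash that can never be turned back into a key.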

cc @cjgillot, @Zoxc

r? @ghost

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 6, 2023

@michaelwoerister
Member Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 6, 2023
@bors
Contributor

bors commented Dec 6, 2023

⌛ Trying commit f2fd56c with merge 139a4ac...

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 6, 2023
…ingerprints, r=<try>

@bors
Contributor

bors commented Dec 6, 2023

☀️ Try build successful - checks-actions
Build commit: 139a4ac (139a4acb26220019f321f5893932e9486dbbb47d)


@bjorn3
Member

bjorn3 commented Dec 6, 2023

> We cannot cache query results for queries with complex keys.

Does that also mean the CompileCodegenUnit and CompileMonoItem dep nodes are no longer tracked? Or does this not apply to dep nodes that don't have an explicit query, but use the tcx.dep_graph.with_task() API? cg_clif passes quite a complex type as the argument for the CompileCodegenUnit dep node ((BackendConfig, Arc<GlobalAsmConfig>, Symbol, ConcurrencyLimiterToken)).

@rust-timer
Collaborator

Finished benchmarking commit (139a4ac): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 112.8% | [0.2%, 1907.4%] | 112 |
| Regressions ❌ (secondary) | 153.2% | [0.3%, 3197.3%] | 49 |
| Improvements ✅ (primary) | -1.5% | [-3.6%, -0.3%] | 44 |
| Improvements ✅ (secondary) | -1.9% | [-6.1%, -0.2%] | 64 |
| All ❌✅ (primary) | 80.6% | [-3.6%, 1907.4%] | 156 |
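
As a reading aid: the "All (primary)" figure is consistent with a count-weighted mean of the primary regressions and primary improvements rows. The sketch below is only a sanity check under that assumption, working from the rounded means shown in the report; `combined_mean` is a hypothetical helper, not part of the perf tooling.

```rust
/// Count-weighted mean of several (mean_percent, count) groups.
/// Assumes at least one group with a non-zero count.
fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let weighted: f64 = groups.iter().map(|&(m, n)| m * f64::from(n)).sum();
    weighted / f64::from(total)
}

fn main() {
    // Instruction count, primary benchmarks: 112 regressions averaging
    // 112.8% and 44 improvements averaging -1.5%.
    let all_primary = combined_mean(&[(112.8, 112), (-1.5, 44)]);
    println!("{all_primary:.1}%");

    // Agrees with the reported 80.6% to within rounding.
    assert!((all_primary - 80.6).abs() < 0.05);
}
```

The same check also reproduces the "All (primary)" figures reported for max RSS (6.3%) and cycles (120.0%) from their respective rows.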

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 16.2% | [1.9%, 143.9%] | 62 |
| Regressions ❌ (secondary) | 8.4% | [1.4%, 44.6%] | 18 |
| Improvements ✅ (primary) | -4.8% | [-21.1%, -0.6%] | 55 |
| Improvements ✅ (secondary) | -9.0% | [-38.7%, -2.1%] | 27 |
| All ❌✅ (primary) | 6.3% | [-21.1%, 143.9%] | 117 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 193.3% | [2.8%, 1816.3%] | 64 |
| Regressions ❌ (secondary) | 242.1% | [6.6%, 2938.2%] | 29 |
| Improvements ✅ (primary) | -3.5% | [-8.7%, -1.7%] | 38 |
| Improvements ✅ (secondary) | -5.3% | [-10.6%, -2.0%] | 28 |
| All ❌✅ (primary) | 120.0% | [-8.7%, 1816.3%] | 102 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 675.083s -> 673.668s (-0.21%)
Artifact size: 314.18 MiB -> 314.02 MiB (-0.05%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 6, 2023
@bjorn3
Member

bjorn3 commented Dec 6, 2023

> Does that also mean the CompileCodegenUnit and CompileMonoItem dep nodes are no longer tracked?

I think the perf results show that this is indeed the case.

@michaelwoerister
Member Author

> Does that also mean the CompileCodegenUnit and CompileMonoItem dep nodes are no longer tracked?

Well, I actually wanted to say that these are not affected directly 🙂 Anything that explicitly uses DepGraph::with_task works the same as before. I still think that's true (modulo any bugs I introduced).

What I suspect is happening is that we cannot mark some crucial queries green after re-evaluation anymore. The hypothesis is that, before, a query like symbol_name was "maybe changed" (because one of its inputs was red) but then re-evaluating it yielded the same result as in the previous session, so it would be marked green after all. But now, that query instance cannot be correlated to the previous instance anymore, so the system assumes that it has changed.
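
A minimal sketch of that hypothesis, using toy types rather than rustc's real dep-graph: after re-executing a "maybe changed" node, it can only be marked green if its new result fingerprint can be compared against one saved for the same node in the previous session. `PreviousSession`, `Color`, `color_after_reexecution`, and the string node ids are all hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

type Fingerprint = u64;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Color {
    /// Result is known to be unchanged since the previous session.
    Green,
    /// Result changed, or has to be assumed to have changed.
    Red,
}

/// Result fingerprints saved by the previous session, keyed by a
/// hypothetical stable node identity. Untracked nodes have no entry.
struct PreviousSession {
    result_fingerprints: HashMap<&'static str, Fingerprint>,
}

/// Decide the color of a node after it has been re-executed.
/// `stable_id` is `None` when the node cannot be correlated with
/// its instance from the previous session.
fn color_after_reexecution(
    prev: &PreviousSession,
    stable_id: Option<&'static str>,
    new_result: Fingerprint,
) -> Color {
    match stable_id.and_then(|id| prev.result_fingerprints.get(id)) {
        // Same result as last time: dependents do not need to re-run.
        Some(&old_result) if old_result == new_result => Color::Green,
        // Different result, or nothing to compare against.
        _ => Color::Red,
    }
}

fn main() {
    let mut result_fingerprints = HashMap::new();
    result_fingerprints.insert("symbol_name(foo)", 0xABCD);
    let prev = PreviousSession { result_fingerprints };

    // Before: the node is tracked and its result is unchanged, so it turns green.
    assert_eq!(
        color_after_reexecution(&prev, Some("symbol_name(foo)"), 0xABCD),
        Color::Green
    );

    // With the experiment: the same result cannot be correlated, so it is red.
    assert_eq!(color_after_reexecution(&prev, None, 0xABCD), Color::Red);
}
```

If this is what happens, every query that depends on such a node would be re-executed as well, which would fit the large incremental regressions in the perf run.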

This could be solved by making more query keys reconstructible, but that's complicated and I'm not sure it would be worth the trouble.
