Merge `collect_mod_item_types` query into `check_well_formed` #121500

Conversation
@bors try @rust-timer queue
Merge `collect_mod_item_types` query into `check_well_formed`

follow-up to rust-lang#121154

r? `@ghost`
☀️ Try build successful - checks-actions
Finished benchmarking commit (f7e3a77): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so, since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: this is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): this benchmark run did not return any relevant results for this metric.
Cycles: this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: this benchmark run did not return any relevant results for this metric.

Bootstrap: 650.894s -> 650.065s (-0.13%)
@bors try @rust-timer queue

r? compiler
Merge `collect_mod_item_types` query into `check_well_formed`

follow-up to rust-lang#121154

This removes more potential parallel-compiler bottlenecks and moves diagnostics for the same items next to each other, instead of grouping diagnostics by analysis kind.
```rust
&& let Some(def_id) = frame.query.def_id
{
    let n = tcx.generics_of(def_id).params.len();
    vec![ty::Variance::Bivariant; n].leak()
```
Could you explain what happens here?
When will this impl be called? Is this logic for error recovery?
(Also, `tcx.arena.allocate` could probably be used instead of `leak`.)
> When will this impl be called? Is this logic for error recovery?
This is not part of recovery, but of `cycle_delay_bug`. In order to hide cycle errors in favor of more useful errors, we can make the query cycle turn into a `span_delayed_bug`. But then we need a value to return from the query, which is where all the impls in this module are invoked. They produce a dummy value for the query.
> (Also, `tcx.arena.allocate` could probably be used instead of `leak`.)
Then I would have to transmute the result 😆
I guess if someone is running rustc in a loop from within the same process and keeps hitting this cycle error, they would slowly lose memory.
edit: `leak` is already being used elsewhere in this file, so this is more of a general point that we could figure out, but it seems low priority, so I think just avoiding the unsafe code is best.
> This is not part of recovery

I mean `ErrorGuaranteed` is passed to this function, so some error already happened (or at least was stashed)? So this dummy value is never created on a good path.

Edit: the bit about `span_delayed_bug` is also not clear; if we have an `ErrorGuaranteed`, then the delayed bug will never be reported.
`span_delayed_bug` creates an `ErrorGuaranteed` without emitting an error. It will ICE the compiler if you forget to emit an error later.
```rust
});
let _ = items.par_foreign_items(|item| {
    Ok(CollectItemTypesVisitor { tcx }.visit_foreign_item(hir.foreign_item(item)))
});
```
In general, I'd expect having something like `par_visit_item_likes_in_module` instead of doing things like this here or in wf checking.
That can't be done nicely, because `visit_item_likes_in_module` uses a visitor. I guess we could add something that creates the visitor 4 times just like I did here, but considering that after this PR we again have only a single use site of such a `par_visit_item_likes_in_module` method, keeping it inline at the one use site seems best to me?
> considering after this PR we again have only a single use site of such a `par_visit_item_likes_in_module` method

If the overhead of parallel queries was low enough, then I'd expect pretty much every `visit_item_likes_in_module` and `visit_all_item_likes_in_crate` to benefit from being turned into its `par_` version. (Although maybe we'll need a special `Par` version of the visitor for this.)
I'll add it to the list of things I am investigating around the parallel compiler
No requests; besides the general questions, I was also waiting for perf results from #121500 (comment). That perf run is still in the queue.
whoops, force pushes for clippy ui tests cancelled rust-timer, but not the try build, so reusing that build

@rust-timer build 1f106e7
Finished benchmarking commit (1f106e7): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so, since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: this is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): this benchmark run did not return any relevant results for this metric.
Cycles: this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: this benchmark run did not return any relevant results for this metric.

Bootstrap: 648.501s -> 647.819s (-0.11%)
@bors r+
Merge `collect_mod_item_types` query into `check_well_formed`

follow-up to rust-lang#121154

This removes more potential parallel-compiler bottlenecks and moves diagnostics for the same items next to each other, instead of grouping diagnostics by analysis kind.
💔 Test failed - checks-actions
A job failed! Check out the build log: (web) (plain)
@bors retry

curl: (28) Operation too slow. Less than 100 bytes/sec transferred the last 5 seconds
☀️ Test successful - checks-actions
Run a single huge `par_body_owners` instead of many small ones after each other. This improves parallelism in the parallel rustc by avoiding the bottleneck after each individual `par_body_owners`: each one needs to wait for all its queries to finish, so if there is one long-running query, a lot of cores will be idle while waiting for that single query. Based on rust-lang#121500.
Finished benchmarking commit (74acabe): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count: this is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: this benchmark run did not return any relevant results for this metric.

Bootstrap: 649.016s -> 648.792s (-0.03%)