Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The rustc_query_impl crate is too big, which hurts compile times for the compiler itself #65031

Open
nnethercote opened this issue Oct 2, 2019 · 38 comments
Labels
A-contributor-roadblock Area: Makes things more difficult for new contributors to rust itself A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nnethercote
Copy link
Contributor

nnethercote commented Oct 2, 2019

EDIT(Centril): The crate was renamed to rustc_middle in #70536.
EDIT(jyn514): Most of rustc_middle was split into rustc_query_impl in #70951.

The results from @ehuss's new cargo -Ztimings feature really drive home how the size of the rustc crate hurts compile times for the compiler. Here's a picture:
a
Things to note.

  • The rustc crate takes about twice as long to compile as the rustc_mir crate, which is the second-biggest crate.
  • Sixteen dependent crates start and finish their entire compilation while the rustc crate's codegen is happening. (Thank goodness for pipelining, it really makes a difference in this case.)
  • Almost 1/3 of the total build time occurs while building the rustc crate without anything else happening in parallel (from 42-67s, and from 141-174s).

Also, even doing check builds on rustc code is painful. On my fast 28-core Linux box it takes 10-15 seconds even for the first compile error to come up. This is much longer than other crates. Refactorings within the rustc crate that require many edit/compile cycles are painful.

I've been told that rustc is quite tangled and splitting it up could be difficult. That may well be true. The good news is that the picture shows we don't need to split it into 10 equal-sized pieces to get benefits. Even splitting off small chunks into separate crates will have an outsized effect on compilation times.

@petrochenkov
Copy link
Contributor

librustc still has a lot of code that's "just implementation code" and not something infrastructural like types or traits used in the rest of the compiler.

If we look e.g. at rustc/middle: resolve_lifetime.rs - just a pass, can be moved to rustc_resolve, stability.rs - just a pass, can be moved to rustc_passes or something.

@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 3, 2019
@crlf0710
Copy link
Member

crlf0710 commented Oct 4, 2019

On my windows box, crate rustc takes about 2000 seconds to build... Grateful to anything that makes the compilation process faster.

@RalfJung
Copy link
Member

One thing that would be nice to move back from librustc to librustc_mir is all the memory access logic of the Miri interpreter, which these days lives in librustc::mir::interpret::allocation. But @oli-obk moved it there because it is used by things that do not depend on librustc_mir, so I am not sure what it would take to make that happen.

@Mark-Simulacrum
Copy link
Member

Can those things not depend on librustc_mir, e.g., being dependencies of librustc_mir? Can @oli-obk comment on that perhaps?

@oli-obk
Copy link
Contributor

oli-obk commented Oct 18, 2019

Most of the datastructures need to live there, because the incremental serializer lives there (though we could probably move both out) and because some of the interners intern types like Allocation. I'm not sure how much of the code is needed there, but we can't add inherent impls in downstream crates, so if we want inherent methods, we need to implement them there.

I've tried a few times to split things out of librustc, it's just so entangled with everything that I failed.

@RalfJung
Copy link
Member

but we can't add inherent impls in downstream crates, so if we want inherent methods, we need to implement them there.

If it's just that we could use extension traits.

Centril added a commit to Centril/rust that referenced this issue Nov 23, 2019
Move def collector from `rustc` to `rustc_resolve`

It's used only from `rustc_resolve`, so we can move it there, thus reducing the size of `rustc` (rust-lang#65031).

It's quite possible that we can merge the def collector pass into the "build reduced graph" pass (they are always run together and do similar things), but it's some larger work.

r? @eddyb
@Zoxc
Copy link
Contributor

Zoxc commented Dec 24, 2019

The way we should structure the compiler for compilation speed would be to have n independent crates defining types basically working as header files. Then we'd have a librustc_queries crates which just declares queries which depends on the n header crates. After that we'd fan out to m independent crates containing query definitions and other code.

Currently n is 1 (libsyntax) and way too much stuff is in librustc which prevents the m crates from starting sooner. I think it would make sense to move types from librustc into crates which librustc can depend on and code from librustc into crates which depends on librustc.

I see that @Centril is kind of doing this already, but I think it would be a good idea to gain some compiler team sign-off here.

The thing preventing moves of types tend to be inherent utility methods which require TyCtxt. Maybe we could move these to extension traits instead? (with maybe a prelude for these?) One goal could be to remove all uses of TyCtxt in librustc except for the query system and then extract the query system into librustc_queries.

cc @rust-lang/compiler

@Zoxc
Copy link
Contributor

Zoxc commented Dec 24, 2019

Another way to deal with TyCtxt would be to have a QueryCtxt trait in each crate and use impl QueryCtxt instead of TyCtxt for the utility methods in these crates. Maybe we could even remove the dependency on librustc_queries for some crates that way?

@Zoxc
Copy link
Contributor

Zoxc commented Dec 24, 2019

I ran -Ztimings too on Windows: Results.

It seems like rustc_driver is taking very long here, probably linker related. We can however see that libsyntax is much shorter after @Centril's changes.

Here is also a check build.

02.02.2020 timings.
27.03.2020 timings.

@cjgillot
Copy link
Contributor

Please let me know if you need help in this endeavour.

Centril added a commit to Centril/rust that referenced this issue Dec 31, 2019
…imulacrum

Extract `rustc_ast_lowering` crate from `rustc`

Working towards rust-lang#65031.

This PR moves `src/librustc/hir/lowering{/, .rs}` to its own crate (`librustc_ast_lowering`) which is very self-contained (only `fn lower_crate` and `trait Resolver` are exposed).

r? @Mark-Simulacrum
bors added a commit that referenced this issue Jan 4, 2020
Extract `rustc_hir` out of `rustc`

The new crate contains:
```rust
pub mod def;
pub mod def_id;
mod hir;
pub mod hir_id;
pub mod itemlikevisit;
pub mod pat_util;
pub mod print;
mod stable_hash_impls;

pub use hir::*;
pub use hir_id::*;
pub use stable_hash_impls::HashStableContext;
```

Remains to be done in follow-up PRs:

- Move `rustc::hir::map` into `rustc_hir_map` -- this has to be a separate crate due to the `dep_graph` (blocked on #67761).

- Move references to `rustc::hir` to `rustc_hir` where possible.

cc #65031

r? @Zoxc
@jyn514 jyn514 changed the title The rustc_middle crate is too big, which hurts compile times for the compiler itself The rustc_query_impl crate is too big, which hurts compile times for the compiler itself Dec 27, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 27, 2022
Allow cleaning individual crates

As a bonus, this stops special casing `clean` in `Builder`.

## Motivation

Cleaning artifacts isn't strictly necessary to get cargo to rebuild; `touch compiler/rustc_driver/src/lib.rs` (for example) will also work. There's two reasons I thought making this part of bootstrap proper was a better approach:
1. `touch` does not *remove* artifacts, it just causes a rebuild. This is unhelpful for when you want to measure how long the compiler itself takes to build (e.g. for rust-lang#65031).
2. It seems a little more discoverable; and I want to extend it in the future to things like `x clean --stage 1 rustc`, which makes it easier to work around rust-lang#76720 without having to completely wipe all the stage 0 artifacts, or having to be intimately familiar with which directories to remove.
@jyn514
Copy link
Member

jyn514 commented Dec 29, 2022

Here is the output of -Zdump-mono-items on query_impl (forked to output JSON instead of markdown): https://gist.github.com/jyn514/e20a1a4c542770d855240169c4a85331

There are some very surprising things here:

$ jq 'sort_by(.total_estimate) | .[-10:] | .[] | { name, total_estimate, instantiation_count }' rustc_query_impl.mono_items.json -c
{"name":"<plumbing::QueryCtxt<'_> as rustc_query_system::query::QueryContext>::start_query::{closure#0}","total_estimate":75240,"instantiation_count":684}
{"name":"std::ptr::mut_ptr::<impl *mut T>::is_null","total_estimate":77119,"instantiation_count":3353}
{"name":"std::ptr::const_ptr::<impl *const T>::is_aligned_to","total_estimate":79212,"instantiation_count":1722}
{"name":"plumbing::force_from_dep_node","total_estimate":84360,"instantiation_count":285}
{"name":"rustc_middle::ty::tls::with_context_opt","total_estimate":84747,"instantiation_count":1599}
{"name":"rustc_query_system::query::plumbing::execute_job","total_estimate":85728,"instantiation_count":228}
{"name":"rustc_query_system::dep_graph::DepGraph::<K>::with_task_impl","total_estimate":89148,"instantiation_count":228}
{"name":"rustc_query_system::query::plumbing::try_load_from_disk_and_cache_in_memory","total_estimate":102372,"instantiation_count":228}
{"name":"hashbrown::raw::RawTable::<T, A>::reserve_rehash","total_estimate":111354,"instantiation_count":277}
{"name":"std::thread::LocalKey::<T>::try_with","total_estimate":186412,"instantiation_count":3214}
  • LocalKey is the highest impact to compile times; we should investigate using some other kind of thread-local storage (maybe #[thread_local]?) instead.
  • start_query is instantiated too many times; it should only be once per query, but it's being instantiated 3 times for each

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 31, 2022
…compile-time, r=thomcc

Use some more `const_eval_select` in pointer methods for compile times

Builds on top of rust-lang#105435

`is_aligned_to` is _huge_ with calling `align_offset`, so this should cut it down a lot.

This shows up in rust-lang#65031 (comment)
RalfJung pushed a commit to RalfJung/miri that referenced this issue Jan 3, 2023
…ime, r=thomcc

Use some more `const_eval_select` in pointer methods for compile times

Builds on top of #105435

`is_aligned_to` is _huge_ with calling `align_offset`, so this should cut it down a lot.

This shows up in rust-lang/rust#65031 (comment)
@jyn514
Copy link
Member

jyn514 commented May 24, 2023

Since #108638 rustc_query_impl takes significantly less time to compile and is no longer the limiting factor, at least for my setup:
image

I'm going to close this issue as completed - there's always more perf work to do, but I think query_impl is now fast enough we don't have to single it out :)

Thank you to everyone who worked on this!

@jyn514 jyn514 closed this as completed May 24, 2023
@RalfJung
Copy link
Member

RalfJung commented May 25, 2023 via email

@jyn514
Copy link
Member

jyn514 commented May 25, 2023

Yes, but in a different way than before - now the bottleneck is generating metadata for rustc_middle, not performing codegen.

@Zoxc
Copy link
Contributor

Zoxc commented May 25, 2023

@jyn514 Could you try a build with rust.codegen-units = 1? rustc_query_impl is still holding things up slightly for me.

@jyn514
Copy link
Member

jyn514 commented May 25, 2023

Hmm, do you have codegen-units=1 set for local benchmarking or some other reason?

Here's the graph with 1 CGU - I do see query_impl causing a slight delay now.
image

@Zoxc
Copy link
Contributor

Zoxc commented May 25, 2023

It's for benchmarking and LLVM IR inspection purposes. The total build time is similar for 16 and 1 CGUs for me on my Ryzen 7 1700. What are you building on? There seems to be a bit of a divergence in build behavior.

@jyn514
Copy link
Member

jyn514 commented May 26, 2023

I don't have time in the near future to investigate the difference, happy to reopen in the meantime.

@jyn514 jyn514 reopened this May 26, 2023
@Zoxc
Copy link
Contributor

Zoxc commented May 27, 2023

@jyn514 Did you build with rust.lto set to thin-local or off? I built with thin-local, and off is the new compiler profile default, so that could explain the divergence.

@jyn514
Copy link
Member

jyn514 commented May 27, 2023

ahhhh - yes, that's likely it.

thomcc pushed a commit to tcdi/postgrestd that referenced this issue May 31, 2023
…ime, r=thomcc

Use some more `const_eval_select` in pointer methods for compile times

Builds on top of #105435

`is_aligned_to` is _huge_ with calling `align_offset`, so this should cut it down a lot.

This shows up in rust-lang/rust#65031 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-contributor-roadblock Area: Makes things more difficult for new contributors to rust itself A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests