Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long compile time due to excessively long translation passes #42941

Closed
jonhoo opened this issue Jun 27, 2017 · 12 comments
Closed

Long compile time due to excessively long translation passes #42941

jonhoo opened this issue Jun 27, 2017 · 12 comments
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Jun 27, 2017

Following up from #41696 (comment), I'm seeing compile times in excess of 13 minutes for a relatively simple program. In particular, running cargo test --lib --no-run on jonhoo/fantoccini@0802a57 takes ~925s on my laptop, as does even a simple program that uses the library in a similar way as the tests. The output of -Z time-passes is

time: 0.007; rss: 55MB  parsing
time: 0.000; rss: 55MB  recursion limit
time: 0.000; rss: 55MB  crate injection
time: 0.000; rss: 55MB  plugin loading
time: 0.000; rss: 55MB  plugin registration
time: 0.099; rss: 112MB expansion
time: 0.003; rss: 112MB maybe building test harness
time: 0.000; rss: 112MB maybe creating a macro crate
time: 0.000; rss: 112MB checking for inline asm in case the target doesn't support it
time: 0.001; rss: 112MB early lint checks
time: 0.000; rss: 112MB AST validation
time: 0.011; rss: 120MB name resolution
time: 0.002; rss: 120MB complete gated feature checking
time: 0.004; rss: 122MB lowering ast -> hir
time: 0.001; rss: 122MB indexing hir
time: 0.000; rss: 122MB attribute checking
time: 0.000; rss: 122MB language item collection
time: 0.000; rss: 122MB lifetime resolution
time: 0.000; rss: 122MB looking for entry point
time: 0.000; rss: 122MB looking for plugin registrar
time: 0.000; rss: 122MB loop checking
time: 0.000; rss: 122MB static item recursion checking
time: 0.005; rss: 122MB compute_incremental_hashes_map
time: 0.000; rss: 122MB load_dep_graph
time: 0.000; rss: 122MB stability index
time: 0.001; rss: 122MB stability checking
time: 0.009; rss: 132MB type collecting
time: 0.000; rss: 132MB impl wf inference
time: 0.031; rss: 154MB coherence checking
time: 0.000; rss: 154MB variance testing
time: 0.006; rss: 154MB wf checking
time: 0.006; rss: 158MB item-types checking
time: 0.205; rss: 164MB item-bodies checking
time: 0.021; rss: 168MB const checking
time: 0.001; rss: 168MB privacy checking
time: 0.000; rss: 168MB intrinsic checking
time: 0.000; rss: 168MB effect checking
time: 0.003; rss: 168MB match checking
time: 0.001; rss: 168MB liveness checking
time: 0.070; rss: 174MB borrow checking
time: 0.000; rss: 174MB reachability checking
time: 0.001; rss: 174MB death checking
time: 0.000; rss: 174MB unused lib feature checking
time: 0.014; rss: 174MB lint checking
time: 0.000; rss: 174MB resolving dependency formats
  time: 0.001; rss: 174MB       write metadata
  time: 495.029; rss: 206MB     translation item collection
  time: 0.012; rss: 214MB       codegen unit partitioning
  time: 0.007; rss: 337MB       internalize symbols
time: 807.783; rss: 337MB       translation
time: 0.000; rss: 337MB assert dep graph
time: 0.000; rss: 337MB serialize dep graph
  time: 0.002; rss: 324MB       codegen passes [0]
  time: 0.205; rss: 326MB       llvm function passes [1]
  time: 0.129; rss: 330MB       llvm module passes [1]
  time: 4.515; rss: 336MB       codegen passes [1]
time: 4.852; rss: 332MB LLVM passes
time: 0.000; rss: 332MB serialize work products
  time: 0.682; rss: 245MB       running linker
time: 0.685; rss: 245MB linking

The gap in

  time: 0.001; rss: 174MB       write metadata
  time: 495.029; rss: 206MB     translation item collection
  time: 0.012; rss: 214MB       codegen unit partitioning
  time: 0.007; rss: 337MB       internalize symbols
time: 807.783; rss: 337MB       translation

is pretty curious. The extra time is spent between when codegen unit partitioning is printed and when internalize symbols is printed, but not sure if that helps?

@bstrie bstrie added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 27, 2017
@michaelwoerister
Copy link
Member

The extra time is spent between when codegen unit partitioning is printed and when internalize symbols is printed.

This is where all the "actual" translation, i.e. MIR to LLVM, happens.

But translation item collection is also taking very long. cc @rust-lang/compiler

@jonhoo
Copy link
Contributor Author

jonhoo commented Jun 27, 2017

@bstrie shouldn't this be I-compiletime, not I-slow? Or both?

@michaelwoerister michaelwoerister added the I-compiletime Issue: Problems and improvements with respect to compile times. label Jun 27, 2017
@Mark-Simulacrum Mark-Simulacrum removed the I-slow Issue: Problems and improvements with respect to performance of generated code. label Jun 27, 2017
@jonhoo
Copy link
Contributor Author

jonhoo commented Jun 30, 2017

A much shorter replicating example (taken from #41696 (comment)) is:

extern crate futures;

use futures::{future, IntoFuture, Future};

fn main() {
    let t: std::result::Result<(), ()> = Ok(());
    let f = t
            .into_future()
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()));
    f.wait();
}

The code above takes ~750s to compile on my laptop (you can make it shorter/longer by removing/adding .and_then calls). Output from cargo rustc -- -Z time-passes is similar to the original post:

time: 0.000; rss: 53MB  parsing
time: 0.000; rss: 53MB  recursion limit
time: 0.000; rss: 53MB  crate injection
time: 0.000; rss: 53MB  plugin loading
time: 0.000; rss: 53MB  plugin registration
time: 0.027; rss: 80MB  expansion
time: 0.000; rss: 80MB  maybe building test harness
time: 0.000; rss: 84MB  maybe creating a macro crate
time: 0.000; rss: 84MB  checking for inline asm in case the target doesn't support it
time: 0.000; rss: 84MB  early lint checks
time: 0.000; rss: 84MB  AST validation
time: 0.002; rss: 84MB  name resolution
time: 0.000; rss: 84MB  complete gated feature checking
time: 0.000; rss: 84MB  lowering ast -> hir
time: 0.000; rss: 84MB  indexing hir
time: 0.000; rss: 84MB  attribute checking
time: 0.000; rss: 84MB  language item collection
time: 0.000; rss: 84MB  lifetime resolution
time: 0.000; rss: 84MB  looking for entry point
time: 0.000; rss: 84MB  looking for plugin registrar
time: 0.000; rss: 84MB  loop checking
time: 0.000; rss: 84MB  static item recursion checking
time: 0.000; rss: 84MB  compute_incremental_hashes_map
time: 0.000; rss: 84MB  load_dep_graph
time: 0.000; rss: 84MB  stability index
time: 0.000; rss: 84MB  stability checking
time: 0.000; rss: 84MB  type collecting
time: 0.000; rss: 84MB  impl wf inference
time: 0.000; rss: 84MB  coherence checking
time: 0.000; rss: 84MB  variance testing
time: 0.000; rss: 84MB  wf checking
time: 0.000; rss: 84MB  item-types checking
time: 0.034; rss: 97MB  item-bodies checking
time: 0.001; rss: 97MB  const checking
time: 0.000; rss: 97MB  privacy checking
time: 0.000; rss: 97MB  intrinsic checking
time: 0.000; rss: 97MB  effect checking
time: 0.000; rss: 97MB  match checking
time: 0.000; rss: 97MB  liveness checking
time: 18.628; rss: 104MB        borrow checking
time: 0.000; rss: 104MB reachability checking
time: 0.000; rss: 104MB death checking
time: 0.000; rss: 104MB unused lib feature checking
time: 0.005; rss: 108MB lint checking
time: 0.000; rss: 108MB resolving dependency formats
  time: 0.000; rss: 108MB       write metadata
  time: 472.840; rss: 110MB     translation item collection
  time: 0.001; rss: 112MB       codegen unit partitioning
  time: 0.001; rss: 133MB       internalize symbols
time: 732.741; rss: 133MB       translation
time: 0.000; rss: 133MB assert dep graph
time: 0.000; rss: 133MB serialize dep graph
  time: 0.002; rss: 148MB       codegen passes [0]
  time: 0.014; rss: 146MB       llvm function passes [1]
  time: 0.005; rss: 146MB       llvm module passes [1]
  time: 0.382; rss: 155MB       codegen passes [1]
time: 0.404; rss: 148MB LLVM passes
time: 0.000; rss: 148MB serialize work products
  time: 0.112; rss: 143MB       running linker
time: 0.113; rss: 143MB linking

The extra ~250s is again spent between codegen unit partitioning and internalize symbols.

Also maybe cc @arielb1 ?

@eddyb
Copy link
Member

eddyb commented Jun 30, 2017

@jonhoo The extra 250 seconds you speak of is the translation to LLVM IR itself. Maybe we should make that list less confusing...

@jonhoo
Copy link
Contributor Author

jonhoo commented Jun 30, 2017

@eddyb yeah, @michaelwoerister mentioned that above, I just figured it was worth repeating closer to the timing output. It's probably a good idea to have -Z time-passes include timing information for the actual LLVM IR translation too.

@jonhoo
Copy link
Contributor Author

jonhoo commented Jul 2, 2017

A few more data-points:

  • Issue is present on all stable versions as far back as futures can still be built.
  • Incremental compilation seems to be irrelevant to triggering the issue.
  • There seems to be a (possibly unrelated) regression from stable to nightly here as well for the borrow checking phase:
$ cargo +1.18.0 rustc -- -Z time-passes
...
time: 0.000; rss: 90MB  borrow checking
...
$ cargo +nightly rustc -- -Z time-passes
...
time: 18.628; rss: 104MB        borrow checking
...

@eddyb is this worth nominating given that hyper 0.11 uses futures everywhere, which will likely cause many more programs to use chains of futures going forward?

@sfackler
Copy link
Member

sfackler commented Jul 2, 2017

This looks like a duplicate of #38528?

@jonhoo
Copy link
Contributor Author

jonhoo commented Jul 2, 2017

@sfackler yeah, that seems reasonable.

@Mark-Simulacrum
Copy link
Member

Nominated the other issue and closing.

@jonhoo
Copy link
Contributor Author

jonhoo commented Jul 2, 2017

@sfackler thoughts on the borrow checking increase from stable to nightly in #42941 (comment) ? Worth creating a new issue?

@Mark-Simulacrum
Copy link
Member

It's probably the MIR borrowck work -- @pnkfelix is aware of the problem, I believe. I'm not sure if we have an issue...

@jonhoo
Copy link
Contributor Author

jonhoo commented Jul 2, 2017

Okay, filed #43018. @pnkfelix feel free to close if there is already another issue (I couldn't find one).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants