Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long compilation time of crate "unicode-normalization" #55528

Closed
jens1o opened this issue Oct 31, 2018 · 10 comments
Closed

Long compilation time of crate "unicode-normalization" #55528

jens1o opened this issue Oct 31, 2018 · 10 comments
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@jens1o
Copy link
Contributor

jens1o commented Oct 31, 2018

Hey folks,

I was told to create an issue here. When compiling crates, I often see this crate holding up the overall process, and because others are depending on it, it grabs one cpu core and the others are literally idle and I need to wait.

Yeah this contains some giant match statements which are slow. We could perhaps use something like phf_map here but that may not improve compile times.

-- @Manishearth

I'd be thankful if there are some efforts to make it faster. :)

ref unicode-rs/unicode-normalization#29

@matthiaskrgr
Copy link
Member

item-bodies checking and match checking take a lot of time:
RUSTFLAGS="-Z time-passes -Ccodegen-units=1" cargo check

   Checking unicode-normalization v0.1.7 (/tmp/unicode-normalization)
  time: 0.083; rss: 67MB    parsing
  time: 0.000; rss: 67MB    attributes injection
  time: 0.000; rss: 68MB    garbage collect incremental cache directory
  time: 0.000; rss: 68MB    recursion limit
  time: 0.000; rss: 68MB    crate injection
  time: 0.000; rss: 68MB    plugin loading
  time: 0.000; rss: 68MB    plugin registration
  time: 0.000; rss: 68MB    background load prev dep-graph
  time: 0.006; rss: 68MB    pre ast expansion lint checks
    time: 0.032; rss: 86MB  expand crate
    time: 0.000; rss: 86MB  check unused macros
  time: 0.032; rss: 86MB    expansion
  time: 0.000; rss: 86MB    maybe building test harness
  time: 0.001; rss: 86MB    maybe creating a macro crate
  time: 0.004; rss: 88MB    creating allocators
  time: 0.003; rss: 88MB    AST validation
  time: 0.010; rss: 91MB    name resolution
  time: 0.001; rss: 91MB    complete gated feature checking
  time: 0.000; rss: 91MB    blocked while dep-graph loading finishes
  time: 0.016; rss: 102MB   lowering ast -> hir
  time: 0.009; rss: 102MB   early lint checks
  time: 0.020; rss: 110MB   indexing hir
  time: 0.000; rss: 96MB    load query result cache
  time: 0.000; rss: 96MB    looking for entry point
  time: 0.000; rss: 96MB    looking for plugin registrar
  time: 0.001; rss: 96MB    loop checking
  time: 0.002; rss: 102MB   attribute checking
  time: 0.005; rss: 102MB   stability checking
  time: 0.018; rss: 121MB   type collecting
  time: 0.000; rss: 121MB   outlives testing
  time: 0.000; rss: 121MB   impl wf inference
  time: 0.042; rss: 129MB   coherence checking
  time: 0.000; rss: 129MB   variance testing
  time: 0.035; rss: 136MB   wf checking
  time: 0.007; rss: 138MB   item-types checking
  time: 12.114; rss: 153MB  item-bodies checking
  time: 0.035; rss: 154MB   rvalue promotion
  time: 0.016; rss: 154MB   privacy checking
  time: 0.002; rss: 154MB   intrinsic checking
  time: 9.746; rss: 158MB   match checking
  time: 0.003; rss: 158MB   liveness checking
  time: 0.364; rss: 181MB   borrow checking
  time: 0.000; rss: 181MB   MIR borrow checking
  time: 0.000; rss: 181MB   dumping chalk-like clauses
  time: 0.000; rss: 181MB   MIR effect checking
  time: 0.004; rss: 181MB   death checking
  time: 0.001; rss: 181MB   unused lib feature checking
  time: 0.012; rss: 181MB   lint checking
  time: 0.000; rss: 181MB   resolving dependency formats
    time: 0.004; rss: 182MB write metadata
    time: 0.000; rss: 182MB assert dep graph
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::type_of
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::generics_of
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::predicates_of
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::used_trait_imports
            time: 0.004; rss: 183MB encode_query_results for rustc::ty::queries::typeck_tables_of
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::codegen_fulfill_obligation
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::optimized_mir
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::unsafety_check_result
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::borrowck
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::mir_borrowck
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::mir_const_qualif
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::def_symbol_name
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::const_is_rvalue_promotable_to_static
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::symbol_name
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::check_match
            time: 0.000; rss: 183MB encode_query_results for rustc::ty::queries::codegen_fn_attrs
            time: 0.000; rss: 184MB encode_query_results for rustc::ty::queries::specialization_graph_of
          time: 0.005; rss: 184MB   encode query results
        time: 0.007; rss: 184MB serialize query result cache
      time: 0.007; rss: 184MB   persist query result cache
          time: 0.001; rss: 185MB   getting serialized graph
          time: 0.001; rss: 185MB   encoding serialized graph
        time: 0.002; rss: 185MB encode dep-graph
      time: 0.003; rss: 185MB   persist dep-graph
    time: 0.010; rss: 185MB serialize dep graph
  time: 0.015; rss: 185MB   codegen
  time: 0.000; rss: 118MB   serialize work products
  time: 0.000; rss: 118MB   linking
    Finished dev [unoptimized + debuginfo] target(s) in 22.94s

building:
RUSTFLAGS="-Z time-passes -Ccodegen-units=1" cargo build --release

   Compiling unicode-normalization v0.1.7 (/tmp/unicode-normalization)
  time: 0.070; rss: 67MB    parsing
  time: 0.000; rss: 67MB    attributes injection
  time: 0.000; rss: 67MB    recursion limit
  time: 0.000; rss: 67MB    crate injection
  time: 0.000; rss: 67MB    plugin loading
  time: 0.000; rss: 67MB    plugin registration
  time: 0.006; rss: 68MB    pre ast expansion lint checks
    time: 0.032; rss: 86MB  expand crate
    time: 0.000; rss: 86MB  check unused macros
  time: 0.032; rss: 86MB    expansion
  time: 0.000; rss: 86MB    maybe building test harness
  time: 0.001; rss: 86MB    maybe creating a macro crate
  time: 0.004; rss: 87MB    creating allocators
  time: 0.002; rss: 87MB    AST validation
  time: 0.011; rss: 90MB    name resolution
  time: 0.001; rss: 90MB    complete gated feature checking
  time: 0.017; rss: 101MB   lowering ast -> hir
  time: 0.011; rss: 101MB   early lint checks
  time: 0.022; rss: 109MB   indexing hir
  time: 0.000; rss: 96MB    load query result cache
  time: 0.000; rss: 96MB    looking for entry point
  time: 0.000; rss: 96MB    looking for plugin registrar
  time: 0.001; rss: 96MB    loop checking
  time: 0.002; rss: 96MB    attribute checking
  time: 0.004; rss: 96MB    stability checking
  time: 0.012; rss: 114MB   type collecting
  time: 0.000; rss: 114MB   outlives testing
  time: 0.000; rss: 114MB   impl wf inference
  time: 0.026; rss: 122MB   coherence checking
  time: 0.000; rss: 122MB   variance testing
  time: 0.022; rss: 128MB   wf checking
  time: 0.006; rss: 128MB   item-types checking
  time: 11.912; rss: 138MB  item-bodies checking
  time: 0.033; rss: 139MB   rvalue promotion
  time: 0.016; rss: 139MB   privacy checking
  time: 0.002; rss: 139MB   intrinsic checking
  time: 8.093; rss: 146MB   match checking
  time: 0.004; rss: 146MB   liveness checking
  time: 0.261; rss: 170MB   borrow checking
  time: 0.000; rss: 170MB   MIR borrow checking
  time: 0.000; rss: 170MB   dumping chalk-like clauses
  time: 0.000; rss: 170MB   MIR effect checking
  time: 0.004; rss: 170MB   death checking
  time: 0.002; rss: 170MB   unused lib feature checking
  time: 0.012; rss: 170MB   lint checking
  time: 0.000; rss: 170MB   resolving dependency formats
    time: 0.177; rss: 171MB write metadata
      time: 0.000; rss: 171MB   collecting roots
      time: 0.002; rss: 171MB   collecting mono items
    time: 0.002; rss: 171MB monomorphization collection
    time: 0.000; rss: 171MB codegen unit partitioning
    time: 0.007; rss: 174MB codegen to LLVM IR
    time: 0.000; rss: 174MB assert dep graph
    time: 0.000; rss: 174MB serialize dep graph
  time: 0.188; rss: 174MB   codegen
    time: 0.006; rss: 153MB llvm function passes [unicode_normalization.boavajsv-cgu.0]
    time: 0.079; rss: 122MB llvm module passes [unicode_normalization.boavajsv-cgu.0]
    time: 0.114; rss: 128MB codegen passes [unicode_normalization.boavajsv-cgu.0]
  time: 0.207; rss: 127MB   LLVM passes
  time: 0.000; rss: 126MB   serialize work products
  time: 0.005; rss: 124MB   linking
    Finished release [optimized] target(s) in 21.12s

@jens1o
Copy link
Contributor Author

jens1o commented Oct 31, 2018

happy to know I'm not the only one :)

@jens1o
Copy link
Contributor Author

jens1o commented Dec 14, 2018

Is there something to do to get this triaged?

@Manishearth Manishearth added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Dec 14, 2018
@nikic nikic added I-compiletime Issue: Problems and improvements with respect to compile times. and removed I-slow Issue: Problems and improvements with respect to performance of generated code. labels Dec 15, 2018
bors added a commit that referenced this issue Jan 10, 2019
Speed up item_bodies for large match statements involving regions

These changes don't change anything about the complexity of the algorithms, but use some easy shortcuts or modifications to cut down some overhead.

The first change, which incrementally removes the constraints from the set we're iterating over probably introduces some overhead for small to medium sized constraint sets, but it's not big enough for me to observe it in any project I tested against (not that many though).

Though most other crates probably won't improve much at all, because huge matches aren't that common, the changes seemed simple enough for me to make them.

Ref #55528

cc unicode-rs/unicode-normalization#29

r? @nikomatsakis
bors added a commit that referenced this issue Jan 12, 2019
Avoid layout calculations in assert_bits to speed up match checking

assert_bits ensures that the given type matches the type of the constant
value, and additionally performs a query for the layout of the given
type to get its size. This is then used to assert that it matches the
size of the constant. But since the types are already known to be the
same, this second check is unnecessary, and skipping it also allows to
skip the expensive layout query.

For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.

Ref #55528

cc unicode-rs/unicode-normalization#29
pietroalbini added a commit to pietroalbini/rust that referenced this issue Jan 12, 2019
Speed up item_bodies for large match statements involving regions

These changes don't change anything about the complexity of the algorithms, but use some easy shortcuts or modifications to cut down some overhead.

The first change, which incrementally removes the constraints from the set we're iterating over probably introduces some overhead for small to medium sized constraint sets, but it's not big enough for me to observe it in any project I tested against (not that many though).

Though most other crates probably won't improve much at all, because huge matches aren't that common, the changes seemed simple enough for me to make them.

Ref rust-lang#55528

cc unicode-rs/unicode-normalization#29

r? @nikomatsakis
Centril added a commit to Centril/rust that referenced this issue Jan 13, 2019
Speed up item_bodies for large match statements involving regions

These changes don't change anything about the complexity of the algorithms, but use some easy shortcuts or modifications to cut down some overhead.

The first change, which incrementally removes the constraints from the set we're iterating over probably introduces some overhead for small to medium sized constraint sets, but it's not big enough for me to observe it in any project I tested against (not that many though).

Though most other crates probably won't improve much at all, because huge matches aren't that common, the changes seemed simple enough for me to make them.

Ref rust-lang#55528

cc unicode-rs/unicode-normalization#29

r? @nikomatsakis
@nnethercote
Copy link
Contributor

Note that #57718 has some detailed profile measurements.

@nnethercote
Copy link
Contributor

Good news: since January 18, when this benchmark was added to the rustc-perf suite, compilation has gotten more than 2x faster.

#64673 and #65089 are two recent changes that account for a big chunk of the improvements.

@mati865
Copy link
Contributor

mati865 commented Oct 12, 2019

On my box running build --release on unicode-rs/unicode-normalization@7c23cc9 takes less than 2 seconds (with target directory removed).

@jens1o is this still slow for you?

@nnethercote
Copy link
Contributor

#65260 reduced compile time some more, by up to 7%.

@jens1o
Copy link
Contributor Author

jens1o commented Oct 19, 2019

It is faster now, and I'm happy with it. :) Shall we close this issue?

@nnethercote
Copy link
Contributor

Shall we close this issue?

Yes!

p.s. #65480 just chopped off another 5-7% :)

@nnethercote
Copy link
Contributor

#66537 chops off another 2.4%.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.
Projects
None yet
Development

No branches or pull requests

6 participants