Rollup of 19 pull requests #50608

Closed
wants to merge 52 commits into from

alexcrichton
Member

Successful merges:

Failed merges:

rizakrko and others added 30 commits April 22, 2018 20:38
also make a drive-by typo fix
…ducts, r=michaelwoerister

Make DepGraph::previous_work_products immutable

Fixes rust-lang#50501

r? @michaelwoerister
… r=Zoxc

Don't use Lock for heavily accessed CrateMetadata::cnum_map.

The `cnum_map` in `CrateMetadata` is used for two things:
1. to map `CrateNums` between crates (used a lot during decoding)
2. to construct the (reverse) post order of the crate graph

For the second case, we need to modify the map after the fact, which is why the map is wrapped in a `Lock`. This is bad for the first case, which does not need the modification and does lots of small reads from the map.

This PR splits case (2) out into a separate `dependencies` field. This allows the `cnum_map` to be immutable and shifts the interior mutability to a less busy data structure.
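
The split described above can be sketched roughly as follows. This is a minimal illustration and not the actual rustc code: the `CrateMetadataSketch` type, the plain `u32` crate numbers, and the use of `std::sync::Mutex` in place of rustc's `Lock` are all assumptions made for the example.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for `CrateMetadata`; crate numbers are plain `u32`s here.
struct CrateMetadataSketch {
    /// Read-heavy mapping, fixed at construction, so reads need no lock.
    cnum_map: Vec<u32>,
    /// Rarely touched list that is still filled in after the fact.
    dependencies: Mutex<Vec<u32>>,
}

impl CrateMetadataSketch {
    fn new(cnum_map: Vec<u32>) -> Self {
        CrateMetadataSketch {
            cnum_map,
            dependencies: Mutex::new(Vec::new()),
        }
    }

    /// Hot path: a plain indexed read without any locking.
    fn map_cnum(&self, foreign: usize) -> u32 {
        self.cnum_map[foreign]
    }

    /// Cold path: the only place that still needs interior mutability.
    fn add_dependency(&self, cnum: u32) {
        self.dependencies.lock().unwrap().push(cnum);
    }

    fn dependencies(&self) -> Vec<u32> {
        self.dependencies.lock().unwrap().clone()
    }
}
```

The point of the design is that the frequent reads in `map_cnum` no longer pay for a lock that only the dependency list needs.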

Fixes rust-lang#50502

r? @Zoxc
Make CrateNum allocation more thread-safe.

This PR makes sure that we can't have race conditions when assigning CrateNums. It's a slight improvement but a larger refactoring of the CrateStore/CrateLoader infrastructure would be good, I think.
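
One common way to rule out such races is to hand out numbers from an atomic counter. The sketch below only illustrates that general idea; it is not how the PR implements it, and `CrateNumAllocator`, the `u32` representation, and the starting value of 1 are assumptions for the example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical allocator; this is not the `CrateStore` API.
struct CrateNumAllocator {
    next: AtomicU32,
}

impl CrateNumAllocator {
    fn new() -> Self {
        // Assumption of this sketch: 0 is reserved, so allocation starts at 1.
        CrateNumAllocator {
            next: AtomicU32::new(1),
        }
    }

    /// Atomically returns the next number; two threads never observe the same value.
    fn alloc(&self) -> u32 {
        self.next.fetch_add(1, Ordering::SeqCst)
    }
}
```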

r? @Zoxc
…petrochenkov

Inline `Span` methods.

Because they are simple and hot.

This change speeds up some incremental runs of a few rustc-perf
benchmarks, the best by 3%.

Here are the ones with a speedup of at least 1%:
```
coercions
        avg: -1.1%      min: -3.4%      max: -0.2%
html5ever-opt
        avg: -0.8%      min: -1.7%      max: -0.2%
clap-rs-check
        avg: -0.3%      min: -1.4%      max: 0.7%
html5ever
        avg: -0.7%      min: -1.2%      max: -0.4%
html5ever-check
        avg: -0.9%      min: -1.1%      max: -0.8%
clap-rs
        avg: -0.4%      min: -1.1%      max: -0.1%
crates.io-check
        avg: -0.8%      min: -1.0%      max: -0.6%
serde-opt
        avg: -0.6%      min: -1.0%      max: -0.3%
```
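
For readers unfamiliar with the attribute, the kind of change involved looks roughly like the sketch below. `SpanSketch` is a hypothetical, heavily simplified type and not the real `Span`; the example only shows `#[inline]` on small accessor methods, which makes them candidates for inlining across crate boundaries.

```rust
/// Hypothetical, simplified span; the real `Span` in rustc is more involved.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SpanSketch {
    lo: u32,
    hi: u32,
}

impl SpanSketch {
    /// Start position of the span.
    #[inline]
    fn lo(self) -> u32 {
        self.lo
    }

    /// End position of the span.
    #[inline]
    fn hi(self) -> u32 {
        self.hi
    }

    /// Returns a copy of the span with a new start position.
    #[inline]
    fn with_lo(self, lo: u32) -> SpanSketch {
        SpanSketch { lo, hi: self.hi }
    }
}
```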
…elwoerister

Use SmallVec for DepNodeIndex within dep_graph.

This avoids a decent number of allocations, enough to speed up
incremental runs of many rustc-benchmarks, the best by 2%.

Here are the rustc-perf benchmarks that showed an improvement of at least 1% on one run:
```
unused-warnings-check
	avg: -1.7%	min: -2.4%	max: 0.0%
unused-warnings-opt
	avg: -1.4%	min: -2.0%	max: 0.0%
unused-warnings
	avg: -1.4%	min: -2.0%	max: -0.0%
tokio-webpush-simple-check
	avg: -1.0%	min: -1.7%	max: 0.0%
futures-opt
	avg: -0.9%	min: -1.6%	max: 0.0%
encoding
	avg: -1.2%	min: -1.6%	max: -0.6%
encoding-check
	avg: -0.9%	min: -1.6%	max: 0.0%
encoding-opt
	avg: -0.8%	min: -1.6%	max: -0.1%
futures
	avg: -0.9%	min: -1.5%	max: 0.0%
futures-check
	avg: -0.9%	min: -1.5%	max: 0.1%
regression-31157-check
	avg: -0.9%	min: -1.5%	max: 0.0%
regex
	avg: -0.6%	min: -1.4%	max: 0.0%
regression-31157-opt
	avg: -0.5%	min: -1.4%	max: 0.1%
regression-31157
	avg: -0.7%	min: -1.4%	max: 0.2%
regex-opt
	avg: -0.6%	min: -1.4%	max: 0.1%
hyper-check
	avg: -0.8%	min: -1.4%	max: -0.1%
regex-check
	avg: -1.0%	min: -1.4%	max: 0.0%
hyper-opt
	avg: -0.7%	min: -1.4%	max: -0.1%
hyper
	avg: -0.7%	min: -1.3%	max: 0.1%
piston-image-opt
	avg: -0.4%	min: -1.3%	max: 0.0%
tokio-webpush-simple-opt
	avg: -0.3%	min: -1.3%	max: 0.0%
piston-image-check
	avg: -0.5%	min: -1.3%	max: -0.0%
syn-opt
	avg: -0.5%	min: -1.3%	max: 0.0%
clap-rs-check
	avg: -0.3%	min: -1.3%	max: 0.2%
piston-image
	avg: -0.5%	min: -1.2%	max: 0.1%
syn
	avg: -0.5%	min: -1.2%	max: 0.1%
syn-check
	avg: -0.6%	min: -1.2%	max: -0.1%
issue-46449-opt
	avg: -0.4%	min: -1.2%	max: -0.1%
parser-check
	avg: -0.7%	min: -1.2%	max: 0.1%
issue-46449
	avg: -0.5%	min: -1.2%	max: -0.0%
```
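
To illustrate why a small-vector type saves allocations, here is a hand-rolled stand-in. It is not the `SmallVec` type used in the PR: `SmallList`, the inline capacity of 8, and the `u32` element type are assumptions chosen for the sketch. Short lists stay inline and only longer ones spill to the heap.

```rust
/// Hypothetical small-vector: up to 8 `u32`s are stored inline, then it spills to the heap.
enum SmallList {
    Inline { len: usize, items: [u32; 8] },
    Heap(Vec<u32>),
}

impl SmallList {
    fn new() -> Self {
        SmallList::Inline { len: 0, items: [0; 8] }
    }

    fn push(&mut self, value: u32) {
        if let SmallList::Inline { len, items } = self {
            if *len < items.len() {
                // Common case: room left inline, so no allocation happens.
                items[*len] = value;
                *len += 1;
                return;
            }
            // Inline storage is full: move everything to a heap-allocated Vec.
            let mut spilled = items[..*len].to_vec();
            spilled.push(value);
            *self = SmallList::Heap(spilled);
            return;
        }
        if let SmallList::Heap(v) = self {
            v.push(value);
        }
    }

    fn as_slice(&self) -> &[u32] {
        match self {
            SmallList::Inline { len, items } => &items[..*len],
            SmallList::Heap(v) => v.as_slice(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SmallList::Inline { .. })
    }
}
```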
…r=alexcrichton

Allow for specifying a linker plugin for cross-language LTO

This PR makes the `-Zcross-lang-lto` flag optionally take the path to the `LLVMgold.so` linker plugin. If this path is specified, `rustc` will invoke the linker with the correct arguments (i.e. `-plugin` and various `-plugin-opt`s).

This can be used to ergonomically enable cross-language LTO for Rust programs with C/C++ dependencies:
```
clang -O2 test.c -otest.o -c -flto=thin
llvm-ar -rv libxxx.a test.o
rustc -L. main.rs -Zcross-lang-lto=/usr/lib64/LLVMgold.so -O -Clink-arg=-fuse-ld=gold
```

- Note that in theory this should work with Gold, LLD, and newer versions of binutils' LD but on my current system I could only get it to work with Gold.
- Also note that this will work best if the Clang version and Rust's LLVM version are close enough. Clang 6.0 works well with the current nightly.

r? @alexcrichton
Don't require clippy/miri for beta

r? @kennytm

cc @alexcrichton

I'm trying this out locally atm to see if it works as I think it should. Not sure how to test it for real except wait for the next beta.

fixes rust-lang#50557
…SimonSapin

add fn `into_inner(self) -> (Idx, Idx)` to RangeInclusive (rust-lang#49022)

Adds `into_inner(self) -> (Idx, Idx)` to `RangeInclusive`.
See rust-lang#49022 (comment).
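
A short usage sketch of the new method follows. The `span_len` helper is hypothetical and only there to show the owned bounds being destructured.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: number of values in an inclusive range, computed from its owned bounds.
fn span_len(range: RangeInclusive<u32>) -> u64 {
    // `into_inner` consumes the range and returns `(start, end)`.
    let (start, end) = range.into_inner();
    if start > end {
        0
    } else {
        u64::from(end - start) + 1
    }
}
```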
…fackler

std: Avoid `ptr::copy` if unnecessary in `vec::Drain`

This commit is spawned out of a performance regression investigation in rust-lang#50496.
In tracking down this regression it turned out that the `expand_statements`
function in the compiler was taking quite a long time. Further investigation
showed two key properties:

* The function was "fast" on glibc 2.24 and slow on glibc 2.23
* The hottest function was memmove from glibc

Combined together it looked like glibc gained an optimization to the memmove
function in 2.24. Ideally we don't want to rely on this optimization, so I
wanted to dig further to see what was happening.

The hottest part of `expand_statements` was `Drop for Drain` in the call to
`splice` where we insert new statements into the original vector. This *should*
be a cheap operation because we're draining and replacing iterators of the exact
same length, but under the hood memmove was being called a lot, causing a
slowdown on glibc 2.23.
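
For reference, an equal-length `splice` of the kind described here looks roughly like this. It is a minimal sketch, not the code in `expand_statements`; `replace_in_place` and the `i32` element type are assumptions for the example.

```rust
/// Hypothetical helper: replaces `replacement.len()` elements starting at `start`
/// and returns the elements that were removed.
fn replace_in_place(v: &mut Vec<i32>, start: usize, replacement: Vec<i32>) -> Vec<i32> {
    let end = start + replacement.len();
    // The drained range and the replacement have the same length,
    // so the tail of `v` does not need to change position.
    let removed: Vec<i32> = v.splice(start..end, replacement).collect();
    removed
}
```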

It turns out that at least one of the optimizations in glibc 2.24 was that
`memmove` where the src/dst are equal becomes much faster. [This program][prog]
executes in ~2.5s against glibc 2.23 and ~0.3s against glibc 2.24, exhibiting
how glibc 2.24 is optimizing `memmove` if the src/dst are equal.

And all that brings us to what this commit itself is doing. The change here is
purely to `Drop for Drain` to avoid the call to `ptr::copy` if the region being
copied doesn't actually need to be copied. For normal usage of just `Drain`
itself this check isn't really necessary, but because `Splice` internally
contains `Drain` this provides a nice speed boost on glibc 2.23. Overall this
should fix the regression seen in rust-lang#50496 on glibc 2.23 and also fix the
regression on Windows where `memmove` looks to not have this optimization.

Note that the way `splice` was called in `expand_statements` would cause a
quadratic number of elements to be copied via `memmove` which is likely why the
tuple-stress benchmark showed such a severe regression.
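
The shape of the check can be sketched as below. This is not the actual `Drop for Drain` code: `move_tail` is a hypothetical helper over a slice of `Copy` elements, and the extra `len == 0` shortcut is an assumption of the sketch. It only shows the idea of skipping `ptr::copy` when source and destination coincide.

```rust
use std::ptr;

/// Hypothetical helper, not the actual `Drop for Drain` code.
/// Moves `len` elements from index `src` to index `dst` within `buf` and
/// returns `false` when nothing had to be copied.
fn move_tail<T: Copy>(buf: &mut [T], src: usize, dst: usize, len: usize) -> bool {
    assert!(src.checked_add(len).map_or(false, |end| end <= buf.len()));
    assert!(dst.checked_add(len).map_or(false, |end| end <= buf.len()));
    if src == dst || len == 0 {
        // The tail is already in place: skip the memmove entirely.
        return false;
    }
    unsafe {
        // Both ranges were bounds-checked above; `ptr::copy` allows overlap.
        let base = buf.as_mut_ptr();
        ptr::copy(base.add(src), base.add(dst), len);
    }
    true
}
```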

Closes rust-lang#50496

[prog]: https://gist.github.com/alexcrichton/c05bc51c6771bba5ae5b57561a6c1cd3
Restore RawVec::reserve* documentation

When the RawVec::try_reserve* methods were added, they took the place of
the ::reserve* methods in the source file, and new ::reserve* methods
wrapping the new try_reserve* methods were created. But the
documentation didn't move along, such that:
 - reserve_* methods are barely documented.
 - try_reserve_* methods have unmodified documentation from reserve_*,
   such that their documentation indicates that they panic or abort.

This moves the documentation back to the right methods, with a
placeholder documentation for the try_reserve* methods.
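
The behavioral difference the documentation needs to capture can be shown with the public `Vec` counterparts, used here as a stand-in because `RawVec` is internal. `grow_checked` is a hypothetical helper: `reserve` panics or aborts on failure, while `try_reserve` reports it as an `Err`.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: grows `v` fallibly and returns the new capacity on success.
fn grow_checked(v: &mut Vec<u8>, additional: usize) -> Result<usize, TryReserveError> {
    // Unlike `reserve`, this returns an error instead of panicking or aborting.
    v.try_reserve(additional)?;
    Ok(v.capacity())
}
```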
…ichaelwoerister

Remove unnecessary mutable borrow and resizing in DepGraph::serialize

I might be mistaken, but I noticed this while working in this file on something else. The mutable borrow appears to be unnecessary, and since taking it involves locking, it should be removed. The resizing also looks redundant, since nothing is added to the fingerprints in this function, so it can be removed as well.
…richton

Retry when downloading the Docker cache.

As a safety measure, prevent spuriously needing to rebuild the docker image in case the network was reset while downloading.

Also, adjusted the retry function to insert a sleep between retries, because retrying immediately will often just hit the same issue.
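
The retry-with-sleep idea is sketched below in Rust for illustration only. The PR itself adjusts an existing retry function in the CI tooling, and `retry`, its parameters, and the fixed delay are assumptions of this example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: runs `op` up to `max_attempts` times, sleeping between failures.
fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Wait before retrying, since an immediate retry often hits the same issue.
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```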
@rust-highfive
Collaborator

r? @cramertj

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive
Collaborator

Warning

  • These commits modify submodules.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 10, 2018
@alexcrichton
Member Author

@bors: r+ p=10

@bors
Contributor

bors commented May 10, 2018

📌 Commit bc516e1 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 10, 2018
@bors
Contributor

bors commented May 10, 2018

🔒 Merge conflict

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 10, 2018
@alexcrichton alexcrichton deleted the rollup branch May 10, 2018 16:34
@bors
Contributor

bors commented May 10, 2018

☔ The latest upstream changes (presumably #50395) made this pull request unmergeable. Please resolve the merge conflicts.

@Centril Centril added the rollup A PR which is a rollup label Oct 24, 2019
Labels
rollup A PR which is a rollup S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.