Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend refactorings and perf improvements #6867

Merged
merged 10 commits into from
Apr 25, 2019

Conversation

alexcrichton
Copy link
Member

This PR is extracted from #6864 and then also additionally adds some commits related to performance optimizations that I noticed while profiling #6864. Each commit is in theory standalone and should pass all the tests, as well as being descriptive about what it's doing.

I... don't think this needed any more! Doing some digging looks like
this was originally added in 79768eb. That was so early it didn't even
use a PR and was almost 5 years ago. Since then we've had a huge number
of changes to the backend and `Unit` nowadays does all the deduplication
we need, so no need to store a `Vec` here and we can just have a mapping
of `Key` to `Job` and that's it.
This looks like it was a bug ever present from the original
implementation of a GNU jobserver in rust-lang#4110, but we currently
unconditionally request a token is allocated for any job we pull off our
job queue. Rather we only need to request tokens for everything but the
first job because we already have an implicit token for that job.
Take a `&Context` argument instead of a few component arguments, avoids
passing around some extraneous information.
This isn't actually necessary with a bit of refactoring, and in general
it makes management of `JobQueue` simpler since there's one less type to
worry about. The one main usage of `Key` vs `Unit` was that `Key` was
`Send` to be placed in `Message`, but this was replaced by just
assigning each `Unit` an ever-increasing integer, and then `Finished`
contains the integer rather than the `Key` itself which we can map once
we get back to the main thread.
The code lying around in `dep_targets` is pretty old at this point, but
given the current iteraiton there's no need to re-sort on each call to
`dep_targets`, only initially during construction!
We only need to parse this information once, so calculate it when a
`TargetInfo` is created and then just reuse that from then on out.
This commit starts to intern `Unit` structures for a few reasons:

* This primarily makes equality and hashing much faster. We have tons of
  hash lookups with units, and they were showing up quite high
  in profiles. It turns out `Unit` hashes a *ton* of data, and most of
  it is always redundant. To handle this they're all only hashed once
  now and hashing/equality are just pointer checks.

* The size of `Unit` is now drastically reduced to just one pointer, so
  movement of units throughout the backend should be much more
  efficient.
@rust-highfive
Copy link

r? @ehuss

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 22, 2019
@alexcrichton
Copy link
Member Author

I should mention that the overall performance wins of this were pretty modest, I was seeing ~10% with an average of 180-190ms for a null build of Cargo itself, whereas afterwards it's 170-180ms but the variances are quite high.

This commit moves it from a linear query which does a lot of hashing of
`PathBuf` to an `O(1)` query that only hashes `PackageId`. This improves
Servo's null build performance by 50ms or so, causing a number of
functions to disappear from profiles.
This commit moves a linear scan which happens once-per-each-dependency
to an O(1) lookup which happens only once for each package. This removes
another 30ms or so from a null build in Servo.
@alexcrichton
Copy link
Member Author

I should also note that the lion's share of a possible improvement to Cargo null build time will likely be #6866 as that's >50% of a null build from my measurements. While we can't get it down to zero probably we can make it way faster in theory.

Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I've been concerned about how big Unit is, and I'm a little surprised it doesn't help more, but I guess computers are fast at copying things.

r=me with the comment.

fn intern_inner(&'a self, item: &UnitInner<'a>) -> &'a UnitInner<'a> {
let mut me = self.state.borrow_mut();
if let Some(item) = me.cache.get(item) {
return unsafe { &*(&**item as *const UnitInner<'a>) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what this is doing, but can you add a comment explaining it, along with why it is safe?

Is this correct: Starting with &Box<UnitInner> it pulls UnitInner out of the box, casts to a pointer to force the borrow checker to forget about the borrow, then converts the pointer back to &UnitInner?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct yeah, and I'll leave some comments, an excellent point!

@alexcrichton
Copy link
Member Author

@bors: r=ehuss

@bors
Copy link
Collaborator

bors commented Apr 23, 2019

📌 Commit 32269f4 has been approved by ehuss

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2019
@bors
Copy link
Collaborator

bors commented Apr 23, 2019

⌛ Testing commit 32269f4 with merge b13ed92aaada88b7053fdcc0d871251ff3b34a55...

@bors
Copy link
Collaborator

bors commented Apr 23, 2019

💔 Test failed - checks-travis

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 23, 2019
@ehuss
Copy link
Contributor

ehuss commented Apr 23, 2019

Hmm, something looks wrong with rustup, it's failing for me elsewhere.

@alexcrichton
Copy link
Member Author

@bors: retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 25, 2019
@bors
Copy link
Collaborator

bors commented Apr 25, 2019

⌛ Testing commit 32269f4 with merge 6562942...

bors added a commit that referenced this pull request Apr 25, 2019
Backend refactorings and perf improvements

This PR is extracted from #6864 and then also additionally adds some commits related to performance optimizations that I noticed while profiling #6864. Each commit is in theory standalone and should pass all the tests, as well as being descriptive about what it's doing.
@bors
Copy link
Collaborator

bors commented Apr 25, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: ehuss
Pushing 6562942 to master...

@bors bors merged commit 32269f4 into rust-lang:master Apr 25, 2019
bors added a commit that referenced this pull request Apr 30, 2019
Add explicit lifetimes to Executor::init.

Since the unit interner was added in #6867, I am getting a strange borrow check error compiling RLS with the latest Cargo (that is, linking rls with the new cargo).  Adding explicit lifetimes to the `Executor::init` function seems to fix the problem.  I don't 100% understand the error, because none of the relevant function or type signatures changed in that PR.  The only thing that gives me a clue is [this change](https://github.com/rust-lang/cargo/pull/6867/files#diff-0bb62cb712c0811a17d7a726e068bf65L112) to `BuildPlan::add` which does something similar, but that is not directly called by RLS.

The error looks like this:
```
error[E0623]: lifetime mismatch
   --> rls/src/build/cargo_plan.rs:149:24
    |
126 |         unit: &Unit<'_>,
    |                -------- these two types are declared with different lifetimes...
127 |         cx: &Context<'_, '_>,
    |              ---------------
...
149 |         let units = cx.dep_targets(unit);
    |                        ^^^^^^^^^^^ ...but data from `unit` flows into `cx` here
```

I generally don't like making changes if I don't understand them, but I'm a little stumped here how this is happening even though everything is defined with the same (`'a`) lifetime.

This unblocks updating RLS to the latest Cargo.
@alexcrichton alexcrichton deleted the improvements branch May 6, 2019 18:12
@ehuss ehuss added this to the 1.36.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants