Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a cache for discovered workspace roots #10776

Merged
merged 2 commits into from
Jul 5, 2022

Conversation

Muscraft
Copy link
Member

History

@ehuss noticed that workspace inheritance caused a significant increase in startup times when using workspace inheritance. This brought up the creation of #10747.

When using a similar test setup to the original I got

Benchmark 1: cd rust; ../../../target/release/cargo metadata
  Time (mean ± σ):     149.4 ms ±   3.8 ms    [User: 105.9 ms, System: 31.7 ms]
  Range (min … max):   144.2 ms … 162.2 ms    19 runs
 
Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata
  Time (mean ± σ):     191.6 ms ±   1.4 ms    [User: 145.9 ms, System: 34.2 ms]
  Range (min … max):   188.8 ms … 193.9 ms    15 runs

This showed a large increase in time per cargo command when using workspace inheritance.

During the investigation of this issue, other performance concerns were found and addressed. This resulted in a drop in time across the board but heavily favored workspace inheritance.

Benchmark 1: cd rust; ../../../target/release/cargo metadata
  Time (mean ± σ):     139.3 ms ±   1.7 ms    [User: 99.8 ms, System: 29.4 ms]
  Range (min … max):   137.1 ms … 144.5 ms    20 runs
 
Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata
  Time (mean ± σ):     161.7 ms ±   1.4 ms    [User: 120.4 ms, System: 31.2 ms]
  Range (min … max):   158.0 ms … 164.6 ms    18 runs

Performance after changes

hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40

Benchmark 1: cd rust; ../../../target/release/cargo metadata
  Time (mean ± σ):     140.1 ms ±   1.5 ms    [User: 99.5 ms, System: 30.7 ms]
  Range (min … max):   137.4 ms … 144.0 ms    40 runs
 
Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata
  Time (mean ± σ):     141.8 ms ±   1.6 ms    [User: 100.9 ms, System: 30.9 ms]
  Range (min … max):   138.4 ms … 145.4 ms    40 runs

New Benchmark
cargo bench -- workspace_initialization/rust

workspace_initialization/rust
    time:   [14.779 ms 14.880 ms 14.997 ms]
workspace_initialization/rust-ws-inherit
    time:   [16.235 ms 16.293 ms 16.359 ms]

Changes Made

  • Pulled a commit from @ehuss that deduplicated finding a workspace root to make the changes easier
  • Added a cache in Config to hold found WorkspaceRootConfigs
    • This makes it so manifests should only be parsed once
  • Made WorkspaceRootConfig get added to the cache when parsing a manifest

Testing Steps

To check the new benchmark:

  1. cd benches/benchsuite
  2. cargo bench -- workspace_initialization/rust

Using hyperfine:

  1. run cargo build --release
  2. extract rust and rust-ws-inherit in benches/workspaces
  3. cd benches/workspaces
  4. Prime the target directory with a cache of rustc info. In rust and rust-ws-inherit, run: cargo +nightly c -p linkchecker. Otherwise it would be measuring rustc overhead.
  5. run hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40

closes #10747

This centralizes the workspace discovery code into one location
`find_workspace_root_with_loader` so that it isn't duplicated.
@rust-highfive
Copy link

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 20, 2022
Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me. Just a few questions:

  • Do we need to document how the result would be cached, as it affects some public API?
  • Is there a chance that there is a long-running cargo API consumer getting outdated caches? Should be really rare edge case I guess.

src/cargo/util/toml/mod.rs Outdated Show resolved Hide resolved
@Muscraft
Copy link
Member Author

Muscraft commented Jul 5, 2022

- Do we need to document how the result would be cached, as it affects some public API?

I'm not sure that this affects any public API to my knowledge. It should just make it so we don't parse a workspace root twice if it has not been loaded yet into a list of workspace packages or when we do not have access to the list of workspace packages.

- Is there a chance that there is a long-running cargo API consumer getting outdated caches? Should be really rare edge case I guess.

This shouldn't be a problem unless there is an edit to a workspace root manifest while cargo is running. If there is an edit while cargo is running, I think it is best to have the edit appear when cargo is run next either way.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless there is an edit to a workspace root manifest while cargo is running

That's exact what I meant. Anyway, I am going to merge this. The benchmark result on my machine is as great as yours. And we can always add docs later if needed.

@weihanglo
Copy link
Member

@bors r+

Thanks for the PR!

@bors
Copy link
Contributor

bors commented Jul 5, 2022

📌 Commit fd8bcf9 has been approved by weihanglo

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 5, 2022
@bors
Copy link
Contributor

bors commented Jul 5, 2022

⌛ Testing commit fd8bcf9 with merge ca190ac...

@bors
Copy link
Contributor

bors commented Jul 5, 2022

☀️ Test successful - checks-actions
Approved by: weihanglo
Pushing ca190ac to master...

@bors bors merged commit ca190ac into rust-lang:master Jul 5, 2022
@Muscraft Muscraft deleted the cache-workspace-discovery branch July 6, 2022 19:21
weihanglo added a commit to weihanglo/rust that referenced this pull request Jul 10, 2022
…79e13a071ad4b17435bd6bfa4c

2022-07-03 13:41:11 +0000 to 2022-07-09 14:48:50 +0000

- Mention `[patch]` config in "Overriding Dependencies" (rust-lang/cargo#10836)
- Update terminal-width flag. (rust-lang/cargo#10833)
- Refactor check_yanked to avoid some duplication (rust-lang/cargo#10835)
- Fix publishing to crates.io with -Z sparse-registry (rust-lang/cargo#10831)
- Make `is_yanked` return `Poll<>` (rust-lang/cargo#10830)
- Fix corrupted git checkout recovery. (rust-lang/cargo#10829)
- add a cache for discovered workspace roots (rust-lang/cargo#10776)
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 10, 2022
Update cargo

7 commits in c0bbd42ce5e83fe2a93e817c3f9b955492d3130a..b1dd22e668af5279e13a071ad4b17435bd6bfa4c
2022-07-03 13:41:11 +0000 to 2022-07-09 14:48:50 +0000

- Mention `[patch]` config in "Overriding Dependencies" (rust-lang/cargo#10836)
- Update terminal-width flag. (rust-lang/cargo#10833)
- Refactor check_yanked to avoid some duplication (rust-lang/cargo#10835)
- Fix publishing to crates.io with -Z sparse-registry (rust-lang/cargo#10831)
- Make `is_yanked` return `Poll<>` (rust-lang/cargo#10830)
- Fix corrupted git checkout recovery. (rust-lang/cargo#10829)
- add a cache for discovered workspace roots (rust-lang/cargo#10776)
bors added a commit that referenced this pull request Jul 13, 2022
Fix nested workspace resolution

This fixes a bug that was introduced in #10776 with nested workspaces.

As an example, say we have two workspaces:

`/code/example/Cargo.toml` and `/code/example/sub/Cargo.toml`, and a crate within the `sub` workspace `/code/example/sub/test-crate/Cargo.toml`.

Since the `ws_roots` is a HashMap with randomized ordering, this code will _sometimes_ cause the workspace at `/code/example/Cargo.toml` to be discovered and used _before_ `/code/example/sub/Cargo.toml`,

https://github.com/rust-lang/cargo/blob/b1dd22e668af5279e13a071ad4b17435bd6bfa4c/src/cargo/core/workspace.rs#L1704-L1710

This will then cause the `validate_members` method to fail as the member thinks it is a member of a different workspace than it should be.

https://github.com/rust-lang/cargo/blob/b1dd22e668af5279e13a071ad4b17435bd6bfa4c/src/cargo/core/workspace.rs#L874-L891

This change just makes it so that the input manifest path is walked up to find the (presumably) most appropriate workspace so that the ordering of the `HashMap` doesn't matter.

If you encounter this bug by running cargo nightly, you can workaround it by adding the crate(s) to the `excluded` field in the workspace they don't belong to.
bors added a commit that referenced this pull request Jul 14, 2022
Add a test for regressions in selecting the correct workspace root

This adds a test to check for regressions in selecting the correct workspace when there are nested workspaces.

#10846 solved a problem with nested workspace resolution that was caused by #10776. `@ehuss` [suggested](#10846 (comment)) that a test should be added to ensure that this issue does not pop up again.

I ensured that this worked by testing against commit before #10846. Sporadically I would get an error that was the same as described in #10846.
```
error: package `{path}/cargo/target/tmp/cit/t0/foo/sub/foo/Cargo.toml` is a member of the wrong workspace
expected: {path}/cargo/target/tmp/cit/t0/foo/sub/Cargo.toml
actual:   {path}/cargo/target/tmp/cit/t0/foo/Cargo.toml
```
I then tested it on the commit with the fix and the test passed every time.

---

While this does add a test to catch any regression I am worried that it will not catch it every time.  It was noted in #10846 that this error would sometimes happen but not every time, in my testing I found this to be true as well. Since this is caused by the `HashMap` order changing each run, switching to something ordered like `BTreeMap` **_should_** catch any regressions every run (if the implementation were to ever change). I'm not sure if this is necessary so I figured I would note the concern here.
@ehuss ehuss added this to the 1.64.0 milestone Jul 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

workspace inheritiance slowing down cargo startup times
5 participants