Performance Regression in cargo package on 1.81.0 #14955

Open
landonxjames opened this issue Dec 18, 2024 · 6 comments
Labels: C-bug (Category: bug), Command-package, Performance (Gotta go fast!), S-triage (Status: This issue is waiting on initial triage.)

Comments

@landonxjames

landonxjames commented Dec 18, 2024

Problem

We recently updated the MSRV of aws-sdk-rust to 1.81.0. This caused a 3-4x slowdown in our cargo package invocation. We traced the likely culprit to #13960, which removes an if !opts.allow_dirty check around the check_repo_state function, causing it to run on every packaging step.

This causes a problem with the aws-sdk-rust repo. Since it is very large and has a huge history, all invocations of git in that repository are slow. Because cargo package now invokes git on every packaging step in the workspace, the whole process slows to a crawl.
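
To make the change in control flow concrete, here is a minimal Rust model of the packaging loop. It is a sketch, not cargo's actual code: `PackageOpts`, `package_workspace`, the `guarded` flag, and the call counter are hypothetical stand-ins, and only the names `allow_dirty` and `check_repo_state` come from the description above. The workspace size of 400 is illustrative.

```rust
/// Hypothetical stand-in for cargo's packaging options.
struct PackageOpts {
    allow_dirty: bool,
}

/// Stand-in for `check_repo_state`; here it only counts how often it is called.
/// In the real tool this is where the (slow) git work would happen.
fn check_repo_state(calls: &mut usize) {
    *calls += 1;
}

/// Models the packaging loop and returns the number of repo checks performed.
/// `guarded == true` mimics the behavior described for cargo < 1.81
/// (the check is skipped under `--allow-dirty`);
/// `guarded == false` mimics the behavior after #13960 (the check always runs).
fn package_workspace(members: usize, opts: &PackageOpts, guarded: bool) -> usize {
    let mut calls = 0;
    for _ in 0..members {
        if !guarded || !opts.allow_dirty {
            check_repo_state(&mut calls);
        }
        // ... the package would be archived here ...
    }
    calls
}

fn main() {
    let opts = PackageOpts { allow_dirty: true };
    println!("guarded:   {} repo checks", package_workspace(400, &opts, true));
    println!("unguarded: {} repo checks", package_workspace(400, &opts, false));
}
```

In this model, the number of repo checks under `--allow-dirty` goes from zero to one per workspace member, which is why the cost scales with both the workspace size and the per-call cost of git in a large repository.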

Steps

  1. Run git clone https://github.com/awslabs/aws-sdk-rust.git to check out the aws-sdk-rust repo (this will be a bit slow, it is huge)
  2. Ensure that you are using a cargo version <1.81
  3. In the aws-sdk-rust repo, run cargo package --no-verify --allow-dirty --workspace and observe the approximate pace at which crates are packaged (you can also use the time command, but this fails locally for me with Too many open files (os error 24) before it completes)
  4. Upgrade your cargo version to 1.81
  5. Run time cargo package --no-verify --allow-dirty --workspace again (this goes too slowly for me to wait for it to fail, but the difference in speed is very easy to observe)

Possible Solution(s)

Potentially when packaging multiple crates it might be possible to only invoke git once at the beginning of the run? Or potentially add a new option to disable the behavior introduced in #13960 that always generates a .cargo_vcs_info.json?

Notes

No response

Version

$ cargo version --verbose
cargo 1.81.0 (2dbb1af80 2024-08-20)
release: 1.81.0
commit-hash: 2dbb1af80a2914475ba76827a312e29cedfa6b2f
commit-date: 2024-08-20
host: aarch64-apple-darwin
libgit2: 1.8.1 (sys:0.19.0 vendored)
libcurl: 8.7.1 (sys:0.4.73+curl-8.8.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Mac OS 14.7.1 [64-bit]
@epage
Contributor

epage commented Dec 18, 2024

Potentially when packaging multiple crates it might be possible to only invoke git once at the beginning of the run?

The dirty check specifically checks whether any files being packaged are dirty, which makes at least part of this a per-package operation. However, many parts could be skipped or moved earlier, depending on where the slowdown is.
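
One way to read that split: query the repository for dirty files once, then do the per-package part as a pure path comparison. The following is a minimal sketch under that assumption; `dirty_files_for` and the sample paths are hypothetical and are not cargo's API. It also simplifies "files being packaged" to "files under the package root", which the real check may narrow further.

```rust
use std::path::{Path, PathBuf};

/// Returns the dirty files that live under `package_root`.
/// `Path::starts_with` compares whole path components, so `sdk/s3` does not
/// match `sdk/s3control/...`.
fn dirty_files_for<'a>(package_root: &Path, dirty: &'a [PathBuf]) -> Vec<&'a Path> {
    dirty
        .iter()
        .map(PathBuf::as_path)
        .filter(|file| file.starts_with(package_root))
        .collect()
}

fn main() {
    // Assumed to come from a single repository-wide status query.
    let dirty = vec![
        PathBuf::from("sdk/s3/src/lib.rs"),
        PathBuf::from("sdk/s3control/Cargo.toml"),
    ];
    for root in ["sdk/s3", "sdk/s3control", "sdk/ec2"] {
        let hits = dirty_files_for(Path::new(root), &dirty);
        println!("{root}: {} dirty file(s)", hits.len());
    }
}
```

With this shape, the expensive repository query happens once per run, and only the cheap in-memory filtering stays inside the package loop.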

@weihanglo
Member

#13960 is the culprit of this regression. We do git ls-files for every package unconditionally after that PR. When -Zpackage-workspace is stabilized, the issue will be more prominent.

@weihanglo
Member

I'm going to refactor this to move it out of the package loop.

@rustbot claims

@weihanglo weihanglo self-assigned this Dec 18, 2024
github-merge-queue bot pushed a commit that referenced this issue Dec 19, 2024
### What does this PR try to resolve?

This helps debug <#14955>.

### How should we test and review this PR?

While `check_repo_state` is the culprit, let's add some traces for
future.

### Additional information
@weihanglo
Member

Did some profiling. Detailed report in #14962.

To summarize, we were running git status and comparing dirty files against each package to be published. aws-sdk-rust has 400+ members, so this dirty comparison was repeated 400+ times.

Here is the profiling data: trace.tar.gz

On 59b2ddd (trace-offline.json): [Image: Before]

With #14962 (trace-offline-pathspec.json): [Image: After]

@weihanglo
Member

If we skipped the entire git status check (the --allow-dirty behavior prior to #13960), the profiling data would look like this:

Image

@weihanglo
Member

#14962 was closed because we found more bugs in the cargo package VCS dirtiness check, as shown in #14967. If #14962 had been merged, those bugs would be harder to fix.
