Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stable hash wip #63

Closed
wants to merge 3 commits into from
Closed

Stable hash wip #63

wants to merge 3 commits into from

Conversation

weihanglo
Copy link
Owner

No description provided.

This helps `-Ztrim-paths` build a stable cross-platform path for the
registry and git sources. Sources files then can be found from the same
path when debugging.

See rust-lang#13171 (comment)

A few caveats:

* This will invalidate the current downloaded caches.
  Need to put this in the Cargo CHANGELOG.
* As a consequence of changing how `SourceId` is hashed, the global cache
  tracker is also affected because Cargo writes source identifiers (e.g.
  `index.crates.io-6f17d22bba15001f`) to SQLite.
  * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/core/global_cache_tracker.rs#L388-L391
* The performance of rustc-stable-hash is slightly worse than the old
  SipHasher in std on short things like `SourceId`, but for long stuff
  like fingerprint. See appendix.

StableHasher is used in several places (some might not be needed?):

* Rebuild detection (fingerprints)
  * Rustc version, including all the CLI args running `rustc -vV`.
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/util/rustc.rs#L326
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/util/rustc.rs#L381
  * Build caches
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/core/compiler/fingerprint/mod.rs#L1456
* Compute rustc `-C metadata`
  * stable hash for SourceId
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/core/package_id.rs#L207
  * Also read and hash contents from custom target JSON file.
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/core/compiler/compile_kind.rs#L81-L91
* `UnitInner::dep_hash`
  * This is to distinguish same units having different features set between normal and build dependencies.
    * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/ops/cargo_compile/mod.rs#L627
* Hash file contents for `cargo package` to verify if files were modified before and after the build.
  * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/ops/cargo_package.rs#L999
* Rusc diagnostics deduplication
  * https://github.com/rust-lang/cargo/blob/6e236509b2331eef64df844b7bbc8ed352294107/src/cargo/core/compiler/job_queue/mod.rs#L311
* Places using `SourceId` identifier like `registry/src` path,
  and `-Zscript` target directories.

Appendix
--------

Benchmark on x86_64-unknown-linux-gnu

```
bench_hasher/RustcStableHasher/URL
                        time:   [33.843 ps 33.844 ps 33.845 ps]
                        change: [-0.0167% -0.0049% +0.0072%] (p = 0.44 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) low severe
  3 (3.00%) high mild
  2 (2.00%) high severe
bench_hasher/SipHasher/URL
                        time:   [18.954 ns 18.954 ns 18.955 ns]
                        change: [-0.1281% -0.0951% -0.0644%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) low severe
  4 (4.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe
bench_hasher/RustcStableHasher/lorem ipsum
                        time:   [659.18 ns 659.20 ns 659.22 ns]
                        change: [-0.0192% -0.0062% +0.0068%] (p = 0.34 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) low severe
  3 (3.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
bench_hasher/SipHasher/lorem ipsum
                        time:   [1.2006 µs 1.2008 µs 1.2010 µs]
                        change: [+0.0117% +0.0467% +0.0808%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
```
@weihanglo weihanglo closed this Jul 9, 2024
@weihanglo weihanglo deleted the stable-hash-wip branch July 9, 2024 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant