Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let a portion of DefPathHash uniquely identify the DefPath's crate. #81635

Merged
merged 5 commits into from
Mar 8, 2021

Conversation

michaelwoerister
Copy link
Member

This allows to directly map from a DefPathHash to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with DefPathHash instead of DefId a lot.

It also allows to reliably and cheaply check for DefPathHash collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.

The following new piece of documentation describes the most interesting aspects of the changes:

/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said then done if the crates in questions are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.

Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?

r? @pnkfelix (feel free to re-assign)

@michaelwoerister michaelwoerister added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-diagnostics Area: Messages for errors, warnings, and lints A-incr-comp Area: Incremental compilation A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) labels Feb 1, 2021
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 1, 2021
@rust-log-analyzer

This comment has been minimized.

@michaelwoerister
Copy link
Member Author

I'll need to take a closer look to see if those changed crate-disambiguators are an expected outcome of this PR.

@michaelwoerister
Copy link
Member Author

The first test failure is expected. Error messages in that lint are sorted by DefPathHash, so it's no surprise that their order can change, if hash values change (although it might be a good idea to sort them by span or something else that is less "random").

@michaelwoerister
Copy link
Member Author

The symbol name changes are due to my making Fingerprint::to_smaller_hash() depend on both halves of the fingerprint here. So those are expected too.

I'll update the test cases asap.

michaelwoerister and others added 2 commits February 2, 2021 17:40
This allows to directly map from a DefPathHash to the crate it
originates from, without constructing side tables to do that mapping.

It also allows to reliably and cheaply check for DefPathHash collisions.
@wesleywiser
Copy link
Member

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 3, 2021
@bors
Copy link
Contributor

bors commented Feb 3, 2021

⌛ Trying commit 6a4878f with merge efd074000fc38c9a79ba7eb808e5d7635211dd2e...

@bors
Copy link
Contributor

bors commented Feb 3, 2021

☀️ Try build successful - checks-actions
Build commit: efd074000fc38c9a79ba7eb808e5d7635211dd2e (efd074000fc38c9a79ba7eb808e5d7635211dd2e)

@rust-timer
Copy link
Collaborator

Queued efd074000fc38c9a79ba7eb808e5d7635211dd2e with parent b593389, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (efd074000fc38c9a79ba7eb808e5d7635211dd2e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 3, 2021
@bjorn3
Copy link
Member

bjorn3 commented Feb 3, 2021

Mostly slight regressions of up to 0.7% with an 1.8% improvement on deeply-nested-async-check and a few other improvements of up to 0.7%.

@michaelwoerister
Copy link
Member Author

Performance changes look like noise 👍

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors bors added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 7, 2021
@cjgillot
Copy link
Contributor

cjgillot commented Mar 7, 2021

This PR is cursed.
@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 7, 2021
@bors
Copy link
Contributor

bors commented Mar 7, 2021

⌛ Testing commit 9e50544 with merge d49cde51341d84140f2cc93014d67a02d1f6be5b...

@bors
Copy link
Contributor

bors commented Mar 7, 2021

💥 Test timed out

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 7, 2021
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@JohnTitor
Copy link
Member

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 7, 2021
@bors
Copy link
Contributor

bors commented Mar 7, 2021

⌛ Testing commit 9e50544 with merge 76c500e...

@bors
Copy link
Contributor

bors commented Mar 8, 2021

☀️ Test successful - checks-actions
Approved by: pnkfelix
Pushing 76c500e to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 8, 2021
@bors bors merged commit 76c500e into rust-lang:master Mar 8, 2021
@rustbot rustbot added this to the 1.52.0 milestone Mar 8, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 30, 2021
Don't duplicate the extern providers once for each crate

This should give a small perf improvement for small crates by avoiding a memcpy of a pretty big struct for each loaded crate. In addition would be useful for replacing the sequential `CrateNum` everywhere with the hash based `StableCrateId` introduced in rust-lang#81635, which would allow avoiding remapping of `CrateNum`'s when loading crate metadata. While this PR is not strictly needed for that, it is necessary to prevent a performance loss due to it.

I think this duplication was done in rust-lang#40008 (which introduced the query system) to make it possible to compile multiple crates in a single session in the future. I think this is unlikely to be implemented any time soon. In addition this PR can easily be reverted if necessary to implement this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-incr-comp Area: Incremental compilation A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) C-enhancement Category: An issue proposing an enhancement or a PR with one. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.