Track -Cprofile-use and -Cprofile-sample-use value by file hash, not file path #100413
☔ The latest upstream changes (presumably #100595) made this pull request unmergeable. Please resolve the merge conflicts. |
Whoops, I just discovered that my comments from a few days ago are still "pending". So, here goes:
compiler/rustc_session/src/utils.rs
Outdated
let mut hasher = Md5::default();

let mut file = File::open(path)?;
let mut buffer = [0; 4096];
I don't know if it actually makes a difference but sccache seems to use a 128KB buffer for "best performance": https://cs.github.com/mozilla/sccache/blob/2af14599a6c8c591ff5c40bf96e62c47efebec63/src/util.rs?q=128#L56-L57
A larger block size probably means worse L1 cache reuse, but 128 KiB should be fine. I'll change it to this size.
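For illustration, block-wise hashing with a 128 KiB buffer could look roughly like the sketch below. This is not the code from the PR: `hash_reader`, `hash_file` and `BLOCK_SIZE` are hypothetical names, and std's `DefaultHasher` is only a stand-in for whichever hasher the compiler ends up using.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::Path;

/// Block size discussed in the review (128 KiB). Hypothetical constant name.
const BLOCK_SIZE: usize = 128 * 1024;

/// Hashes everything readable from `reader`, one block at a time.
/// `DefaultHasher` is a std stand-in, not the hasher used by rustc.
fn hash_reader<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    // Allocate the buffer on the heap so a large block does not live on the stack.
    let mut buffer = vec![0u8; BLOCK_SIZE];
    loop {
        let read = match reader.read(&mut buffer) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Feed only the bytes that were actually read.
        hasher.write(&buffer[..read]);
    }
    Ok(hasher.finish())
}

/// Convenience wrapper that opens `path` and hashes its content.
fn hash_file(path: &Path) -> io::Result<u64> {
    hash_reader(File::open(path)?)
}
```

Taking any `Read` rather than a path keeps the hashing loop testable without touching the file system.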
compiler/rustc_session/src/utils.rs
Outdated
}

fn hash_file(path: &Path) -> std::io::Result<Output<Md5>> {
    let mut hasher = Md5::default();
How about just using StableHasher?
Hmm, AFAIK StableHasher is optimized for frequent and small updates, while here we want to hash large chunks of bytes.
I originally wanted to use BLAKE3, but decided on MD5 to avoid adding a new dependency. I created a small benchmark that hashed a 4 GiB profile file of random data (primed in the file cache, so hopefully without much I/O overhead) with a 128 KiB block size:
MD5: 9200 - 10000ms
BLAKE3: 1099 - 1100ms
StableHasher: 1687 - 1744ms
BLAKE3 looks like a good choice, especially since it was designed for use-cases like hashing files. But the stable hasher also isn't half bad.
(Btw I also tried swapping the hashing algorithm for source files for blake3, but it didn't really help. Well, most source files don't have 4 GiB :) ).
XXH3 would be the ideal choice ;)
Do we already have a dependency on BLAKE3? I agree that it looks like a great choice!
We don't, I would have to add it to rustc_session. It uses the CC0 1.0/Apache 2.0 license.
Btw, when we do PGO for LLVM in CI, all the files have 16 GiB, but after they are combined into a single file, it's just under 30 MiB. For rustc it's 80 MiB. So maybe large optimizations here are unnecessary and we can just use StableHasher to avoid a new dependency.
all the files have 16 GiB
😅
I'd just go with StableHasher for now, I think. We can always switch later.
compiler/rustc_session/src/utils.rs
Outdated
    _error_format: ErrorOutputType,
    _for_crate_hash: bool,
) {
    // Q1: Should we also hash the filepath itself?
Three options:
- Don't hash the file path (it probably does not influence anything).
- Hash the file path but declare all HashedFilePath options [TRACKED_NO_CRATE_HASH] so we don't break reproducible builds.
- Hash the file path but remap it with --remap-path-prefix so we don't break reproducible builds (although that's probably annoying for build system maintainers).
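To make the first option concrete, here is a minimal sketch of a value whose hash depends only on the file content and not on the path. `HashedFilePath` is the name used in this thread, but its fields, the `Hash` implementation and the `option_hash` helper are assumptions made for illustration, and `DefaultHasher` again stands in for the compiler's hasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

/// Hypothetical layout: the path given on the command line plus a
/// digest of the file's content computed elsewhere.
struct HashedFilePath {
    path: PathBuf,
    content_hash: u64,
}

impl Hash for HashedFilePath {
    // Option 1 from the list: only the content digest feeds the hash,
    // so moving or renaming the profile does not change the result.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.content_hash.hash(state);
    }
}

/// Hypothetical helper that hashes a single option value.
fn option_hash(value: &HashedFilePath) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

Under this sketch, two builds that point at different paths holding identical profiles hash the same, which is the property that keeps builds reproducible without path remapping.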
Overall, I'm not sure about the approach. Maybe Another way to do it would be to turn retrieving the paths into
The The advantage of that approach is that the query system will take care of caching the result of the expensive operation, and that changes to the contents of the The downside is that it's quite a bit more complex. |
This could actually be quite useful, because I think that for local experimentation it's quite common to repeatedly recompile the code with different profiles. Right now, if the profile changes, it will recompile all dependencies, which can be quite time-consuming. Would this invalidation also work for dependencies? In other words, if I implement this query and the profile changes, will it only do codegen for dependencies, without recompiling the rest of the dependencies?
IIRC, Cargo does not compile dependencies incrementally because it makes the assumption that those change very infrequently. Under that assumption, the query approach would not stop dependencies from being recompiled completely. I'm not sure if the CARGO_INCREMENTAL env var has any influence on this. You could check by running Cargo with |
Indeed |
Yes, let's do that. Maybe I'd rename |
Indeed, that's a good idea. Previously, if the file changed during compilation, it could be hashed several times, each time with a different hash, which is bad. I renamed the struct, switched to
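One way to guarantee a single, consistent hash per compilation session is to compute it lazily once and cache the result. The sketch below shows that idea with std's `OnceLock`; the `CachedHash` type and its method are hypothetical and are not taken from the PR.

```rust
use std::sync::OnceLock;

/// Hypothetical wrapper that remembers the first hash computed for a file.
struct CachedHash {
    cell: OnceLock<u64>,
}

impl CachedHash {
    fn new() -> Self {
        Self { cell: OnceLock::new() }
    }

    /// Runs `compute` only on the first call; later calls return the
    /// cached value even if the file on disk has changed since then.
    fn get_or_compute(&self, compute: impl FnOnce() -> u64) -> u64 {
        *self.cell.get_or_init(compute)
    }
}
```

With this shape, every consumer in the session observes the same digest, regardless of when it asks for it.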
I just realized that the current implementation will cause the file to be hashed in non-incremental mode too :/ That doesn't seem optimal, right? |
Since the
But even then, I think that we should hash the profile even in non-incremental mode. Consider this:
$ CARGO_INCREMENTAL=0 RUSTFLAGS="-Cprofile-use=profile.prof" cargo build --release
# Change profile.prof
$ CARGO_INCREMENTAL=0 RUSTFLAGS="-Cprofile-use=profile.prof" cargo build --release
# In current rustc, the crate is not recompiled!
I think that we should invalidate the profile even in non-incremental mode, or not?
Decisions about whether to recompile something are made by Cargo. The hash we are talking about does not influence anything in non-incremental mode, if I understand correctly. Cargo would need to recognize the
To elaborate, there's a two-level process involved here:
|
Aha. Well, but this fact makes this whole PR obsolete :) Since Cargo won't even invoke |
I don't think the PR is obsolete -- we still need to deal with the cases where rustc does get invoked with |
I discussed this on the Cargo zulip stream (https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Tracking.20PGO.20profile.20files.20in.20cargo) and it seems that there's a way forward. Since this change is relatively self-contained, should we merge it now? Or do you want me to make the Cargo related changes in this PR too? |
I think the cargo related changes should go into a separate PR. Regarding this PR: I'm still not really happy about hashing the file in non-incremental mode. I'll think about it some more over the weekend. |
…chaelwoerister Track PGO profiles in depinfo This PR makes sure that PGO profiles (`-Cprofile-use` and `-Cprofile-sample-use`) are tracked in depinfo, so that when they change, the compilation session will be invalidated. This approach was discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Tracking.20PGO.20profile.20files.20in.20cargo). I tried it locally and it seems that the code is recompiled just with this change, and rust-lang#100413 is not even needed. But it's possible that not everything required is recompiled, so we will probably want to land both changes. Another approach to implement this could be to store the PGO profiles in `sess.parse_sess.file_depinfo` when the session is being created, but then the paths would have to be converted to a string and then to a symbol, which seemed unnecessarily complicated. CC `@michaelwoerister` r? `@Eh2406`
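As a simplified illustration of why listing the profile in dep-info triggers a rebuild: rustc's dep-info output is a Makefile-style list of input files, and a build tool can compare their modification times against the previous output. The sketch below is not Cargo's actual fingerprinting logic; `parse_dep_line` and `needs_rebuild` are hypothetical helpers, and escaped spaces in paths are ignored.

```rust
use std::time::SystemTime;

/// Extracts the dependency paths from one Makefile-style line of the form
/// "target: dep1 dep2". Simplification: escaped spaces are not handled.
fn parse_dep_line(line: &str) -> Vec<&str> {
    match line.split_once(": ") {
        Some((_target, deps)) => deps.split_whitespace().collect(),
        None => Vec::new(),
    }
}

/// Returns true when any tracked input is newer than the previous output.
fn needs_rebuild(output_mtime: SystemTime, input_mtimes: &[SystemTime]) -> bool {
    input_mtimes.iter().any(|&input| input > output_mtime)
}
```

Once the profile path appears among the dependencies, overwriting the profile makes it newer than the last output, so the crate is considered stale.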
☔ The latest upstream changes (presumably #101577) made this pull request unmergeable. Please resolve the merge conflicts. |
@Kobzol, if you are still interested in this: I think the best way forward is to handle this in the query system. That way we won't do any unnecessary work in non-incremental and in incremental mode, invalidation will be limited to just the LLVM part. I imagine it to work like this:
The rest would be handled by the existing incr. comp. infrastructure. |
This issue is not so pressing for me now because the profile paths are tracked by Cargo, so it will rebuild the crates correctly if the profiles change. That being said, it would be nice to also fix this on the |
Based on this comment I'll tentatively switch review status. Feel free to request a review with @rustbot author |
@Kobzol any updates on this? |
I'll be looking at an issue with I still think that the issue this PR addresses should be fixed eventually and I don't mind leaving it open and assigned to me as a reminder. Unless of course @Kobzol wants to get rid of the open PR 🙂 Then we can migrate this to a bug report. |
Hmm. In that case I'll mark it as blocked until that work is done.
I'm unassigning myself (see rust-lang/team#1565). I think the general outline in #100413 (comment) would still be a valid approach to solve the problem. |
Before, the path to the PGO profile passed in -Cprofile-use and -Cprofile-sample-use was tracked by the file path only. This meant that if the code was compiled twice in a row with the same path, the crate would not be recompiled, even if the profile content had changed in the meantime.

I'm not too excited about the API of the md-5 crate, but it was already used for hashing source files, so I decided to keep the same dependency. I'm not really sure what the for_crate_hash argument is for; should I take it into account here?

Fixes: #100397
r? @michaelwoerister