Introduce -Zsplit-metadata option #120855

Draft
wants to merge 3 commits into
base: master

bjorn3
Member

@bjorn3 bjorn3 commented Feb 9, 2024

This will split the crate metadata out of library files. Only the svh and a small amount of extra metadata are kept in the library, enough to load the right rmeta file. This significantly reduces library size. It also allows for cheaper checks of whether different library files are the same crate.

A fair amount of the complexity in this PR is to work around the fact that cargo doesn't directly support this option yet.

Fixes #23366
Closes #29511
Fixes #57076

Revives #93945
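
To make the lookup described above concrete, here is a minimal sketch of how a small stub left in a library could select the matching rmeta file, and how two library files could be compared without decoding full metadata. The `Stub` and `RmetaCandidate` types, the `u64` hash representation, and both functions are hypothetical illustrations, not rustc's actual data structures or API.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the small header left in an rlib/dylib.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Stub {
    crate_name: String,
    /// Assumption: the svh is modelled as a plain u64 for illustration.
    svh: u64,
}

/// Hypothetical stand-in for an rmeta file found on disk.
struct RmetaCandidate {
    path: PathBuf,
    crate_name: String,
    svh: u64,
}

/// Returns the path of the first candidate whose name and hash match the stub.
fn find_matching_rmeta<'a>(stub: &Stub, candidates: &'a [RmetaCandidate]) -> Option<&'a Path> {
    candidates
        .iter()
        .find(|c| c.crate_name == stub.crate_name && c.svh == stub.svh)
        .map(|c| c.path.as_path())
}

/// Two library files are treated as the same crate when their stubs agree.
fn same_crate(a: &Stub, b: &Stub) -> bool {
    a == b
}
```

In this sketch the comparison only touches the stub, which is the sense in which the check is cheaper than reading the full metadata from each library.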

@rustbot
Collaborator

rustbot commented Feb 9, 2024

r? @cjgillot

rustbot has assigned @cjgillot.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 9, 2024
    rlibs.insert(loc_canon, PathKind::ExternFlag);
} else if loc.file_name().unwrap().to_str().unwrap().ends_with(".rmeta") {
    rmetas.insert(loc_canon, PathKind::ExternFlag);
} else {
    rmetas.insert(loc_canon.with_extension("rmeta"), PathKind::ExternFlag);
Member Author

This is a hack. It should probably be replaced with proper lookup in the library search path, or with cargo passing --extern twice for both the rlib/dylib and the rmeta file.
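
The fallback in the snippet above can be sketched in isolation as follows. This is a simplified illustration, not the PR's code: `expected_rmeta` is a hypothetical helper name, and it uses `Path::extension` where the original checks the file name with `ends_with(".rmeta")`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: given the location passed via `--extern`, guess where
/// the rmeta file lives. An rmeta path is returned unchanged; for any other
/// file the extension is swapped for `rmeta`, as in the hack described above.
fn expected_rmeta(loc: &Path) -> PathBuf {
    match loc.extension().and_then(|e| e.to_str()) {
        Some("rmeta") => loc.to_path_buf(),
        _ => loc.with_extension("rmeta"),
    }
}
```

The guess only works when the rmeta file sits next to the library under the same stem, which is why a search-path lookup or a second `--extern` from cargo is suggested as the proper replacement.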

if extension == "so" || extension == "dylib" {
    // FIXME workaround for the fact that cargo doesn't understand `-Zsplit-metadata`
    toplevel.push((file_stem.clone(), "rmeta".to_owned(), None));
}
Member Author

This, together with the rustc wrapper passing --emit metadata, is a big hack. It works around the fact that cargo doesn't support -Zsplit-metadata: cargo neither causes a .rmeta file to be emitted nor tracks it in the compiler-artifact json message. It also causes issues with recompilations, as the loop below may copy outdated rmeta files into the sysroot, which would then result in compiler errors.
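
As a rough, standalone sketch of the workaround in the snippet above: for every dynamic library cargo reports, an extra rmeta entry with the same stem is queued even though cargo never mentioned it. The `add_rmeta_companions` function and the `Toplevel` alias are hypothetical, and the meaning and type of the third tuple element are not visible in the snippet, so `Option<u64>` is an assumption.

```rust
use std::path::Path;

/// (file stem, extension, third field), mirroring the tuple shape in the
/// snippet above. Assumption: the third field's real type is not shown there.
type Toplevel = Vec<(String, String, Option<u64>)>;

/// Hypothetical helper: for each `.so`/`.dylib` artifact, queue an rmeta file
/// with the same stem, since cargo does not report it.
fn add_rmeta_companions(artifacts: &[&str]) -> Toplevel {
    let mut toplevel = Toplevel::new();
    for name in artifacts {
        let path = Path::new(name);
        let (Some(stem), Some(ext)) = (
            path.file_stem().and_then(|s| s.to_str()),
            path.extension().and_then(|e| e.to_str()),
        ) else {
            continue; // no stem or extension: nothing to pair up
        };
        if ext == "so" || ext == "dylib" {
            toplevel.push((stem.to_owned(), "rmeta".to_owned(), None));
        }
    }
    toplevel
}
```

Nothing in this sketch checks that the rmeta file on disk was produced by the same compilation as the dylib, which is the staleness problem the comment describes.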

@@ -1,4 +1,4 @@
// normalize-stderr-test "loaded from .*libstd-.*.rlib" -> "loaded from SYSROOT/libstd-*.rlib"
// normalize-stderr-test "loaded from .*libstd-.*.rmeta" -> "loaded from SYSROOT/libstd-*.rmeta"
Member Author

The .rmeta file is now considered the canonical one as the .rlib doesn't contain any crate metadata beyond the header.

@bjorn3
Member Author

bjorn3 commented Feb 9, 2024

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 9, 2024
@bors
Contributor

bors commented Feb 9, 2024

⌛ Trying commit 7770002 with merge 8fb65ac...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 9, 2024
Introduce -Zsplit-metadata option
@bjorn3
Member Author

bjorn3 commented Feb 9, 2024

What would be the best way to handle the necessary cargo changes? We can't use them until the bootstrap compiler includes the cargo changes, but cargo would depend on -Zsplit-metadata which doesn't exist yet. Would unconditionally creating a .rmeta file for dylibs like we do for rlibs be fine? Or should I add a -Zsplit-metadata option to cargo without it getting tested and then once the bootstrap bump happens, change this PR to use this option? Or land -Zsplit-metadata first without actually making any use of it, then update cargo and finally make use of it in the standard library?

@petrochenkov petrochenkov self-assigned this Feb 9, 2024
@bors
Contributor

bors commented Feb 9, 2024

☀️ Try build successful - checks-actions
Build commit: 8fb65ac (8fb65acf1a11f547302c62c93bf83cca3ff4cd9d)

@rust-timer
Collaborator

Finished benchmarking commit (8fb65ac): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.4% | [0.3%, 2.7%] | 7 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.4% | [-0.5%, -0.4%] | 3 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 7.1% | [5.4%, 8.4%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -5.2% | [-5.2%, -5.2%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 666.509s -> 669.311s (0.42%)
Artifact size: 307.99 MiB -> 301.34 MiB (-2.16%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 9, 2024
@bjorn3
Member Author

bjorn3 commented Feb 10, 2024

This should save about 4MB on the download size of a toolchain without any extra targets other than the host:

| component | e28fae5 (no split metadata) | 8fb65ac (split metadata) | diff |
|---|---|---|---|
| rustc | 70.54MB | 68.39MB | -2.15MB |
| rust-std | 27.63MB | 25.61MB | -2MB |
| rustc-dev | 110.84MB | 110.90MB | +0.06MB |
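
The "about 4MB" figure can be checked by summing the per-component differences. The sketch below copies the sizes from the table; the constant and function names are made up for illustration.

```rust
/// Sizes in MB as (component, before, after), copied from the table above.
const COMPONENTS: [(&str, f64, f64); 3] = [
    ("rustc", 70.54, 68.39),
    ("rust-std", 27.63, 25.61),
    ("rustc-dev", 110.84, 110.90),
];

/// Net size change in MB across the given components (negative means smaller).
fn net_change_mb(components: &[(&str, f64, f64)]) -> f64 {
    components.iter().map(|(_, before, after)| after - before).sum()
}
```

Over all three rows the net change comes to roughly -4.11MB. Taking only rustc and rust-std gives roughly -4.17MB, so either reading is consistent with the stated saving.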

For Bevy the size of the target dir shrinks a lot:

| mode | normal build | --emit metadata -Zsplit-metadata | diff | relative diff |
|---|---|---|---|---|
| debug mode | 4.6GiB | 4.3GiB | -300MiB | -6.5% |
| debug mode (no incr comp) | 2.8GiB | 2.6GiB | -200MiB | -7% |
| release mode | 843.6MiB | 614.3MiB | -230MiB | -27% |
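
The relative diff column follows from the two size columns. A minimal sketch of that calculation, with a hypothetical function name, using the rounded sizes from the table as inputs:

```rust
/// Relative change in percent when going from `before` to `after`.
/// Both values must be in the same unit (GiB or MiB here).
fn relative_diff_percent(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

For example, `relative_diff_percent(843.6, 614.3)` is about -27.2, matching the release mode row. Because the debug rows are only given to one decimal place in GiB, their recomputed percentages are approximate.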

@bors
Contributor

bors commented Feb 16, 2024

☔ The latest upstream changes (presumably #120486) made this pull request unmergeable. Please resolve the merge conflicts.

@cjgillot cjgillot removed their assignment Feb 18, 2024
@petrochenkov
Contributor

petrochenkov commented Mar 7, 2024

@bjorn3
Could you

  • Split 2eb24d9 into a separate PR in case it causes regressions
  • Split all preparatory refactorings like 1065559 into a separate PR
  • Rebase the remaining changes, make CI green, cleanup history, and add FIXME comments to logic that only exists due to lacking cargo support

(I'll be able to review the first two PRs quickly.)
@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 7, 2024
jhpratt added a commit to jhpratt/rust that referenced this pull request Mar 9, 2024
…trochenkov

Remove a workaround for a bug

I don't think it is necessary anymore. As I understand it from issue 39504, the original problem was that rustbuild changed a hardlink in the cargo build dir to point to a copy in the sysroot while cargo may have hardlinked it to the original first. I don't think this happens anymore and as such this workaround is no longer necessary.

Split out of rust-lang#120855
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 9, 2024
…, r=petrochenkov

Move metadata header and version checks together

This will make it easier to report rustc versions for older metadata formats.

Split out of rust-lang#120855
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 9, 2024
Rollup merge of rust-lang#122187 - bjorn3:merge_header_version_checks, r=petrochenkov

Move metadata header and version checks together

This will make it easier to report rustc versions for older metadata formats.

Split out of rust-lang#120855
@bors
Contributor

bors commented Mar 9, 2024

☔ The latest upstream changes (presumably #122241) made this pull request unmergeable. Please resolve the merge conflicts.

@bjorn3 bjorn3 force-pushed the split_metadata4 branch 2 times, most recently from a82c40c to fe45297 Compare July 1, 2024 19:18
@bors
Contributor

bors commented Jul 11, 2024

☔ The latest upstream changes (presumably #126777) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Contributor

bors commented Oct 18, 2024

☔ The latest upstream changes (presumably #131869) made this pull request unmergeable. Please resolve the merge conflicts.

bjorn3 and others added 3 commits November 27, 2024 16:24
This will split the crate metadata out of library files. Instead only
the svh is preserved to allow for loading the right rmeta file. This
significantly reduces library size. In addition it allows for cheaper
checks if different library files are the same crate.
@rust-log-analyzer
Collaborator

The job x86_64-gnu-llvm-18 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
#16 exporting to docker image format
#16 sending tarball 27.7s done
#16 DONE 34.4s
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-18]
debug: `DISABLE_CI_RUSTC_IF_INCOMPATIBLE` configured.
---
sccache: Starting the server...
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-18', '--enable-llvm-link-shared', '--set', 'rust.randomize-layout=true', '--set', 'rust.thin-lto-import-instr-limit=10', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'rust.lld=false', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-18/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.randomize-layout := True
configure: rust.thin-lto-import-instr-limit := 10

Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
7 participants