Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtree sync for rustc_codegen_cranelift #127162

Merged
merged 71 commits into from
Jun 30, 2024

Conversation

bjorn3
Copy link
Member

@bjorn3 bjorn3 commented Jun 30, 2024

The main highlight this time is support for arm64 macOS in cg_clif. A future PR will enable distributing it as rustup component.

r? @ghost

@rustbot label +A-codegen +A-cranelift +T-compiler

pacak and others added 30 commits April 19, 2024 08:31
…, r=RalfJung,nikic

compiler: add simd_ctpop intrinsic

Fairly straightforward addition.

cc `@rust-lang/opsem` new (extremely boring) intrinsic
Typical uses of ThinLTO don't have any use for this as a standalone
file, but distributed ThinLTO uses this to make the linker phase more
efficient. With clang you'd do something like `clang -flto=thin
-fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o
(full of bitcode) and foo.indexing.o (just the summary or index part of
the bitcode). That's then usable by a two-stage linking process that's
more friendly to distributed build systems like bazel, which is why I'm
working on this area.

I talked some to @teresajohnson about naming in this area, as things
seem to be a little confused between various blog posts and build
systems. "bitcode index" and "bitcode summary" tend to be a little too
ambiguous, and she tends to use "thin link bitcode" and "minimized
bitcode" (which matches the descriptions in LLVM). Since the clang
option is thin-link-bitcode, I went with that to try and not add a new
spelling in the world.

Per @dtolnay, you can work around the lack of this by using `lld
--thinlto-index-only` to do the indexing on regular .o files of
bitcode, but that is a bit wasteful on actions when we already have all
the information in rustc and could just write out the matching minimized
bitcode. I didn't test that at all in our infrastructure, because by the
time I learned that I already had this patch largely written.
The standard library now has the right configs in it's Cargo.toml
rustc_codegen_llvm: add support for writing summary bitcode

Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.

I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.

Per `@dtolnay,` you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.
This replaces the drop_in_place reference with null in vtables. On
librustc_driver.so, this drops about ~17k dynamic relocations from the
output, since many vtables can now be placed in read-only memory, rather
than having a relocated pointer included.

This makes a tradeoff by adding a null check at vtable call sites.
That's hard to avoid without changing the vtable format (e.g., to use a
pc-relative relocation instead of an absolute address, and avoid the
dynamic relocation that way). But it seems likely that the check is
cheap at runtime.
Add an intrinsic for `ptr::metadata`

The follow-up to rust-lang#123840, so we can remove `PtrComponents` and `PtrRepr` from libcore entirely (well, after a bootstrap update).

As discussed in <https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/.60ptr_metadata.60.20in.20MIR/near/435637808>, this introduces `UnOp::PtrMetadata` taking a raw pointer and returning the associated metadata value.

By no longer going through a `union`, this should also help future PRs better optimize pointer operations.

r? ``@oli-obk``
Show files produced by `--emit foo` in json artifact notifications

Right now it is possible to ask `rustc` to save some intermediate representation into one or more files with `--emit=foo`, but figuring out what exactly was produced is difficult. This pull request adds information about `llvm_ir` and `asm` intermediate files into notifications produced by `--json=artifacts`.

Related discussion: https://internals.rust-lang.org/t/easier-access-to-files-generated-by-emit-foo/20477

Motivation - `cargo-show-asm` parses those intermediate files and presents them in a user friendly way, but right now I have to apply some dirty hacks. Hacks make behavior confusing: hintron/computer-enhance#35

This pull request introduces a new behavior: now `rustc` will emit a new artifact notification for every artifact type user asked to `--emit`, for example for `--emit asm` those will include all the `.s` files.

Most users won't notice this behavior, to be affected by it all of the following must hold:
- user must use `rustc` binary directly (when `cargo` invokes `rustc` - it consumes artifact notifications and doesn't emit anything)
- user must specify both `--emit xxx` and `--json artifacts`
- user must refuse to handle unknown artifact types
- user must disable incremental compilation (or deal with it better than cargo does, or use a workaround like `save-temps`) in order not to hit rust-lang#88829 / rust-lang#89149
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 30, 2024
@rustbot

This comment was marked as resolved.

@rustbot rustbot added A-codegen Area: Code generation A-cranelift Things relevant to the [future] cranelift backend labels Jun 30, 2024
@bjorn3
Copy link
Member Author

bjorn3 commented Jun 30, 2024

@bors r+ subtree sync

@bors
Copy link
Contributor

bors commented Jun 30, 2024

📌 Commit 9ec6a02 has been approved by bjorn3

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 30, 2024
@bjorn3
Copy link
Member Author

bjorn3 commented Jun 30, 2024

@bors p=1

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 30, 2024
…bjorn3

Subtree sync for rustc_codegen_cranelift

The main highlight this time is support for arm64 macOS in cg_clif. A future PR will enable distributing it as rustup component.

r? `@ghost`

`@rustbot` label +A-codegen +A-cranelift +T-compiler
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#126018 (Remove the `box_pointers` lint.)
 - rust-lang#126895 (Fix simd_gather documentation)
 - rust-lang#126981 (Replace some magic booleans in match-lowering with enums)
 - rust-lang#127069 (small correction to fmt::Pointer impl)
 - rust-lang#127157 (coverage: Avoid getting extra unexpansion info when we don't need it)
 - rust-lang#127160 (Add a regression test for rust-lang#123630)
 - rust-lang#127161 (Improve `run-make-support` library `args` API)
 - rust-lang#127162 (Subtree sync for rustc_codegen_cranelift)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors
Copy link
Contributor

bors commented Jun 30, 2024

⌛ Testing commit 9ec6a02 with merge ef3d6fd...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
Distribute rustc_codegen_cranelift for arm64 macOS

Support for arm64 macOS has been added to rustc_codegen_cranelift recently.

Based on rust-lang#127162

try-job: dist-aarch64-apple
@bors
Copy link
Contributor

bors commented Jun 30, 2024

☀️ Test successful - checks-actions
Approved by: bjorn3
Pushing ef3d6fd to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 30, 2024
@bors bors merged commit ef3d6fd into rust-lang:master Jun 30, 2024
7 checks passed
@rustbot rustbot added this to the 1.81.0 milestone Jun 30, 2024
@bjorn3 bjorn3 deleted the sync_cg_clif-2024-06-30 branch June 30, 2024 18:34
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
Distribute rustc_codegen_cranelift for arm64 macOS

Support for arm64 macOS has been added to rustc_codegen_cranelift recently.

Based on rust-lang#127162

try-job: dist-aarch64-apple
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ef3d6fd): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.2% [-0.2%, -0.2%] 1
All ❌✅ (primary) - - 0

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (secondary -2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.0% [-2.0%, -2.0%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 696.099s -> 697.442s (0.19%)
Artifact size: 324.69 MiB -> 324.64 MiB (-0.02%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-cranelift Things relevant to the [future] cranelift backend has-merge-commits PR has merge commits, merge with caution. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.