directly expose copy and copy_nonoverlapping intrinsics #81238
r? @cramertj (rust-highfive has picked a reviewer for you, use r? to override)
This loses the (currently disabled until they can be |
945ce7a to 18d12ad
This is not the first time those debug assertions came under criticism; e.g., that's why they were turned into an So together with that

@alecmocatta if you are comparing the removal of the wrappers with an addition of If MIR inlining were to be enabled by default, the inlining could take place at MIR level when code is still generic, amortizing some of those costs, but we aren't at this point yet.
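To make the cost being discussed concrete, here is a rough sketch of the general shape of a generic wrapper that guards `copy_nonoverlapping` with debug assertions. The names `checked_copy_nonoverlapping`, `is_aligned_and_not_null`, and `is_nonoverlapping` are hypothetical and chosen for illustration; this is not the libcore source. The point is only that such a wrapper is generic, so every instantiation is a separate function that a debug build will not inline.

```rust
use std::mem;
use std::ptr;

/// Hypothetical helper: true if `p` is non-null and aligned for `T`.
fn is_aligned_and_not_null<T>(p: *const T) -> bool {
    !p.is_null() && (p as usize) % mem::align_of::<T>() == 0
}

/// Hypothetical helper: true if the two `count`-element regions do not overlap.
fn is_nonoverlapping<T>(src: *const T, dst: *const T, count: usize) -> bool {
    let size = match mem::size_of::<T>().checked_mul(count) {
        Some(size) => size,
        None => return false, // byte length overflows, treat as invalid
    };
    let (a, b) = (src as usize, dst as usize);
    let diff = if a > b { a - b } else { b - a };
    diff >= size
}

/// Hypothetical wrapper (an assumption for illustration): checks its
/// arguments when debug assertions are enabled, then forwards to the
/// real copy.
///
/// # Safety
/// Same requirements as `ptr::copy_nonoverlapping`.
#[inline]
pub unsafe fn checked_copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize) {
    debug_assert!(is_aligned_and_not_null(src), "src is null or unaligned");
    debug_assert!(is_aligned_and_not_null(dst as *const T), "dst is null or unaligned");
    debug_assert!(is_nonoverlapping(src, dst as *const T, count), "regions overlap");
    // SAFETY: the caller upholds the contract of `ptr::copy_nonoverlapping`.
    unsafe { ptr::copy_nonoverlapping(src, dst, count) }
}

fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];
    // SAFETY: both arrays are valid for 4 bytes and are distinct locals.
    unsafe { checked_copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len()) };
    assert_eq!(dst, src);
}
```

In an optimized build the `#[inline]` hint usually lets the wrapper disappear; without inlining, each call pays for an extra function call per element type.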
@tmiasko Thanks, that's helpful. My thinking is that

Another advantage of this PR is that
@bors try @rust-timer queue

Awaiting bors try build completion.

⌛ Trying commit 18d12ad with merge 375082d2e33538dc4a882ee01db94cd61563372c...

☀️ Try build successful - checks-actions

Queued 375082d2e33538dc4a882ee01db94cd61563372c with parent 9a9477f, future comparison URL. @rustbot label: +S-waiting-on-perf

Finished benchmarking try commit (375082d2e33538dc4a882ee01db94cd61563372c): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never

Huge improvements of up to 2.3% with a few regressions of up to 0.8%.
Looks good to me. The only downside is that
now disappears without a place to fix it. But since debug assertions in @bors r+

📌 Commit 1a80635 has been approved by

☀️ Test successful - checks-actions
directly expose copy and copy_nonoverlapping intrinsics This effectively un-does rust-lang#57997. That should help with `ptr::read` codegen in debug builds (and any other of these low-level functions that bottoms out at `copy`/`copy_nonoverlapping`), where the wrapper function will not get inlined. See the discussion in rust-lang#80290 and rust-lang#81163. Cc `@bjorn3` `@therealprof`
…imulacrum [stable] 1.52.0 release

This includes the release notes (rust-lang#84183) as well as cherry-picked commits from:

* [beta] revert PR rust-lang#77885 rust-lang#84710
* [beta] remove assert_matches rust-lang#84759
* Revert PR 81473 to resolve (on beta) issues 81626 and 81658. rust-lang#83171
* [beta] rustdoc revert deref recur rust-lang#84868
* Fix ICE of for-loop mut borrowck where no suggestions are available rust-lang#83401

Additionally in "fresh work" we're also:

* reverting: directly expose copy and copy_nonoverlapping intrinsics rust-lang#81238 to avoid rust-lang#84297 on 1.52
…Mark-Simulacrum Make copy/copy_nonoverlapping fn's again

Make copy/copy_nonoverlapping fn's again, rather than intrinsics.

This is a short-term change to address issue rust-lang#84297. It effectively reverts PRs rust-lang#81167 rust-lang#81238 (and part of rust-lang#82967), rust-lang#83091, and parts of rust-lang#79684.
Avoid using the `copy_nonoverlapping` wrapper through `mem::replace`.

This is a much simpler way to achieve the pre-rust-lang#86003 behavior of `mem::replace` not needing dynamically-sized `memcpy`s (at least before inlining), than re-doing rust-lang#81238 (which needs rust-lang#86699 or something similar).

I didn't notice it until recently, but `ptr::write` already explicitly avoided using the wrapper, while `ptr::read` just called the wrapper (and was the reason for us observing any behavior change from rust-lang#86003 in Rust-GPU).

<hr/>

The codegen test I've added fails without the change to `core::ptr::read` like this (ignore the `v0` mangling, I was using a worktree with it turned on by default, for this):

```llvm
13: ; core::intrinsics::copy_nonoverlapping::<u8>
14: ; Function Attrs: inlinehint nonlazybind uwtable
15: define internal void @_RINvNtCscK5tvALCJol_4core10intrinsics19copy_nonoverlappinghECsaS4X3EinRE8_25mem_replace_direct_memcpy(i8* %src, i8* %dst, i64 %count) unnamed_addr #0 {
16: start:
17: %0 = mul i64 %count, 1
18: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %dst, i8* align 1 %src, i64 %0, i1 false)
not:17 !~~~~~~~~~~~~~~~~~~~~~ error: no match expected
19: ret void
20: }
```

With the `core::ptr::read` change, `core::intrinsics::copy_nonoverlapping` doesn't get instantiated and the test passes.

<hr/>

r? `@m-ou-se` cc `@nagisa` (codegen test) `@oli-obk` / `@RalfJung` (miri diagnostic changes)
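For readers unfamiliar with why `mem::replace` is affected at all, the following is a minimal sketch, assuming (as the commit message above implies) that `mem::replace` amounts to a `ptr::read` of the old value followed by a `ptr::write` of the new one. The function name `replace_sketch` is hypothetical; the sketch illustrates the call chain, not the exact library implementation.

```rust
use std::ptr;

/// Hypothetical stand-in for `mem::replace`, written as a read followed by
/// a write. If `ptr::read` goes through a generic copy wrapper, so does
/// every use of a function shaped like this one.
pub fn replace_sketch<T>(dest: &mut T, src: T) -> T {
    // SAFETY: `dest` is a valid, aligned, initialized `T`. The old value is
    // moved out with `ptr::read` and the slot is immediately overwritten by
    // `ptr::write`, which does not drop the previous contents, so the old
    // value is neither dropped twice nor left observable in `dest`.
    unsafe {
        let old = ptr::read(dest);
        ptr::write(dest, src);
        old
    }
}

fn main() {
    let mut s = String::from("old");
    let prev = replace_sketch(&mut s, String::from("new"));
    assert_eq!(prev, "old");
    assert_eq!(s, "new");
}
```

Because the size of `T` is known statically here, a read that is lowered directly to a fixed-size copy avoids the dynamically-sized `memcpy` call that appears when the generic wrapper is instantiated instead.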
FYI, this got reverted shortly after landing due to being a breaking change, and the PR that would have made this possible got closed due to inactivity (#86699). However, #87827 should still help with code quality for
This effectively un-does #57997. That should help with `ptr::read` codegen in debug builds (and any other of these low-level functions that bottoms out at `copy`/`copy_nonoverlapping`), where the wrapper function will not get inlined. See the discussion in #80290 and #81163.

Cc @bjorn3 @therealprof
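As an illustration of what "bottoms out at `copy_nonoverlapping`" means, here is a minimal sketch of a `ptr::read`-style helper built on top of `ptr::copy_nonoverlapping`. The name `read_via_copy` is hypothetical and the body is an assumption about the general shape, not a copy of the library source.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical `ptr::read`-style helper: copies one `T` out of `src`
/// into a temporary and returns it by value.
///
/// # Safety
/// `src` must be valid for reads, properly aligned, and point to an
/// initialized `T`.
pub unsafe fn read_via_copy<T>(src: *const T) -> T {
    let mut tmp = MaybeUninit::<T>::uninit();
    // SAFETY: `tmp` is a fresh local, so it cannot overlap `src`; the caller
    // guarantees `src` is valid, so `tmp` is initialized after the copy.
    unsafe {
        ptr::copy_nonoverlapping(src, tmp.as_mut_ptr(), 1);
        tmp.assume_init()
    }
}

fn main() {
    let x = 0x1234_5678_u32;
    // SAFETY: `&x` is a valid, aligned pointer to an initialized `u32`.
    let y = unsafe { read_via_copy(&x) };
    assert_eq!(y, x);
}
```

If the call inside the `unsafe` block resolves to an ordinary generic function rather than to the intrinsic, an unoptimized build emits a real call for each element type, which is the overhead this PR set out to remove.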