Relocate upvars to Unresumed state and make coroutine prefix trivial #127522

Open
wants to merge 2 commits into
base: master

dingxiangfei2009
Contributor

@dingxiangfei2009 dingxiangfei2009 commented Jul 9, 2024

r? @pnkfelix

Replaces #120168
Related to #62958

cc @Dirbaio

@Dirbaio proposed a much better approach to the upvar problem than #120168, one that does not require modifying the dataflow framework. His idea is to move the upvars into MIR locals directly, in a prologue inserted right after the ByMoveBody pass. This potentially opens up more optimization opportunities.
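
As a rough mental model of this idea, the hand-written state machine below sketches what relocating upvars means. It is not rustc output: `ToyCoroutine`, its variants, and `resume` are hypothetical names invented for illustration, and a real coroutine's layout is computed by the compiler. The captured values start in the `Unresumed` variant and are moved out by a prologue on first resume, after which they behave like any other local that is saved across a yield.

```rust
/// Hypothetical stand-in for a coroutine that captures `a` and `b`.
enum ToyCoroutine {
    /// Captured values (upvars) live here until the first resume.
    Unresumed { a: u32, b: u32 },
    /// Locals that are live across the single yield point.
    Suspend0 { a: u32, b: u32 },
    Returned,
}

impl ToyCoroutine {
    /// Returns `None` for a yield and `Some(result)` on completion.
    fn resume(&mut self) -> Option<u32> {
        match std::mem::replace(self, ToyCoroutine::Returned) {
            ToyCoroutine::Unresumed { a, b } => {
                // Prologue: `a` and `b` are now plain locals. They are only
                // saved again because they are live across the yield.
                *self = ToyCoroutine::Suspend0 { a, b };
                None
            }
            ToyCoroutine::Suspend0 { a, b } => Some(a + b),
            ToyCoroutine::Returned => panic!("resumed after completion"),
        }
    }
}
```

Starting from `ToyCoroutine::Unresumed { a: 1, b: 2 }`, the first `resume` yields and the second completes with the sum of the two captures.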

Let us follow up in the t-compiler thread on further work.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 9, 2024
@rustbot
Collaborator

rustbot commented Jul 9, 2024

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@dingxiangfei2009 dingxiangfei2009 force-pushed the move-upvars-to-locals branch from b802668 to 7e8b4cd Compare July 9, 2024 15:55
// because of a yield. We see that there is no yield in the scope of
// `b` and give the more generic error message.
let mut a = &3;
let a = &mut &3;
Member

why did you change this test? if this changes observable behavior, then it probably needs an FCP at least.

Contributor Author

I think so. We might need one.

On a closer look, if I understand this test correctly, it does not seem to test the right thing. a: &i32 is moved into the coroutine b and is then assigned a borrow of another i32 with a shorter lifetime, which should have been allowed. Relocating upvars in this PR actually lets the original test work, which is really confusing.

Judging by the intended error message, I am proposing a more appropriate test as laid out in this patch.

@@ -117,11 +126,14 @@ LL | move || {
LL | check_clone(&gen_non_clone);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ within `{coroutine@$DIR/clone-impl.rs:79:5: 79:12}`, the trait `Clone` is not implemented for `NonClone`, which is required by `{coroutine@$DIR/clone-impl.rs:79:5: 79:12}: Clone`
|
note: captured value does not implement `Clone`
--> $DIR/clone-impl.rs:81:14
note: coroutine does not implement `Clone` as this value is used across a yield
Member

Why does this change?

Contributor Author

@dingxiangfei2009 dingxiangfei2009 Jul 9, 2024

Because the diagnostic has not been updated to recognise the locals or fields in the Unresumed variant as captured values. There are a couple more like this. Let me work on that!

Member

OK, as long as it's not a change in behavior (like I mentioned above w/ that other example where the test changed) this could either be done here or in a follow-up.

@@ -299,6 +299,7 @@ fn mir_built(tcx: TyCtxt<'_>, def: LocalDefId) -> &Steal<Body<'_>> {
// by-move and by-mut bodies if needed. We do this first so
// they can be optimized in lockstep with their parent bodies.
&coroutine::ByMoveBody,
&coroutine::relocate_upvars::RelocateUpvars,
Member

@compiler-errors compiler-errors Jul 9, 2024

It would be useful to leave a note why this needs to happen after ByMoveBody. If we flip these two coroutine passes, we get ICEs, right?

In other words, it would be useful to acknowledge that this pass breaks the invariants of ByMoveBody so it needs to happen afterwards.

Contributor Author

True and indeed the transformed MIR made no sense because the types of the MIR locals would be wrong.

Member

I would also really like to see tests that demonstrate that both closure bodies (regular and by-move) still make sense after this pass

Member

indeed the transformed MIR made no sense because the types of the MIR locals would be wrong.

Yeah, though we could also modify the ByMoveBody pass to adjust the types on the MIR locals. But having it in this order keeps it much simpler IMO.

Contributor

@cjgillot cjgillot left a comment

I'm a bit puzzled by the motivation here.
From #120168, I understand that the point is to reduce the layout size. So why is this pass so early? Could it be done much later in the pipeline, for instance just before StateTransform which computes that layout? That would avoid any change to borrowck.

use rustc_span::Span;
use rustc_target::abi::FieldIdx;

pub struct RelocateUpvars;
Contributor

Could you give a high-level doc for what this pass does?

kind: TerminatorKind::Drop {
place: Place::from(ty::CAPTURE_STRUCT_LOCAL),
target: START_BLOCK,
unwind: UnwindAction::Cleanup(resume_block),
Contributor

Suggested change
- unwind: UnwindAction::Cleanup(resume_block),
+ unwind: UnwindAction::Continue,

?

let preds = body.basic_blocks.predecessors()[START_BLOCK].clone();
let basic_blocks = body.basic_blocks.as_mut();
for pred in preds {
match &mut basic_blocks[pred].terminator_mut().kind {
Contributor

for target in basic_blocks[pred].terminator_mut().successors_mut() {
    if *target == START_BLOCK {
        *target = prologue;
    }
}
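
The effect of the loop suggested above can be tried on a toy control-flow graph. This is a minimal sketch under stated assumptions: `Block` and `retarget_entry` are hypothetical names, only successor lists are modelled, and skipping the prologue's own edge into the old entry is an assumption of this sketch rather than something the PR confirms. It also does not model how rustc keeps the entry block at a fixed index.

```rust
/// Toy basic block: only the successor list of its terminator is modelled.
struct Block {
    successors: Vec<usize>,
}

/// Redirects every edge that targets `old_entry` to `prologue`,
/// leaving the prologue's own edge into `old_entry` untouched.
fn retarget_entry(blocks: &mut [Block], old_entry: usize, prologue: usize) {
    for (idx, block) in blocks.iter_mut().enumerate() {
        if idx == prologue {
            // Assumption: the prologue must keep jumping to the old entry.
            continue;
        }
        for target in block.successors.iter_mut() {
            if *target == old_entry {
                *target = prologue;
            }
        }
    }
}
```

Iterating over the successors uniformly avoids matching on each terminator kind, which is the point of the suggestion.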

Comment on lines 47 to 74
let Some(&UpvarSubstitution { place: new_place, .. }) = self.mappings.get(*field_idx)
else {
bug!(
"SubstituteUpvar: found {field_idx:?} @ {location:?} but there is no upvar for it"
)
};
Contributor

Just self.mappings.map[*field_idx]?

location: rustc_middle::mir::Location,
) {
if place.local == ty::CAPTURE_STRUCT_LOCAL
&& let [ProjectionElem::Field(field_idx, _ty), rest @ ..] = &**place.projection
Contributor

What happens when we have a bare _1?
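
To illustrate why the question matters: a bare `_1` has an empty projection, and a slice pattern of the form `[first, rest @ ..]` does not match an empty slice. The sketch below uses toy types (`Elem` and `leading_field` are hypothetical, and rustc's `ProjectionElem` has more variants and carries a type on `Field`) to show which places such a pattern would and would not catch. What the pass should do with the unmatched case is the open question raised here.

```rust
/// Toy projection element, standing in for rustc's `ProjectionElem`.
#[derive(Debug, PartialEq)]
enum Elem {
    Field(usize),
    Deref,
}

/// Mirrors the shape of the pattern in the diff: succeeds only if the
/// projection starts with a field access, and returns the remaining elements.
fn leading_field(projection: &[Elem]) -> Option<(usize, &[Elem])> {
    match projection {
        [Elem::Field(idx), rest @ ..] => Some((*idx, rest)),
        // An empty projection (a bare local) and any projection that does
        // not start with a field both fall through to here.
        _ => None,
    }
}
```

With this shape, an empty projection returns `None`, so a visitor built only on this pattern would leave a bare use of the capture struct untouched.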

patch.apply(body);

// Manually patch so that prologue is the new entry
let preds = body.basic_blocks.predecessors()[START_BLOCK].clone();
Contributor

Nit: predecessors is cached. It may be interesting to choose the order in which we do things, to avoid clearing the cache just before trying to access it.

@Dirbaio
Contributor

Dirbaio commented Jul 21, 2024

I believe there's some bug with the storage conflicts. This code

#![feature(noop_waker)]
use core::future::Future;
use core::task::{Context, Waker};

async fn bar() {}

async fn closury<F, Fut>(mut cb: F)
where
    F: FnMut(u32, u32) -> Fut,
    Fut: std::future::Future<Output = ()>,
{
    cb(1, 2).await;
}

async fn foo() {
    closury(|a, b| async move {
        bar().await;
        println!("{a} {b}");
    })
    .await
}

fn main() {
    let fut = foo();
    let fut = core::pin::pin!(fut);

    fut.poll(&mut Context::from_waker(Waker::noop()));
}

results in this CoroutineLayout (shortened a bit):

 CoroutineLayout {
    field_tys: {
        _0: CoroutineSavedTy { .. }, // coroutine for bar
        _1: CoroutineSavedTy { ty: u32, .. }
        _2: CoroutineSavedTy { ty: u32, .. }
        _3: CoroutineSavedTy { ty: u32, .. }
        _4: CoroutineSavedTy { ty: u32, .. }
    },
    variant_fields: {
        Unresumed(0): [_3, _4],
        Returned (1): [],
        Panicked (2): [],
        Suspend0 (3): [_0, _1, _2],
    },
    storage_conflicts: BitMatrix(5x5) {
        (_0, _0),
        (_0, _1),
        (_0, _2),
        (_1, _0),
        (_1, _1),
        (_1, _2),
        (_2, _0),
        (_2, _1),
        (_2, _2),
    },
}

but MIR contains this code when moving from Unresumed to Suspend0 states:

        nop;
        ((_1 as variant#3).1: u32) = move ((_1 as variant#0).0: u32);
        nop;
        ((_1 as variant#3).2: u32) = move ((_1 as variant#0).1: u32);

Between the two statements, (_1 as variant#3).1 and (_1 as variant#0).1 are alive at the same time. These correspond to saved locals _1 and _4. Therefore they should be marked as conflicting in storage_conflicts, but they aren't.

It seems the locals created from upvars in the Unresumed state are never marked as conflicting with anything.

I haven't actually been able to get code to misbehave with just the changes in this PR, but I'm doing some experiments that change the layout code to pack things more densely, and I'm running into miscompilations due to this.

why is this pass so early? Could it be done much later in the pipeline, for instance just before StateTransform which computes that layout?

The hope was to take advantage of MIR opts as much as possible. So, convert upvars to locals ASAP, let opts optimize them, then convert the result to the coroutine struct with StateTransform.

For example in the above case, the locals _3 and _1 could be merged, and _4 and _2 too. This'd hopefully be done with some MIR opt. This'd avoid having to do any moves at all when moving from Unresumed to Suspend0.
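
To make the storage-conflict concern and the merging idea concrete, here is a small simulation. It is a sketch under assumptions, not rustc's layout code: `run_moves` and the slot assignments are hypothetical, memory is modelled as four `u32` slots, and `offs[k]` gives the slot of saved local `_k` (index 0 is unused here). It replays the two moves from the MIR excerpt under three different slot assignments.

```rust
/// Replays the two moves from the MIR excerpt against a flat array of
/// 4-byte slots. `offs[k]` is the hypothetical slot of saved local `_k`
/// for k in 1..=4; `offs[0]` is ignored.
fn run_moves(offs: [usize; 5], a: u32, b: u32) -> (u32, u32) {
    let mut mem = [0u32; 4];
    // Unresumed state: the upvars sit in saved locals _3 and _4.
    mem[offs[3]] = a;
    mem[offs[4]] = b;
    // ((_1 as variant#3).1) = move ((_1 as variant#0).0), i.e. _1 = move _3
    mem[offs[1]] = mem[offs[3]];
    // ((_1 as variant#3).2) = move ((_1 as variant#0).1), i.e. _2 = move _4
    mem[offs[2]] = mem[offs[4]];
    // Values observed in the Suspend0 state.
    (mem[offs[1]], mem[offs[2]])
}
```

With disjoint slots (`[0, 2, 3, 0, 1]`) the values `(1, 2)` arrive intact. If a denser layout places `_1` in the same slot as `_4` (`[0, 1, 2, 0, 1]`), which the missing conflict entry would permit, the first move overwrites `_4` before it is read and the result is `(1, 1)`. If instead `_1` is merged with `_3` and `_2` with `_4` (`[0, 0, 1, 0, 1]`), both moves become self-assignments and the result is `(1, 2)` again.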

@bors
Contributor

bors commented Jul 29, 2024

☔ The latest upstream changes (presumably #125443) made this pull request unmergeable. Please resolve the merge conflicts.

@compiler-errors compiler-errors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 10, 2024
@bors
Contributor

bors commented Aug 20, 2024

☔ The latest upstream changes (presumably #122551) made this pull request unmergeable. Please resolve the merge conflicts.

@alex-semenyuk alex-semenyuk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 23, 2024
@dingxiangfei2009 dingxiangfei2009 force-pushed the move-upvars-to-locals branch 3 times, most recently from be7b6fd to 1a68b6f Compare October 13, 2024 22:29
@dingxiangfei2009
Contributor Author

One more issue: the Suspend* variant names in the debuginfo get transposed. I am looking into it.

@dingxiangfei2009
Contributor Author

What has changed:

  • Debug info is sorted out

Pending actions:

  • I am looking for a good mitigation of the storage conflict issue. Right now, we don't have the ability to mark sub-places into variants as live or dead. I will look into options to handle this safely in the rustc_mir_transform::coroutine module.

@bors
Contributor

bors commented Oct 29, 2024

☔ The latest upstream changes (presumably #132326) made this pull request unmergeable. Please resolve the merge conflicts.

@dingxiangfei2009
Contributor Author

Update:

The merge conflict is resolved locally, but I would like to test an optimization technique first. I will investigate the effectiveness and correctness of slot merging, in the hope of completely eliminating the problematic moves reported earlier.

@rustbot
Collaborator

rustbot commented Nov 6, 2024

Some changes occurred to the CTFE machinery

cc @rust-lang/wg-const-eval

dingxiangfei2009 and others added 2 commits November 10, 2024 23:14
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
@rust-log-analyzer
Collaborator

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
tests/ui/ref_option/ref_option.rs (revision `all`) ... ok
tests/ui/ref_option/ref_option.all.fixed ... ok

FAILED TEST: tests/ui/future_not_send.rs
command: CLIPPY_CONF_DIR="tests" RUSTC_ICE="0" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/clippy-driver" "--error-format=json" "--emit=metadata" "-Aunused" "-Ainternal_features" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Dwarnings" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps" "--extern=clippy_config=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_config-a25255c93374782e.rlib" "--extern=clippy_lints=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_lints-786a1be58753b44d.rlib" "--extern=clippy_utils=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_utils-d1721b2d0d98b47c.rlib" "--extern=futures=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libfutures-a6101dcb295c4849.rlib" "--extern=if_chain=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libif_chain-49fd91c51cbf146b.rlib" "--extern=itertools=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libitertools-a4b4cb2e17d4d2b3.rlib" "--extern=parking_lot=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libparking_lot-4bc7250265c18c99.rlib" "--extern=quote=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libquote-81b55a36b7b44123.rlib" "--extern=regex=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libregex-b0d46a029c23c2f7.rlib" "--extern=serde=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libserde-568b05c39f76ae6b.rlib" 
"--extern=serde_derive=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libserde_derive-4b1ee1ecc6818b87.so" "--extern=syn=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libsyn-d8bb53642979c62d.rlib" "--extern=tokio=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libtokio-808e345eab6bb5e5.rlib" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/release/deps" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui" "tests/ui/future_not_send.rs" "--edition" "2021"
error: actual output differed from expected
Execute `cargo uibless` to update `tests/ui/future_not_send.stderr` to the actual output
--- tests/ui/future_not_send.stderr
+++ <stderr output>
---
 note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:9:20
+  --> tests/ui/future_not_send.rs:7:67
    |
 LL | async fn private_future(rc: Rc<[u8]>, cell: &Cell<usize>) -> bool {
-   |                         -- has type `std::rc::Rc<[u8]>` which is not `Send`
+   |                         --                                        ^ await occurs here, with `rc` maybe used later
+   |                         |
-LL |     async { true }.await
+   |                         has type `std::rc::Rc<[u8]>` which is not `Send`
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Send`
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Send`
-note: captured value is not `Send` because `&` references cannot be sent unless their referent is `Sync`
+note: future is not `Send` as this value is used across an await
+  --> tests/ui/future_not_send.rs:7:67
    |
    |
 LL | async fn private_future(rc: Rc<[u8]>, cell: &Cell<usize>) -> bool {
-   |                                       ^^^^ has type `&std::cell::Cell<usize>` which is not `Send`, because `std::cell::Cell<usize>` is not `Sync`
+   |                                       ----                        ^ await occurs here, with `cell` maybe used later
    = note: `std::cell::Cell<usize>` doesn't implement `std::marker::Sync`
    = note: `-D clippy::future-not-send` implied by `-D warnings`
    |
 note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:14:20
+  --> tests/ui/future_not_send.rs:12:42
+  --> tests/ui/future_not_send.rs:12:42
    |
 LL | pub async fn public_future(rc: Rc<[u8]>) {
-   |                            -- has type `std::rc::Rc<[u8]>` which is not `Send`
+   |                            --            ^ await occurs here, with `rc` maybe used later
+   |                            |
-LL |     async { true }.await;
+   |                            has type `std::rc::Rc<[u8]>` which is not `Send`
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Send`
---
+note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:21:26
+  --> tests/ui/future_not_send.rs:21:68
    |
 LL | async fn private_future2(rc: Rc<[u8]>, cell: &Cell<usize>) -> bool {
-   |                          ^^ has type `std::rc::Rc<[u8]>` which is not `Send`
+   |                          --                                        ^ await occurs here, with `rc` maybe used later
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Send`
-note: captured value is not `Send` because `&` references cannot be sent unless their referent is `Sync`
+note: future is not `Send` as this value is used across an await
+  --> tests/ui/future_not_send.rs:21:68
    |
    |
 LL | async fn private_future2(rc: Rc<[u8]>, cell: &Cell<usize>) -> bool {
-   |                                        ^^^^ has type `&std::cell::Cell<usize>` which is not `Send`, because `std::cell::Cell<usize>` is not `Sync`
+   |                                        ----                        ^ await occurs here, with `cell` maybe used later
    = note: `std::cell::Cell<usize>` doesn't implement `std::marker::Sync`
... 4 lines skipped ...
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ future returned by `public_future2` is not `Send`
    |
-note: captured value is not `Send`
-note: captured value is not `Send`
+note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:26:29
+  --> tests/ui/future_not_send.rs:26:43
    |
 LL | pub async fn public_future2(rc: Rc<[u8]>) {}
-   |                             ^^ has type `std::rc::Rc<[u8]>` which is not `Send`
+   |                             --            ^ await occurs here, with `rc` maybe used later
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Send`
... 5 lines skipped ...
    |
 note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:40:24
-  --> tests/ui/future_not_send.rs:40:24
+  --> tests/ui/future_not_send.rs:38:45
    |
 LL |     async fn private_future(&self) -> usize {
-   |                             ----- has type `&Dummy` which is not `Send`
+   |                              ----           ^ await occurs here, with `self` maybe used later
+   |                              |
-LL |         async { true }.await;
-LL |         async { true }.await;
+   |                              has type `&Dummy` which is not `Send`
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Sync`
... 4 lines skipped ...
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ future returned by `public_future` is not `Send`
    |
    |
-note: captured value is not `Send` because `&` references cannot be sent unless their referent is `Sync`
+note: future is not `Send` as this value is used across an await
+  --> tests/ui/future_not_send.rs:44:39
    |
 LL |     pub async fn public_future(&self) {
 LL |     pub async fn public_future(&self) {
-   |                                ^^^^^ has type `&Dummy` which is not `Send`, because `Dummy` is not `Sync`
+   |                                 ----  ^ await occurs here, with `self` maybe used later
    = note: `std::rc::Rc<[u8]>` doesn't implement `std::marker::Sync`
... 22 lines skipped ...
... 22 lines skipped ...
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ future returned by `unclear_future` is not `Send`
-note: captured value is not `Send`
+note: future is not `Send` as this value is used across an await
-  --> tests/ui/future_not_send.rs:73:28
+  --> tests/ui/future_not_send.rs:73:34
+  --> tests/ui/future_not_send.rs:73:34
    |
 LL | async fn unclear_future<T>(t: T) {}
-   |                            ^ has type `T` which is not `Send`
+   |                            -     ^ await occurs here, with `t` maybe used later
    = note: `T` doesn't implement `std::marker::Send`
 error: aborting due to 8 previous errors
 




FAILED TEST: tests/ui/large_futures.rs
command: CLIPPY_CONF_DIR="tests" RUSTC_ICE="0" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/clippy-driver" "--error-format=json" "--emit=metadata" "-Aunused" "-Ainternal_features" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Dwarnings" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps" "--extern=clippy_config=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_config-a25255c93374782e.rlib" "--extern=clippy_lints=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_lints-786a1be58753b44d.rlib" "--extern=clippy_utils=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_utils-d1721b2d0d98b47c.rlib" "--extern=futures=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libfutures-a6101dcb295c4849.rlib" "--extern=if_chain=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libif_chain-49fd91c51cbf146b.rlib" "--extern=itertools=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libitertools-a4b4cb2e17d4d2b3.rlib" "--extern=parking_lot=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libparking_lot-4bc7250265c18c99.rlib" "--extern=quote=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libquote-81b55a36b7b44123.rlib" "--extern=regex=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libregex-b0d46a029c23c2f7.rlib" "--extern=serde=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libserde-568b05c39f76ae6b.rlib" 
"--extern=serde_derive=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libserde_derive-4b1ee1ecc6818b87.so" "--extern=syn=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libsyn-d8bb53642979c62d.rlib" "--extern=tokio=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libtokio-808e345eab6bb5e5.rlib" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/release/deps" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui" "tests/ui/large_futures.rs" "--edition" "2021"
error: actual output differed from expected
Execute `cargo uibless` to update `tests/ui/large_futures.stderr` to the actual output
--- tests/ui/large_futures.stderr
+++ <stderr output>
+++ <stderr output>
 error: large future with a size of 16385 bytes
   --> tests/ui/large_futures.rs:10:9
... 29 lines skipped ...
    |     ^^^^^ help: consider `Box::pin` on it: `Box::pin(foo())`
-error: large future with a size of 49159 bytes
+error: large future with a size of 32774 bytes
   --> tests/ui/large_futures.rs:34:5
    |
    |
... 47 lines skipped ...
 error: aborting due to 8 previous errors
 


error: `large future with a size of 49159 bytes` not found in diagnostics on line 34
   |
35 |     //~^ ERROR: large future with a size of 49159 bytes
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected because of this pattern
   |
   |


FAILED TEST: tests/ui/needless_lifetimes.rs
command: CLIPPY_CONF_DIR="tests" RUSTC_ICE="0" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/clippy-driver" "--error-format=json" "--emit=metadata" "-Aunused" "-Ainternal_features" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Dwarnings" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps" "--extern=clippy_config=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_config-a25255c93374782e.rlib" "--extern=clippy_lints=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_lints-786a1be58753b44d.rlib" "--extern=clippy_utils=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_utils-d1721b2d0d98b47c.rlib" "--extern=futures=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libfutures-a6101dcb295c4849.rlib" "--extern=if_chain=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libif_chain-49fd91c51cbf146b.rlib" "--extern=itertools=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libitertools-a4b4cb2e17d4d2b3.rlib" "--extern=parking_lot=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libparking_lot-4bc7250265c18c99.rlib" "--extern=quote=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libquote-81b55a36b7b44123.rlib" "--extern=regex=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libregex-b0d46a029c23c2f7.rlib" "--extern=serde=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libserde-568b05c39f76ae6b.rlib" 
"--extern=serde_derive=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libserde_derive-4b1ee1ecc6818b87.so" "--extern=syn=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libsyn-d8bb53642979c62d.rlib" "--extern=tokio=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libtokio-808e345eab6bb5e5.rlib" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/release/deps" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui" "tests/ui/needless_lifetimes.rs" "--extern" "proc_macros=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui/auxiliary/libproc_macros.so" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui/auxiliary" "--edition" "2021"
error: actual output differed from expected
Execute `cargo uibless` to update `tests/ui/needless_lifetimes.stderr` to the actual output
--- tests/ui/needless_lifetimes.stderr
+++ <stderr output>
---



FAILED TEST: tests/ui/crashes/ice-10645.rs
command: CLIPPY_CONF_DIR="tests" RUSTC_ICE="0" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/clippy-driver" "--error-format=json" "--emit=metadata" "-Aunused" "-Ainternal_features" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Dwarnings" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps" "--extern=clippy_config=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_config-a25255c93374782e.rlib" "--extern=clippy_lints=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_lints-786a1be58753b44d.rlib" "--extern=clippy_utils=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libclippy_utils-d1721b2d0d98b47c.rlib" "--extern=futures=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libfutures-a6101dcb295c4849.rlib" "--extern=if_chain=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libif_chain-49fd91c51cbf146b.rlib" "--extern=itertools=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libitertools-a4b4cb2e17d4d2b3.rlib" "--extern=parking_lot=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libparking_lot-4bc7250265c18c99.rlib" "--extern=quote=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libquote-81b55a36b7b44123.rlib" "--extern=regex=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libregex-b0d46a029c23c2f7.rlib" "--extern=serde=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libserde-568b05c39f76ae6b.rlib" 
"--extern=serde_derive=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libserde_derive-4b1ee1ecc6818b87.so" "--extern=syn=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libsyn-d8bb53642979c62d.rlib" "--extern=tokio=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/x86_64-unknown-linux-gnu/release/deps/libtokio-808e345eab6bb5e5.rlib" "-Ldependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/release/deps" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools/ui_test/tests/ui/crashes" "tests/ui/crashes/ice-10645.rs" "--cap-lints=warn" "--edition" "2021"
error: actual output differed from expected
Execute `cargo uibless` to update `tests/ui/crashes/ice-10645.stderr` to the actual output
--- tests/ui/crashes/ice-10645.stderr
+++ <stderr output>
---
+note: future is not `Send` as this value is used across an await
-  --> tests/ui/crashes/ice-10645.rs:5:29
+  --> tests/ui/crashes/ice-10645.rs:5:35
    |
 LL | pub async fn bar<'a, T: 'a>(_: T) {}
-   |                             ^ has type `T` which is not `Send`
    = note: `T` doesn't implement `std::marker::Send`
    = note: `-D clippy::future-not-send` implied by `-D warnings`
... 2 lines skipped ...
 warning: 1 warning emitted

@dingxiangfei2009
Contributor Author

dingxiangfei2009 commented Nov 10, 2024

I will come back to bless tests later.

I have been benchmarking with Tokio's own benchmark suite. The suite probably does not exercise large upvar captures much. In summary, it shows no observable improvement or regression on my charming little potato today; the numbers are more or less within the noise range. Do take this report with a grain of salt, because the setup may be rather amateur.

Results

All benchmarks reported no improvement or regression except one: spawn_many_local from rt_multi_threaded.rs. Out of caution I ran it under perf to collect samples, and that run reported no regression at all. The reported regression is probably a bug somewhere, and I will look into it another day.

Vanilla

Here I am using the commit c22887b. I will attach the perf report here.

spawn_many_local        time:   [177.00 µs 177.59 µs 178.69 µs]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

     Running rt_multi_threaded.rs (/home/dingxf/.cargo-target/release/deps/rt_multi_threaded-d10b38fd8b5d7166)
spawn_many_local        time:   [6.4936 ms 6.5263 ms 6.5603 ms]
                        change: [+3550.5% +3571.9% +3593.9%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)

By the way, direct invocation of this benchmark gives results like this one.

spawn_many_local        time:   [6.8079 ms 6.8550 ms 6.9035 ms]
                        change: [+0.0509% +1.0451% +2.0494%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Relocation + Overlapping

Here I am using the HEAD of this branch and forcing the layout code to use the new coroutine layout across the board. Specifically, in fn layout_of_uncached in compiler/rustc_ty_utils/src/layout.rs, I call coroutine::coroutine_layout(..) and add #![allow(unused)] at the top of the file for good measure.
Here is the perf report.

     Running rt_current_thread.rs (/home/dingxf/.cargo-target/release/deps/rt_current_thread-ebf737f66180f495)
spawn_many_local        time:   [180.18 µs 180.72 µs 181.40 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

     Running rt_multi_threaded.rs (/home/dingxf/.cargo-target/release/deps/rt_multi_threaded-d10b38fd8b5d7166)
spawn_many_local        time:   [6.7726 ms 6.8312 ms 6.8968 ms]
                        change: [+3650.7% +3689.3% +3732.0%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)

Direct invocation of the benchmark gives this result.

spawn_many_local        time:   [6.7402 ms 6.7841 ms 6.8283 ms]
                        change: [+1.0564% +2.1378% +3.1608%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
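
A possible reading of the +35xx% and +36xx% figures, not confirmed anywhere in this thread: both rt_current_thread.rs and rt_multi_threaded.rs print a benchmark named spawn_many_local, so the harness may be comparing the multi-threaded run against the result it had just saved for the current-thread run. That is an assumption about how the baseline is looked up; the arithmetic itself only uses the point estimates from the logs above. A minimal sketch (the function name relative_change_percent is made up for illustration):

```rust
/// Relative change of `new` against `baseline`, in percent.
/// Both arguments must be in the same unit.
fn relative_change_percent(baseline: f64, new: f64) -> f64 {
    (new / baseline - 1.0) * 100.0
}

fn main() {
    // Point estimates copied from the logs above, converted to microseconds:
    // (label, rt_current_thread estimate, rt_multi_threaded estimate).
    let runs = [
        ("vanilla", 177.59, 6526.3),
        ("relocation + overlapping", 180.72, 6831.2),
    ];
    for (label, current_thread_us, multi_threaded_us) in runs {
        let change = relative_change_percent(current_thread_us, multi_threaded_us);
        println!("{label}: {change:+.1}%");
    }
}
```

The two ratios come out at roughly +3575% and +3680%, which fall inside the reported intervals [+3550.5%, +3593.9%] and [+3650.7%, +3732.0%]. If the assumption holds, the "regression" is a cross-file comparison rather than a real slowdown, and the direct invocations are the numbers to look at.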

What comes next?

I think I will try to find some real-world applications for benchmarking. I may take some off-the-shelf benchmarks, at least axum and friends, and I accept nominations!

@dingxiangfei2009
Contributor Author

dingxiangfei2009 commented Nov 12, 2024

I was playing with TechEmpower/FrameworkBenchmarks. I got the numbers for Axum.

I do believe that there is a slight performance regression. My theory is that it comes from the initial, unnecessary moves from upvars into body locals. If we adopt the overlapping approach, we could merge each body local and its upvar into the same slot to elide the copy. This is probably a major departure from the initial approach, but I think it is worth a try.
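
To make the "upvar moved into a body local" pattern concrete, here is a small user-level sketch. It is not the MIR prologue from this PR; the names yield_point, make_future and future_size are hypothetical, and the printed size depends on the compiler version and on which coroutine layout is in use, so no particular value is claimed.

```rust
use std::future::Future;
use std::mem::size_of_val;

// A trivial async fn, used only to introduce a suspension point.
async fn yield_point() {}

// `buf` is captured as an upvar of the async block and then moved into
// the body local `local`, which stays live across the await.
fn make_future(buf: [u8; 1024]) -> impl Future<Output = u8> {
    async move {
        let local = buf; // move from the upvar into a body local
        yield_point().await;
        local[0]
    }
}

// Size in bytes of the (unpolled) future's state machine.
fn future_size() -> usize {
    let fut = make_future([0u8; 1024]);
    size_of_val(&fut)
}

fn main() {
    // The future must hold at least the 1024-byte capture; whether the
    // upvar and the local occupy separate slots or share one is a layout
    // decision of the compiler.
    println!("future size: {} bytes", future_size());
}
```

Comparing this number between a stock toolchain and one built from this branch would be one rough way to see whether the upvar and the local end up sharing a slot.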

To reproduce the results

Build a distribution package

After applying the patch to use the new layout everywhere, I invoked src/ci/docker/run.sh dist-x86_64-linux. I had a problem with max user threads inside the container, so I added ulimit -u unlimited before the retry make prepare step in src/ci/run.sh. I also needed to switch off PGO in src/ci/docker/host-x86_64/dist-x86_64-linux/Dockerfile by changing the ENV SCRIPT=.. part to just invoke ./x dist. This gave me a rust-nightly-....tar.xz package.

Build toolchain image

I built a fresh toolchain image with the following Dockerfile created in the decompressed package and pushed the image to a local registry so that the DnD setup in the benchmark could find this image.

FROM debian:bookworm-slim

LABEL org.opencontainers.image.source=https://github.com/rust-lang/docker-rust

ENV RUSTUP_HOME=/usr/local/rustup \
    CARGO_HOME=/usr/local/cargo \
    PATH=/usr/local/cargo/bin:$PATH \
    RUST_VERSION=nightly

RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        ca-certificates \
        gcc \
        libc6-dev \
        wget \
        ;
RUN --mount=type=bind,source=.,target=/app,readonly /app/install.sh

Run benchmark

I ran the benchmark with ./tfb --mode benchmark --test axum twice: once with frameworks/Rust/axum.dockerfile unchanged, and once with its first line changed to use a different builder pointing at the toolchain image on the local registry.

@alex-semenyuk
Member

alex-semenyuk commented Feb 15, 2025

@dingxiangfei2009
Thanks for your contribution.
From wg-triage: is this ready for review? Also, please resolve the merge conflicts.

Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.