Relocate coroutine upvars into Unresumed state #120168

dingxiangfei2009 · 2024-01-20T18:14:40Z

Related to #62958

This PR is an attempt to address the async/coroutine size issue by allowing independent def-use/liveness analysis on individual upvars in coroutines. It has appeared to address partially the size doubling issue introduced by the use of upvars.

However, there are caveats detailed in the following list that I would like to address before turning this draft in.

The treatment here towards the ty::Coroutine in MIR passes is unfortunately "messier" than my liking, which is something I definitely want to change. I propose to promote upvars into Body<'tcx> along with local_decls, so that we can safely handle them safely. I would happily open a new separate PR to improve the upvar management.
It is not a generic solution, yet. For instance, we are still doubling the size in the example of Async fn doubles argument size #62958. If we insert a pass before MIR type analysis to remove unnecessary drops, which we can, that particular size doubling will be solved. However, if a Future upvar is alive across more than one yield points, that upvar is still ineligible. It makes sense because we would like to minimize moving of variant fields. How to handle these upvars is not the focus of this PR for now.

Out of expectation of possible change in the high level plan, I am keeping this as a draft in hope of invoking conversations. 🙇

cc @pnkfelix for the context.

rustbot · 2024-01-20T18:14:48Z

r? @petrochenkov

(rustbot has picked a reviewer for you, use r? to override)

petrochenkov · 2024-01-21T10:50:57Z

r? @pnkfelix

bors · 2024-01-23T15:06:35Z

☔ The latest upstream changes (presumably #116152) made this pull request unmergeable. Please resolve the merge conflicts.

pnkfelix · 2024-02-01T14:55:32Z

@dingxiangfei2009 am I right in assuming that the debugger test failure is expected and you plan to address it if we move forward with this approach?

compiler/rustc_codegen_cranelift/src/base.rs

pnkfelix · 2024-02-02T04:13:34Z

In any case, this is interesting work, and I think its a step in the right direction.

I plan to build it locally and take it for a spin, to make sure I understand what its doing. @dingxiangfei2009, I may ping you on Zulip with follow-up questions.

dingxiangfei2009 · 2024-02-02T09:52:54Z

@pnkfelix Thank you for reviewing! Yes, the debuginfo test may very much depend on how the upvars are eventually represented. I figured that I shall fix them after we could settle with our approach first.

Please feel free to message on Zulip!

bors · 2024-02-06T17:20:18Z

☔ The latest upstream changes (presumably #120361) made this pull request unmergeable. Please resolve the merge conflicts.

pnkfelix · 2024-03-01T17:57:40Z

compiler/rustc_mir_transform/src/coroutine.rs

@@ -1105,11 +1168,25 @@ fn compute_layout<'tcx>(
            // around inside coroutines, so it doesn't matter which variant


This comment was cut off in the patch, it starts off with:

// Note that if a field is included in multiple variants, we will // just use the first one here. That's fine; fields do not move // around inside coroutines, so it doesn't matter which variant // index we access them by.

I had started writing this PR comment with a question asking if this code comment needs revision for consistency with this change. After reflecting on things, i think the whole picture here remains coherent, but now I want to double check my interpretation.

Before this patch, the upvars were disjoint from the fields carried across yields of a generator/coroutine, so the comment above simply didn't apply to upvars.

After this patch, the upvars are now mapped into the Unresumed variant of a generator/coroutine. But that isn't inconsistent with the comment above ... as long as the Unresumed variant is itself the first variant... right?

Actually the fact that the Unresumed variant is the first doesn't probably matter either.

The comment above is saying 'we pick the first', but it is also saying that the choice is irrelevant, since all variants that use the field should be using the same offset for that field.

So the crucial point is that the Unresumed variant should be adhering to that same invariant, that any uses of the upvar in other variants must likewise choose the same offset for that upvar in its respective variant.

pnkfelix · 2024-03-01T18:11:30Z

compiler/rustc_mir_transform/src/coroutine.rs

@@ -1221,6 +1298,21 @@ fn elaborate_coroutine_drops<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        );
    }
    elaborator.patch.apply(body);
+    for block_data in body.basic_blocks_mut() {


Is there some code up above this (or called by one of the methods above) that was in principle responsible for dropping the upvars, that is now made obsolete under the new regime you have introduced here?

pnkfelix · 2024-03-01T18:17:02Z

Thanks for your patience @dingxiangfei2009 . This all looks good to me; I wouldn't mind getting an answer to my question about the new handling of dropping upvar state, but I also think this is ready to land, assuming it doesn't conflict with @compiler-errors 's ongoing work on async closures etc.

pnkfelix · 2024-03-01T18:17:08Z

@rustbot author

pnkfelix · 2024-03-01T18:18:19Z

@bors try @rust-timer queue

bors · 2024-03-01T18:19:32Z

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

git checkout coroutine-upvar (switch to your branch)
git fetch upstream master (retrieve the latest master)
git rebase upstream/master -p (rebase on top of it)
Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
git push self coroutine-upvar --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message

Auto-merging tests/ui/coroutine/clone-impl.stderr
Auto-merging tests/ui/async-await/issue-70935-complex-spans.stderr
Auto-merging tests/ui/async-await/future-sizes/future-as-arg.rs
Auto-merging tests/ui/async-await/async-await-let-else.stderr
Auto-merging compiler/rustc_ty_utils/src/layout.rs
Auto-merging compiler/rustc_mir_transform/src/dest_prop.rs
Auto-merging compiler/rustc_mir_transform/src/coroutine.rs
Auto-merging compiler/rustc_mir_dataflow/src/rustc_peek.rs
Auto-merging compiler/rustc_middle/src/ty/sty.rs
Auto-merging compiler/rustc_middle/src/ty/layout.rs
Auto-merging compiler/rustc_middle/src/mir/tcx.rs
Auto-merging compiler/rustc_const_eval/src/transform/validate.rs
Auto-merging compiler/rustc_const_eval/src/interpret/step.rs
Auto-merging compiler/rustc_codegen_ssa/src/mir/rvalue.rs
Auto-merging compiler/rustc_codegen_llvm/src/debuginfo/metadata/enums/mod.rs
Auto-merging compiler/rustc_codegen_llvm/src/debuginfo/metadata.rs
Auto-merging compiler/rustc_codegen_cranelift/src/base.rs
Auto-merging compiler/rustc_borrowck/src/type_check/mod.rs
CONFLICT (content): Merge conflict in compiler/rustc_borrowck/src/type_check/mod.rs
Auto-merging compiler/rustc_abi/src/lib.rs
Automatic merge failed; fix conflicts and then commit the result.

pnkfelix · 2024-03-01T18:20:06Z

Oh, also: Just in case its potentially of interest to others, part of the reason this took me so long to get to is that I wanted a pair of pernos.co runs, one from before and one from after this PR, to make it easier for me to explore the control flow interactively (and to help me understand some of the differences in output from our internal compiler instrumentation like print-type-sizes)

Here's links to those in case anyone else wants to inspect them:

dingxiangfei2009 · 2024-03-04T11:01:45Z

Progress update: I am still on it to polish it up. I will try to get the rust-timer numbers this week.

bors · 2024-04-10T05:11:09Z

☔ The latest upstream changes (presumably #123708) made this pull request unmergeable. Please resolve the merge conflicts.

dingxiangfei2009 · 2024-04-10T12:19:31Z

@bors try @rust-timer queue

bors · 2024-04-10T12:19:33Z

@dingxiangfei2009: 🔑 Insufficient privileges: not in try users

dingxiangfei2009 · 2024-04-10T13:10:16Z

I see that @pnkfelix attempted to run a timer job earlier. Would you like to trigger the job again, given that there hasn't been any merge conflict now?

tmiasko · 2024-04-10T13:18:36Z

@bors try @rust-timer queue

bors · 2024-04-10T13:19:47Z

⌛ Trying commit a68481f with merge 6b8fde6...

…<try> Relocate coroutine upvars into Unresumed state Related to rust-lang#62958 This PR is an attempt to address the async/coroutine size issue by allowing independent def-use/liveness analysis on individual upvars in coroutines. It has appeared to address partially the size doubling issue introduced by the use of upvars. However, there are caveats detailed in the following list that I would like to address before turning this draft in. - The treatment here towards the `ty::Coroutine` in MIR passes is unfortunately "messier" than my liking, which is something I definitely want to change. I propose to promote upvars into `Body<'tcx>` along with `local_decls`, so that we can safely handle them safely. I would happily open a new separate PR to improve the upvar management. - It is not a generic solution, yet. For instance, we are still doubling the size in the example of rust-lang#62958. If we insert a pass before MIR type analysis to remove unnecessary drops, which we can, that particular size doubling will be solved. However, if a `Future` upvar is alive across more than one yield points, that upvar is still ineligible. It makes sense because we would like to minimize moving of variant fields. How to handle these upvars is not the focus of this PR for now. Out of expectation of possible change in the high level plan, I am keeping this as a draft in hope of invoking conversations. 🙇 cc `@pnkfelix` for the context.

bors · 2024-04-10T15:03:23Z

☀️ Try build successful - checks-actions
Build commit: 6b8fde6 (6b8fde695cc11a7d81125d3222430c419d510027)

rust-timer · 2024-04-10T16:43:05Z

Finished benchmarking commit (6b8fde6): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.6%	[0.2%, 0.8%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.9%	[-2.9%, -2.9%]	1
Improvements ✅ (secondary)	-2.8%	[-2.8%, -2.8%]	1
All ❌✅ (primary)	-2.9%	[-2.9%, -2.9%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.2%	[1.2%, 1.2%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.8%	[1.8%, 1.8%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	2
All ❌✅ (primary)	-	-	0

Bootstrap: 674.485s -> 672.751s (-0.26%)
Artifact size: 318.39 MiB -> 318.44 MiB (0.02%)

pnkfelix · 2024-04-12T14:19:16Z

compiler/rustc_mir_dataflow/src/framework/engine.rs

@@ -288,7 +288,7 @@ where
            }

            None if dump_enabled(tcx, A::NAME, def_id) => {
-                create_dump_file(tcx, ".dot", false, A::NAME, &pass_name.unwrap_or("-----"), body)?
+                create_dump_file(tcx, ".dot", true, A::NAME, &pass_name.unwrap_or("-----"), body)?


I don't object to this change, but it isn't strictly speaking necessary for this PR, is it?

pnkfelix · 2024-04-12T14:25:43Z

compiler/rustc_mir_dataflow/src/impls/liveness.rs

-pub struct TransferFunction<'a, T>(pub &'a mut T);
+pub struct TransferFunction<'a, T> {
+    pub trans: &'a mut T,
+    pub upvars: Option<(Local, usize)>,


I don't remember if earlier iterations on this PR had these changes to the various dataflow impls; its possible they did and I just overlooked them.

But seeing them now with fresh eyes, it just seems unfortunate that so many dataflow implementations seem like they are expected to duplicate this upvars: Option<(Local, usize)> state pattern. I would be very concerned about whether it is a footgun to forget to do it, and also whether it makes all of the dataflow systems harder to maintain.

Having said that, I also don't have a better suggestion about what to do for the short term. It probably is something to bring up with the broader team, in terms of explaining why the pattern is necessary and seeing if anyone has ideas for another way to accomplish the objective.

pnkfelix · 2024-04-12T14:34:22Z

I'm going to nominate this PR for discussion at next week's T-compiler meeting.

The essential question I want to pose is: Are we okay with the upvars: Option<(Local, usize)> this change is adding to various dataflow analyses, at least in the short term so that we can make forward progress on the problem of generator size explosion ?

Or should we wait to land this PR until after we have a discussion between @dingxiangfei2009, @pnkfelix, and whichever team members are interested in any of {dataflow, MIR representation, generator representation} to try to tease how what the "right answer" is here.

@rustbot label: +I-compiler-nominated

pnkfelix · 2024-04-18T15:03:36Z

Okay, the team discussed briefly at our team meeting.

@dingxiangfei2009 has opened up a dedicated zulip topic to discuss the matter, at: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Seeking.20opinions.20about.20lifting.20upvars.20in.20coroutine.20and.20fu.2E.2E.2E

In the above zulip topic, @dingxiangfei2009 wrote:

I am still investigating if we can split the BitSet domain into Locals and Upvars. Despite the extensive exposure to assumptions that only true Locals exists, I believe that it is still doable, so that existing and new analyses can be sound again, whether they need to handle upvars or not.

In the team discussion, we generally agreed that we'd prefer to see @dingxiangfei2009 explore that split-bitset alternative design, because the approach in this PR seems more brittle than we would like, assuming there are better alternatives.

@rustbot author

pnkfelix · 2024-04-18T15:03:59Z

@rustbot label: -I-compiler-nominated

oskgo · 2024-07-26T05:34:40Z

Triage: Closing since this is replaced by #127522.

Note: if you are going to continue please open the PR BEFORE you push to it, else you won't be able to reopen - this is a quirk of github.
Thanks for your contribution.

@rustbot label: +S-inactive

rustbot assigned petrochenkov Jan 20, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 20, 2024

dingxiangfei2009 force-pushed the coroutine-upvar branch from 4b04bb1 to da13acc Compare January 20, 2024 18:39