[For previewing, reviewing and testing] Move upvars to locals #135527
Some changes occurred in compiler/rustc_codegen_cranelift
cc @bjorn3
Some changes occurred to the CTFE / Miri interpreter
cc @rust-lang/miri
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt
Some changes occurred to the CTFE machinery
cc @rust-lang/wg-const-eval
☔ The latest upstream changes (presumably #135715) made this pull request unmergeable. Please resolve the merge conflicts.
I don't think this needs a reviewer?
3e6a399 to 9603ad6
cc @Darksonn @tmandry @eholk @rust-lang/wg-async Ding here is reworking the layout of coroutines to try to reduce their memory footprint (and that of What do people think? |
For anyone searching for a description of what this PR changes, it's summarized at the top of compiler/rustc_mir_transform/src/coroutine/relocate_upvars.rs.
//! The reason is that it is possible that coroutine layout may change and the source memory location of
//! an upvar may not necessarily be mapped exactly to the same place as in the `Unresumed` state.
Don't we decide the offsets of upvars in `Unresumed` in the same place as we decide the offset of saved locals? Couldn't we then "backpropagate" the field offsets for each upvar's local as the offset for the corresponding upvar?
Thank you for reviewing! I had a backlog of things due to sickness.
True indeed. This statement is completely voided by the work in the second commit. I will reword this section in the following way.
With the feature gate `coroutine_new_layout` enabled, the field offsets of the upvars in the `Unresumed` state are placed at exactly the same offsets as their corresponding saved locals. This is guaranteed by the alternative coroutine layout calculator that then takes effect. <... quote the relevant comment/file/etc. ...>
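For readers following along, here is a minimal sketch of the kind of coroutine being discussed. It is an illustration written for this thread, not code from the PR: `yield_point`, `make_future` and `future_size` are hypothetical names. The captured `buf` is an upvar, and `local` is a saved local because it lives across the `.await`. The sketch only prints the size of the resulting future, so you can compare a stock compiler against a build of this branch. No particular number is claimed here, since the exact layout depends on the compiler in use.

```rust
use std::future::Future;
use std::mem::size_of_val;

/// A trivial await point, so that `local` below must be saved across a suspension.
async fn yield_point() {}

/// Builds a future whose coroutine captures `buf` as an upvar.
fn make_future(buf: [u8; 1024]) -> impl Future<Output = u8> {
    async move {
        // `buf` is the upvar; `local` is a local that is live across the await.
        // Whether the two end up sharing one slot is a layout decision of the compiler.
        let local = buf;
        yield_point().await;
        local[0]
    }
}

/// Size in bytes of the (never polled) future, including its captured upvar.
fn future_size() -> usize {
    size_of_val(&make_future([0u8; 1024]))
}

fn main() {
    println!("future size: {} bytes", future_size());
}
```

The future is never polled; `size_of_val` only inspects the static layout, which is what the upvar-versus-saved-local question is about.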
I don't personally have any means of performance testing this at the moment. It would be much easier if it landed behind a feature gate.
☔ The latest upstream changes (presumably #135318) made this pull request unmergeable. Please resolve the merge conflicts.
Cc @arielb1 who was also investigating this
On Wed, Jan 29, 2025, at 7:56 PM, Tyler Mandry wrote:
> In compiler/rustc_mir_transform/src/coroutine/relocate_upvars.rs:
> > +//! The reason is that it is possible that coroutine layout may change and the source memory location of
> > +//! an upvar may not necessarily be mapped exactly to the same place as in the `Unresumed` state.
> Don't we decide the offsets of upvars in `Unresumed` in the same place as we decide the offset of saved locals? Couldn't we then "backpropagate" the field offsets for each upvar's local as the offset for the corresponding upvar?
I think it is fair to land with a feature gate so that we can get to play with it. The PR has temporarily disabled the check on the feature gate. However, given that coroutine layout data is keyed individually by their |
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
9603ad6 to 3a1e04a
3a1e04a to 61d4bbd
Would this be better as a |
Are there any issues if only one crate activates it but others do not? If there are no issues, a feature gate seems ok (and easier to use ^^)
☔ The latest upstream changes (presumably #137030) made this pull request unmergeable. Please resolve the merge conflicts.
A feature doesn't allow turning it on for the whole build; you'd have to fork every single crate that uses async. A -Z flag would be better IMO.
Agreed on a If my understanding is correct, we shouldn't expect any regression from this approach (only upside), but since we currently rely on later passes eliding copies there might be some regression. We could be more aggressive in eliding the copies ourselves, but maybe this is hard. |
Thanks for looking into this! I will have time this week to clean this up a bit and I will ask rustbot to set it to ready-for-review.
Good day, this PR is related to #127522 and it makes it easier for the public to test out a new coroutine/`async` state machine directly.

Prepare the compiler for tests
For starters, you may build the compiler as prescribed in the `rustc-dev-guide` instructions. If a test in the docker container is desirable, you may build this compiler with `src/ci/docker/run.sh dist-x86_64-linux --dev` for `x86_64` and package the compiler with `../x dist` to produce the artifacts in `obj/dist-x86_64-linux/build/dist`. This Dockerfile gets you a working Rust builder image which allows you to build your Rust applications in `bookworm`.

The state of performance
So far with this patch, I have been studying the performance impact on the cases of `tokio`'s single- and multi-threaded runtime, as well as a simple `axum` HTTP service. As far as I can see, I can find a change in performance characteristics that is statistically significant, one-sided `p = 0.05`.

This time, I would like to ask you to pool your valuable assessments and thoughts on this patch. I kindly request experiments from you, and hopefully you can provide regression cases with `perf record -e cycles:u,instructions:u,cache-misses:u` reports.

Thank you all so much! 🙇
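For anyone who wants to report timings alongside `perf` data, the sketch below shows one way to compare two sets of benchmark samples. It is an assumption on my part, not the method used for the numbers above: it computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom, which you would then check against a one-sided t table at your chosen level. The function names and the sample values in `main` are made up for illustration.

```rust
/// Arithmetic mean of the samples.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Unbiased sample variance (divides by n - 1).
fn sample_var(xs: &[f64]) -> f64 {
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0)
}

/// Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples.
/// A positive t means that `a` has the larger mean.
fn welch_t(a: &[f64], b: &[f64]) -> (f64, f64) {
    let (na, nb) = (a.len() as f64, b.len() as f64);
    // Variance of each sample mean.
    let (va, vb) = (sample_var(a) / na, sample_var(b) / nb);
    let t = (mean(a) - mean(b)) / (va + vb).sqrt();
    let df = (va + vb).powi(2) / (va.powi(2) / (na - 1.0) + vb.powi(2) / (nb - 1.0));
    (t, df)
}

fn main() {
    // Made-up timings in milliseconds, purely for illustration.
    let baseline = [10.2, 10.4, 10.1, 10.3, 10.5];
    let patched = [9.8, 9.9, 10.0, 9.7, 9.9];
    let (t, df) = welch_t(&baseline, &patched);
    println!("t = {t:.3}, df = {df:.1}");
}
```

Welch's variant is used here because it does not assume that both runs have the same variance, which seems a safer default for benchmark timings.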