Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc_codegen_ssa: turn builders "unpositioned" after emitting a terminator. #84771

Closed
wants to merge 6 commits into from

Conversation

eddyb
Copy link
Member

@eddyb eddyb commented May 1, 2021

This PR introduces the concept of an "unpositioned builder" - that is, given Bx: BuilderMethods, there is also a Bx::Unpositioned type, which is like Bx but isn't known to be in a valid position.

As such, it doesn't have methods (like those from BuilderMethods) to add instructions - the only valid use of an "unpositioned builder" (a Bx::Unpositioned) is to use position_at_end to turn it into a usable builder (a Bx).

(Bikeshed material: is the name too unwieldy? I was considering replacing "unpositioned" with "detached", which is both shorter and has a suitable verb, "detach", available)

The main use of this new split in builders is terminators: instead of simply mutating the builder like any other instruction-emitting method, adding a terminator will now take the builder by value (self instead of &mut self), and return it as an "unpositioned builder".

(Most of the commits in this PR are necessary refactors for the main change, and should be reviewed separately)

There's two reasons I think we should make terminators behave like that:

  • correctness/safety: a block can only have at most one terminator, and no other instructions after it, but LLVM in particular may let us keep appending, since it just chains instructions for each block
    • with the new approach, you lose the ability to build more in that block because you lose your builder and get back the "unpositioned" one that has no functionality without first (re)positioning it
    • by keeping around the backend object (e.g. a LLVM IRBuilder) inside the "unpositioned builder", we can safely reuse it, so no resources are being wasted (reuse isn't done in this PR, but may be in the future)
    • additionally, being able to express (e.g. within rustc_codegen_ssa::mir) "builds within a block" as "taking &mut Bx" and "may introduce control-flow" as "taking Bx and returning Bx::Unpositioned", makes it possible to reason (albeit in a limited manner) about some codegen functionality without investigating the exact logic
  • as hinted to in [HACK] rustc_codegen_ssa: remove micro-opt around cleanup_ret trampoline. #84403, I hope this can allow backends to take advantage of more restricted builder usage
    • in particular, the ability to transition Bx -> Bx::Unpositioned -> Bx (i.e. finishing a block and moving to another), is a clean and safe way to keep reusing the same builder, potentially a single builder overall, for building multiple/all blocks in a function - in particular, something taking Bx::Unpositioned can build its own blocks but not trample yours
    • Rust-GPU's rustc_codegen_spirv would benefit from this (and is the reason I started down this path), as it right now has to use interior mutability, and has no way to statically enforce correct accesses, because it has unique ownership of the IR, unlike LLVM's pointer graph that you can go from anywhere to anywhere else on
    • but rustc_codegen_llvm could also benefit, if we go down this path and eventually have unique &mut CodegenCx - we could then remove interior mutability from CodegenCx fields and mutate them directly

r? @nagisa cc @bjorn3

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 1, 2021
@eddyb
Copy link
Member Author

eddyb commented May 1, 2021

@bors try @rustc-timer queue

@bors
Copy link
Contributor

bors commented May 1, 2021

⌛ Trying commit 78e2ac0 with merge a60ed4cb5b42df993a72660af1e01e140ab753f3...

@bors
Copy link
Contributor

bors commented May 1, 2021

☀️ Try build successful - checks-actions
Build commit: a60ed4cb5b42df993a72660af1e01e140ab753f3 (a60ed4cb5b42df993a72660af1e01e140ab753f3)

@eddyb

This comment has been minimized.

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 1, 2021
@eddyb
Copy link
Member Author

eddyb commented May 1, 2021

Nope, try build already done, right:

@rust-timer build a60ed4cb5b42df993a72660af1e01e140ab753f3

@rust-timer
Copy link
Collaborator

Queued a60ed4cb5b42df993a72660af1e01e140ab753f3 with parent 8a9fa36, future comparison URL.


fn unpositioned(cx: &'a Self::CodegenCx) -> Self::Unpositioned;
fn position_at_end(bx: Self::Unpositioned, llbb: Self::BasicBlock) -> Self;
fn into_unpositioned(self) -> Self::Unpositioned;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

into_unpositioned is only used inside cg_llvm. Can you remove it from BuilderMethods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it's only used there as a shorthand, the main usecase would be hopping between blocks in cg_ssa.
I suppose I could take it out for now, it's trivial to readd later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this hopping between blocks really needed? Can't a block be codegened all at once? That would be necessary once I switch cg_clif to use cg_ssa as cranelift's builder api requires terminating a block before you can switch to the next block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(replied in #84771 (comment) to avoid losing it once I resolve this)

fn unpositioned(cx: &'a Self::CodegenCx) -> Self::Unpositioned;
fn position_at_end(bx: Self::Unpositioned, llbb: Self::BasicBlock) -> Self;
fn into_unpositioned(self) -> Self::Unpositioned;

fn new_block<'b>(cx: &'a Self::CodegenCx, llfn: Self::Function, name: &'b str) -> Self;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to make this return Self::BasicBlock in a future PR? That should allow using only a single builder for each function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the next step I want to take is to rework how blocks are generated and reuse builders more, but I didn't want to cram too much into one PR at once.

self.llblock(fx, cleanup),
self.funclet(fx),
Some(&fn_abi),
);

if let Some((ret_dest, target)) = destination {
let mut ret_bx = fx.build_block(target);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to re-use the unpositioned_bx here I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yupp! There's a lot of places like this, that I would want to tackle if this PR gets merged.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (a60ed4cb5b42df993a72660af1e01e140ab753f3): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 1, 2021
@bjorn3
Copy link
Member

bjorn3 commented May 1, 2021

Slight regression up to 0.3% with a 0.4% regression in bootstrap time.

@eddyb
Copy link
Member Author

eddyb commented May 1, 2021

That's much worse than I was expecting (at least in the sense that it's very consistent), I might want to take a closer look at all the changes.

There shouldn't be a significant cost to passing these builder (Bx/Bx::Unpositioned) values around, since they're going to be just pointer pairs.

fn add_handler(&mut self, catch_switch: &'ll Value, handler: &'ll BasicBlock) {
unsafe {
llvm::LLVMRustAddHandler(catch_switch, handler);
let catch_switch = ret.expect("LLVM does not have support for catchswitch");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe its been a while since the lowest version of LLVM that we support supports catchswitch (it was introduced in, like, 3.8 or something). This and the Option in the return value can be removed entirely, I think.

@eddyb
Copy link
Member Author

eddyb commented May 1, 2021

Replying to #84771 (comment):

Is this hopping between blocks really needed? Can't a block be codegened all at once? That would be necessary once I switch cg_clif to use cg_ssa as cranelift's builder api requires terminating a block before you can switch to the next block.

I see, rustc_codegen_spirv can handle hopping, it just really dislikes multiple builder objects. I forgot Cranelift's builder is even stricter.

So avoiding hopping is why I opened #84699, which I had given up on. It sounds like I'm going to need either that or a clever way to "allocate blocks to build later" (EDIT: see below, I realized said "clever way" while writing the comment).

The fact that there's caching means there's a possibility I haven't considered:

  1. pre-generate as little as possible
  2. build the MIR blocks, generating resources like Bx::BasicBlocks as-needed, and storing them inside the FunctionCx
  3. fill in any of the resources created in step 2 (like various empty Bx::BasicBlocks), using only the reason for them being generated (such as the unreachable block, or per-MIR-block landing_pad blocks)

This has yet another cool side-effect that we can get rid of the "delete blocks not visited by RPO" step because we could generate all Bx::BasicBlocks when they're first needed, instead of ahead-of-time.

@bjorn3
Copy link
Member

bjorn3 commented May 1, 2021

So avoiding hopping is why I opened #84699, which I had given up on. It sounds like I'm going to need either that or a clever way to "allocate blocks to build later".

Cranelift does allow allocating new blocks while the current block is being built as well as keeping blocks that are never positioned to empty. It only restricts switching to a new block. Or did I misunderstand the intent of #84699?

@eddyb
Copy link
Member Author

eddyb commented May 1, 2021

Cranelift does allow allocating new blocks

Sorry, I didn't mean to imply otherwise - what I meant is knowing "what to do later" - allocating the block is the easy part, but I didn't realize initially that it wasn't that hard to know how to build the block later on.

@nagisa
Copy link
Member

nagisa commented May 1, 2021

This seems reasonable enough to me.

@nagisa
Copy link
Member

nagisa commented May 7, 2021

@bjorn3 I'm not sure if you have any lasting concerns or not. The PR looks good enough to me implementation wise, but I don't have much insight into the limitations implied by the backends other than the LLVM one.

@eddyb
Copy link
Member Author

eddyb commented May 7, 2021

@nagisa So in terms of what I want to do, I think I want to keep at #84771 (comment) and only after get back to something like this, which should look better by then. I might combine approaches if they end up intertwined while working on it.

This is roughly why I opened #84993 separately, FWIW.

@bjorn3
Copy link
Member

bjorn3 commented May 7, 2021

@nagisa I don't have any concerns. All my review comments are things that can be done in the future because of this PR. This PR is strictly an improvement IMHO.

@nagisa
Copy link
Member

nagisa commented May 9, 2021

@eddyb okay, SGTM. To be fair the comments suggesting preference for one PR over the other are quite confusing. On this PR you're saying that you'd like to pursue #84699 before this, and on #84699 there's a comment saying that this PR could obsolete #84699, which is why I wasn't too sure.

@eddyb
Copy link
Member Author

eddyb commented May 15, 2021

@nagisa Oh I see the confusion, there was some unfortunate phrasing (besides the bad timings and all the dead-ends).

The two options for avoiding block-hopping were:

  1. create everything ahead of time (rustc_codegen_ssa: eagerly create landing pad blocks. #84699 was doing this for landing pad blocks, but I'd also need to to handle cleanup_ret blocks, the unreachable block, etc.)
  2. allocate just the empty blocks and somehow record that some building will have to be done in it later
    • a strawman version of this could be "pushing to a Vec<Box<dyn Fn(Bx)>>"
    • but more realistically, I realized that there are various caches which perfectly fit the bill of recording "these empty blocks were needed for X/Y/Z", which is enough to build into them later

So the confusion was that I only realized 2. was practical half-way through describing this, and didn't properly reformulate it as obsoleting 1.

I'll mark this PR as draft, and try to keep only one non-draft PR at a time, and hopefully that will help streamline them.

@eddyb eddyb marked this pull request as draft May 15, 2021 11:04
bors added a commit to rust-lang-ci/rust that referenced this pull request May 16, 2021
rustc_codegen_ssa: only create backend `BasicBlock`s as-needed.

Instead of creating one backend (e.g. LLVM) block per MIR block ahead of time, and then deleting the ones that weren't visited, this PR moves to creating the blocks as they're needed (either reached via the RPO visit, or used as the target of a branch from a different block).

As deleting a block was the only `unsafe` builder method (generally we only *create* backend objects, not *remove* them), that's gone now and codegen is overall a bit safer.

The only change in output is the order of LLVM blocks (which AFAIK has no semantic meaning, other than the first block being the entry block). This happens because the blocks are now created due to control-flow edges, rather than MIR block order.

I'm making this a standalone PR because I keep getting wild perf results when I change *anything* in codegen, but if you want to read more about my plans in this area, see rust-lang#84771 (comment) (and rust-lang#84771 (comment) - but that may be a bit outdated).

(You may notice some of the APIs in this PR, like `append_block`, don't help with the future plans - but I didn't want to include the necessary refactors that pass a build around everywhere, in this PR, so it's a small compromise)

r? `@nagisa` `@bjorn3`
@nagisa nagisa added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 25, 2021
@nagisa
Copy link
Member

nagisa commented Jun 25, 2021

Market this as waiting-on-author. Primarily just for my own book-keeping. Feel free to mark waiting-on-review once this is ready.

@jackh726 jackh726 added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 18, 2021
@nagisa
Copy link
Member

nagisa commented Apr 10, 2022

Lets close this for now. While this feels like something we are interested in doing, it seems like it might take some effort to bring it over the finish line, and it isn't clear if it'll materialize any time soon.

@nagisa nagisa closed this Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants