Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

Merged

Conversation

fitzgen
Copy link
Member

@fitzgen fitzgen commented Aug 1, 2022

This is the initial part of #4535

TODO: I need to implement the s390x lowering for these new instructions.

@fitzgen fitzgen changed the title Cranelift: Add instruction for getting the current stack/frame/return pointers Cranelift: Add instructions for getting the current stack/frame/return pointers Aug 1, 2022
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts here (and apply the same comments to x64 as for aarch64 below) but this general approach looks fine; thanks!

cranelift/codegen/meta/src/shared/instructions.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/inst/emit.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/inst/emit.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/inst/mod.rs Outdated Show resolved Hide resolved
cranelift/filetests/filetests/isa/x64/fp_sp_pc.clif Outdated Show resolved Hide resolved
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. isle Related to the ISLE domain-specific language labels Aug 2, 2022
@github-actions
Copy link

github-actions bot commented Aug 2, 2022

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@fitzgen fitzgen force-pushed the clif-stack-frame-return-pointer branch 2 times, most recently from 16e3fd6 to 9cd12ca Compare August 2, 2022 01:19
@fitzgen fitzgen force-pushed the clif-stack-frame-return-pointer branch from 9cd12ca to 2dd8b1c Compare August 2, 2022 01:19
@fitzgen fitzgen marked this pull request as ready for review August 2, 2022 01:20
@fitzgen fitzgen requested a review from cfallin August 2, 2022 01:20
@fitzgen
Copy link
Member Author

fitzgen commented Aug 2, 2022

@uweigand would you mind taking a look at the s390x-specific bits in here? I'm not 100% sure I got it right. I think you should be able to just look at the test expectations to quickly check here. Thanks!

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I was waffling back and forth re: the new rbp Amode below; on balance I think I like the approach you took better than the idea I had but I'll keep the comment for completeness. Probably enough just to add a comment describing why it's a separate Amode.

@@ -298,16 +298,29 @@ pub(crate) fn emit_std_enc_mem(

prefixes.emit(sink);

let mem_e = match mem_e.clone() {
Amode::RbpOffset { simm32, flags } => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to avoid providing the base reg as an operand to the operand collector, I guess?

I wonder if it might be cleaner, or less error-prone at least, to handle this case in the Amode::get_operands and Amode::with_allocs methods instead: skip providing rbp (or rsp) as an operand, and skip pulling an alloc from the AllocationConsumer if the original register was rbp (or rsp).

(I'm actually not sure; I'm going back and forth right now.)

The general concern I had with this is that we may not handle the same lowering everywhere. But actually I think everything funnels to this one lowering helper so maybe this is sufficient.

If we do stick to this approach, let's add a comment where we define the RbpOffset arm that it must be used for rbp (as opposed to the generic forms) because rbp cannot be given as an arg to regalloc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I actually like this better.

@bjorn3
Copy link
Contributor

bjorn3 commented Aug 2, 2022

How are these instructions expected to behave in the face of inlining?

@uweigand
Copy link
Member

uweigand commented Aug 2, 2022

@uweigand would you mind taking a look at the s390x-specific bits in here? I'm not 100% sure I got it right. I think you should be able to just look at the test expectations to quickly check here. Thanks!

Hi @fitzgen, the get_return_address implementation isn't correct in the general case. It looks like you're assuming r14 holds the return address throughout execution of the function? That isn't true, it is only guaranteed to hold the return address at the ABI boundaries, everywhere else it's just a normal allocatable general-purpose register.

For this is to be always correct, we instead have to read the return address from the stack slot where it was saved in the prologue. However, that doesn't happen in leaf functions - unless it is forced via the preserve_frame_pointers flag. Can we make that assumption for get_return_address? It seems it is already made for get_frame_pointer.

If so, we can read the return address from the slot at "initial stack pointer + 14*8". This could be done either via MemArg::InitialSPOffset (which currently isn't exposed to ISLE yet, but it should be straightforward to add a memarg_initial_stack_off or similar). In the alternative, the initial stack pointer actually equals the frame pointer (backchain) value, so it could also be implemented via something like (memarg_reg_plus_off (get_frame_pointer) 112 0 (memarg_trusted)).

Also, I'm a bit confused about why that new mov_preg thing is necessary. I'd have expected to use something like (copy_reg (writable_reg 15)) instead, or maybe even better some new (copy_reg (stack_reg)) - this is already used in other places to retrieve the value of a physical register.

(Or, in the alternative, even something like (load_addr (memarg_stack_off 0)) would work to get the stack pointer.)

We just special case getting operands from `Amode`s now.
@fitzgen
Copy link
Member Author

fitzgen commented Aug 2, 2022

How are these instructions expected to behave in the face of inlining?

We don't have inlining so it is a bit of a hypothetical question at the moment, but these instructions are about physical frames, so I imagine we would want to ignore logical frames that don't have a corresponding physical frame (i.e. they have been inlined) and just return information about the current physical frame regardless.

@fitzgen
Copy link
Member Author

fitzgen commented Aug 2, 2022

Also, I'm a bit confused about why that new mov_preg thing is necessary.

It is about getting an unallocatable physical register's value into a virtual register, and avoiding the pinned vregs that we want to remove support for from regalloc2, and which just existed in order to get the initial migration working.

@cfallin
Copy link
Member

cfallin commented Aug 2, 2022

Also, I'm a bit confused about why that new mov_preg thing is necessary. I'd have expected to use something like (copy_reg (writable_reg 15)) instead, or maybe even better some new (copy_reg (stack_reg)) - this is already used in other places to retrieve the value of a physical register.

This came out of some discussion that @fitzgen and I had: I would like to push the backends gradually away from using the "pinned vregs", or vregs that represent physical registers directly, and instead use operand constraints wherever possible. This is because handling moves-from-pinned and moves-to-pinned-vregs introduces a bunch of special cases in regalloc, and leads to worse performance (it uses a whole instruction, the move, to communicate what should be just a constraint) and sometimes bugs (e.g. bytecodealliance/regalloc2#60). Just yesterday I was helping to find the cause for a weird mov rA, rB; mov rB, rA sequence and it turned out to be because of moves-to-pregs before a call (in x64 in this case) where operand constraints would have led to properly elided moves.

So it's a bit forward-looking, but there is a reason! At some point later I (or someone) will go through and systematically update the remaining uses.

@fitzgen
Copy link
Member Author

fitzgen commented Aug 2, 2022

Okay I pushed a couple new commits:

  • switched x64 away from the new variant of Amode and just check the base register when getting operands for Amode::ImmReg, which I think is a bit nicer.
  • Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few tidbits on the RbpOffset -> ImmReg change (looks good overall though!).

cranelift/codegen/src/isa/x64/inst/args.rs Show resolved Hide resolved
cranelift/codegen/src/isa/x64/inst/args.rs Show resolved Hide resolved
@uweigand
Copy link
Member

uweigand commented Aug 2, 2022

Okay I pushed a couple new commits:
* Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

LGTM. I've added a couple of inline comments, but those are really just cosmetic.

@uweigand
Copy link
Member

uweigand commented Aug 2, 2022

Okay I pushed a couple new commits:

  • Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

LGTM. I've added a couple of inline comments, but those are really just cosmetic.

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact. Maybe some predicate in prelude.isle that could be used like so:

(rule (lower (get_frame_pointer))
      (if (preserve_frame_pointer))
      (load64 (memarg_stack_off 0 0)))

Then we'd get a compile-time error instead of random run-time crashes if the assumption was violated.

@uweigand
Copy link
Member

uweigand commented Aug 2, 2022

I would like to push the backends gradually away from using the "pinned vregs", or vregs that represent physical registers directly, and instead use operand constraints wherever possible.

I see, thanks for the explanation!

@fitzgen
Copy link
Member Author

fitzgen commented Aug 2, 2022

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact.

The CLIF verifier does check this condition, so by the time we lower the CLIF, we know we have valid CLIF, and this check should be unnecessary. I wouldn't mind the redundancy if it were a simple one-line debug assertion, but since it would involve new external ISLE extractors, I don't think it is quite worth it.

@fitzgen fitzgen merged commit 42bba45 into bytecodealliance:main Aug 2, 2022
@fitzgen fitzgen deleted the clif-stack-frame-return-pointer branch August 2, 2022 21:37
@uweigand
Copy link
Member

uweigand commented Aug 2, 2022

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact.

The CLIF verifier does check this condition

Ah, good! I hadn't see that.

@bjorn3
Copy link
Contributor

bjorn3 commented Aug 3, 2022

I didn't see an answer to my question in #4573 (comment).

@cfallin
Copy link
Member

cfallin commented Aug 3, 2022

I didn't see an answer to my question in #4573 (comment).

@fitzgen replied to your comment in this comment.

@bjorn3
Copy link
Contributor

bjorn3 commented Aug 4, 2022

Missed that comment. Thanks @cfallin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants