Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

fitzgen · 2022-08-01T23:23:01Z

This is the initial part of #4535

~~TODO: I need to implement the s390x lowering for these new instructions.~~

cfallin

A few thoughts here (and apply the same comments to x64 as for aarch64 below) but this general approach looks fine; thanks!

cranelift/codegen/meta/src/shared/instructions.rs

cranelift/codegen/src/isa/aarch64/inst/emit.rs

cranelift/codegen/src/isa/aarch64/inst/mod.rs

cranelift/filetests/filetests/isa/x64/fp_sp_pc.clif

github-actions · 2022-08-02T00:04:58Z

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle
fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

…ers and return address This is the initial part of bytecodealliance#4535

fitzgen · 2022-08-02T01:23:34Z

@uweigand would you mind taking a look at the s390x-specific bits in here? I'm not 100% sure I got it right. I think you should be able to just look at the test expectations to quickly check here. Thanks!

cfallin

Looks good to me! I was waffling back and forth re: the new rbp Amode below; on balance I think I like the approach you took better than the idea I had but I'll keep the comment for completeness. Probably enough just to add a comment describing why it's a separate Amode.

cfallin · 2022-08-02T01:33:45Z

cranelift/codegen/src/isa/x64/encoding/rex.rs

@@ -298,16 +298,29 @@ pub(crate) fn emit_std_enc_mem(

    prefixes.emit(sink);

+    let mem_e = match mem_e.clone() {
+        Amode::RbpOffset { simm32, flags } => {


This is to avoid providing the base reg as an operand to the operand collector, I guess?

I wonder if it might be cleaner, or less error-prone at least, to handle this case in the Amode::get_operands and Amode::with_allocs methods instead: skip providing rbp (or rsp) as an operand, and skip pulling an alloc from the AllocationConsumer if the original register was rbp (or rsp).

(I'm actually not sure; I'm going back and forth right now.)

The general concern I had with this is that we may not handle the same lowering everywhere. But actually I think everything funnels to this one lowering helper so maybe this is sufficient.

If we do stick to this approach, let's add a comment where we define the RbpOffset arm that it must be used for rbp (as opposed to the generic forms) because rbp cannot be given as an arg to regalloc.

Yeah I actually like this better.

bjorn3 · 2022-08-02T06:46:03Z

How are these instructions expected to behave in the face of inlining?

uweigand · 2022-08-02T12:31:19Z

@uweigand would you mind taking a look at the s390x-specific bits in here? I'm not 100% sure I got it right. I think you should be able to just look at the test expectations to quickly check here. Thanks!

Hi @fitzgen, the get_return_address implementation isn't correct in the general case. It looks like you're assuming r14 holds the return address throughout execution of the function? That isn't true, it is only guaranteed to hold the return address at the ABI boundaries, everywhere else it's just a normal allocatable general-purpose register.

For this is to be always correct, we instead have to read the return address from the stack slot where it was saved in the prologue. However, that doesn't happen in leaf functions - unless it is forced via the preserve_frame_pointers flag. Can we make that assumption for get_return_address? It seems it is already made for get_frame_pointer.

If so, we can read the return address from the slot at "initial stack pointer + 14*8". This could be done either via MemArg::InitialSPOffset (which currently isn't exposed to ISLE yet, but it should be straightforward to add a memarg_initial_stack_off or similar). In the alternative, the initial stack pointer actually equals the frame pointer (backchain) value, so it could also be implemented via something like (memarg_reg_plus_off (get_frame_pointer) 112 0 (memarg_trusted)).

Also, I'm a bit confused about why that new mov_preg thing is necessary. I'd have expected to use something like (copy_reg (writable_reg 15)) instead, or maybe even better some new (copy_reg (stack_reg)) - this is already used in other places to retrieve the value of a physical register.

(Or, in the alternative, even something like (load_addr (memarg_stack_off 0)) would work to get the stack pointer.)

We just special case getting operands from `Amode`s now.

fitzgen · 2022-08-02T16:44:55Z

How are these instructions expected to behave in the face of inlining?

We don't have inlining so it is a bit of a hypothetical question at the moment, but these instructions are about physical frames, so I imagine we would want to ignore logical frames that don't have a corresponding physical frame (i.e. they have been inlined) and just return information about the current physical frame regardless.

fitzgen · 2022-08-02T16:46:57Z

Also, I'm a bit confused about why that new mov_preg thing is necessary.

It is about getting an unallocatable physical register's value into a virtual register, and avoiding the pinned vregs that we want to remove support for from regalloc2, and which just existed in order to get the initial migration working.

cfallin · 2022-08-02T16:48:59Z

Also, I'm a bit confused about why that new mov_preg thing is necessary. I'd have expected to use something like (copy_reg (writable_reg 15)) instead, or maybe even better some new (copy_reg (stack_reg)) - this is already used in other places to retrieve the value of a physical register.

This came out of some discussion that @fitzgen and I had: I would like to push the backends gradually away from using the "pinned vregs", or vregs that represent physical registers directly, and instead use operand constraints wherever possible. This is because handling moves-from-pinned and moves-to-pinned-vregs introduces a bunch of special cases in regalloc, and leads to worse performance (it uses a whole instruction, the move, to communicate what should be just a constraint) and sometimes bugs (e.g. bytecodealliance/regalloc2#60). Just yesterday I was helping to find the cause for a weird mov rA, rB; mov rB, rA sequence and it turned out to be because of moves-to-pregs before a call (in x64 in this case) where operand constraints would have led to properly elided moves.

So it's a bit forward-looking, but there is a reason! At some point later I (or someone) will go through and systematically update the remaining uses.

fitzgen · 2022-08-02T17:09:10Z

Okay I pushed a couple new commits:

switched x64 away from the new variant of Amode and just check the base register when getting operands for Amode::ImmReg, which I think is a bit nicer.
Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

cfallin

A few tidbits on the RbpOffset -> ImmReg change (looks good overall though!).

cranelift/codegen/src/isa/x64/inst/args.rs

cranelift/codegen/src/isa/s390x/inst.isle

cranelift/codegen/src/isa/s390x/inst/emit.rs

uweigand · 2022-08-02T17:44:14Z

Okay I pushed a couple new commits:
* Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

LGTM. I've added a couple of inline comments, but those are really just cosmetic.

uweigand · 2022-08-02T17:47:46Z

Okay I pushed a couple new commits:

Fixed s390x per @uweigand's feedback (thanks! would appreciate if you could give this another quick once over)

LGTM. I've added a couple of inline comments, but those are really just cosmetic.

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact. Maybe some predicate in prelude.isle that could be used like so:

(rule (lower (get_frame_pointer))
      (if (preserve_frame_pointer))
      (load64 (memarg_stack_off 0 0)))

Then we'd get a compile-time error instead of random run-time crashes if the assumption was violated.

uweigand · 2022-08-02T17:48:18Z

I would like to push the backends gradually away from using the "pinned vregs", or vregs that represent physical registers directly, and instead use operand constraints wherever possible.

I see, thanks for the explanation!

fitzgen · 2022-08-02T20:11:20Z

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact.

The CLIF verifier does check this condition, so by the time we lower the CLIF, we know we have valid CLIF, and this check should be unnecessary. I wouldn't mind the redundancy if it were a simple one-line debug assertion, but since it would involve new external ISLE extractors, I don't think it is quite worth it.

… `MovPReg`

uweigand · 2022-08-02T22:35:28Z

Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that preserve_frame_pointer is true if they depend on that fact.

The CLIF verifier does check this condition

Ah, good! I hadn't see that.

bjorn3 · 2022-08-03T18:20:09Z

I didn't see an answer to my question in #4573 (comment).

cfallin · 2022-08-03T18:40:37Z

I didn't see an answer to my question in #4573 (comment).

@fitzgen replied to your comment in this comment.

bjorn3 · 2022-08-04T12:51:47Z

Missed that comment. Thanks @cfallin.

fitzgen changed the title ~~Cranelift: Add instruction for getting the current stack/frame/return pointers~~ Cranelift: Add instructions for getting the current stack/frame/return pointers Aug 1, 2022

cfallin reviewed Aug 1, 2022

View reviewed changes

fitzgen force-pushed the clif-stack-frame-return-pointer branch 2 times, most recently from 16e3fd6 to 9cd12ca Compare August 2, 2022 01:19

Cranelift: Add instructions for getting the current stack/frame point…

2dd8b1c

…ers and return address This is the initial part of bytecodealliance#4535

fitzgen force-pushed the clif-stack-frame-return-pointer branch from 9cd12ca to 2dd8b1c Compare August 2, 2022 01:19

fitzgen marked this pull request as ready for review August 2, 2022 01:20

fitzgen requested a review from cfallin August 2, 2022 01:20

cfallin approved these changes Aug 2, 2022

View reviewed changes

x64: Remove Amode::RbpOffset and use Amode::ImmReg instead

e866c5f

We just special case getting operands from `Amode`s now.

Fix s390x get_return_address; require preserve_frame_pointers=true

e227328

cfallin reviewed Aug 2, 2022

View reviewed changes

cranelift/codegen/src/isa/x64/inst/args.rs Show resolved Hide resolved

cranelift/codegen/src/isa/x64/inst/args.rs Show resolved Hide resolved

uweigand reviewed Aug 2, 2022

View reviewed changes

cranelift/codegen/src/isa/s390x/inst.isle Outdated Show resolved Hide resolved

cranelift/codegen/src/isa/s390x/inst/emit.rs Outdated Show resolved Hide resolved

fitzgen added 2 commits August 2, 2022 13:05

Assert that Amode::ImmRegRegShift doesn't use rbp/rsp

45bf4cb

Handle non-allocatable registers in Amode::with_allocs

f10c6fd

fitzgen added 2 commits August 2, 2022 13:12

Use "stack" instead of "r15" on s390x

ad01c28

r14 is an allocatable register on s390x, so it shouldn't be used with…

3ba1e8f

… `MovPReg`

fitzgen merged commit 42bba45 into bytecodealliance:main Aug 2, 2022

fitzgen deleted the clif-stack-frame-return-pointer branch August 2, 2022 21:37

cfallin mentioned this pull request Aug 4, 2022

Remove the need for inline assembly trampolines used by Wasmtime #4611

Closed

jameysharp mentioned this pull request Oct 27, 2022

Several questions about using Cranelift as a backend for a programming language #5141

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

fitzgen commented Aug 1, 2022 •

edited

Loading

cfallin left a comment

github-actions bot commented Aug 2, 2022

fitzgen commented Aug 2, 2022

cfallin left a comment

cfallin Aug 2, 2022

fitzgen Aug 2, 2022

bjorn3 commented Aug 2, 2022

uweigand commented Aug 2, 2022

fitzgen commented Aug 2, 2022

fitzgen commented Aug 2, 2022

cfallin commented Aug 2, 2022

fitzgen commented Aug 2, 2022

cfallin left a comment

uweigand commented Aug 2, 2022

uweigand commented Aug 2, 2022

uweigand commented Aug 2, 2022

fitzgen commented Aug 2, 2022

uweigand commented Aug 2, 2022

bjorn3 commented Aug 3, 2022

cfallin commented Aug 3, 2022

bjorn3 commented Aug 4, 2022

Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

Cranelift: Add instructions for getting the current stack/frame/return pointers #4573

Conversation

fitzgen commented Aug 1, 2022 • edited Loading

cfallin left a comment

Choose a reason for hiding this comment

github-actions bot commented Aug 2, 2022

Subscribe to Label Action

fitzgen commented Aug 2, 2022

cfallin left a comment

Choose a reason for hiding this comment

cfallin Aug 2, 2022

Choose a reason for hiding this comment

fitzgen Aug 2, 2022

Choose a reason for hiding this comment

bjorn3 commented Aug 2, 2022

uweigand commented Aug 2, 2022

fitzgen commented Aug 2, 2022

fitzgen commented Aug 2, 2022

cfallin commented Aug 2, 2022

fitzgen commented Aug 2, 2022

cfallin left a comment

Choose a reason for hiding this comment

uweigand commented Aug 2, 2022

uweigand commented Aug 2, 2022

uweigand commented Aug 2, 2022

fitzgen commented Aug 2, 2022

uweigand commented Aug 2, 2022

bjorn3 commented Aug 3, 2022

cfallin commented Aug 3, 2022

bjorn3 commented Aug 4, 2022

fitzgen commented Aug 1, 2022 •

edited

Loading