Cranelift: Add instructions for getting the current stack/frame/return pointers #4573
A few thoughts here (and apply the same comments to x64 as for aarch64 below) but this general approach looks fine; thanks!
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:meta", "isle"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Force-pushed from 16e3fd6 to 9cd12ca.
…ers and return address This is the initial part of bytecodealliance#4535
Force-pushed from 9cd12ca to 2dd8b1c.
@uweigand would you mind taking a look at the s390x-specific bits in here? I'm not 100% sure I got it right. I think you should be able to just look at the test expectations to quickly check here. Thanks!
Looks good to me! I was waffling back and forth re: the new rbp `Amode` below; on balance I think I like the approach you took better than the idea I had but I'll keep the comment for completeness. Probably enough just to add a comment describing why it's a separate `Amode`.
@@ -298,16 +298,29 @@ pub(crate) fn emit_std_enc_mem(
    prefixes.emit(sink);

    let mem_e = match mem_e.clone() {
        Amode::RbpOffset { simm32, flags } => {
This is to avoid providing the base reg as an operand to the operand collector, I guess?
I wonder if it might be cleaner, or less error-prone at least, to handle this case in the `Amode::get_operands` and `Amode::with_allocs` methods instead: skip providing `rbp` (or `rsp`) as an operand, and skip pulling an alloc from the `AllocationConsumer` if the original register was `rbp` (or `rsp`).
(I'm actually not sure; I'm going back and forth right now.)
The general concern I had with this is that we may not handle the same lowering everywhere. But actually I think everything funnels to this one lowering helper so maybe this is sufficient.
If we do stick to this approach, let's add a comment where we define the `RbpOffset` arm that it must be used for `rbp` (as opposed to the generic forms) because `rbp` cannot be given as an arg to regalloc.
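The alternative floated in this comment (skip `rbp`/`rsp` when reporting operands, and skip consuming an allocation for them) can be sketched with a toy model. Everything below is a hypothetical stand-in for illustration: the `Reg` and `Amode` types are simplified, the operand collector is a plain `Vec`, and a plain iterator plays the role of the `AllocationConsumer`. It is not Cranelift's actual code.

```rust
// Toy model only: these types are hypothetical stand-ins, not Cranelift's
// real definitions.

/// A register as seen by this toy lowering.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Reg {
    /// A virtual register that the allocator must assign.
    Virtual(u32),
    /// An allocatable physical register chosen by the allocator.
    Physical(u8),
    /// Frame pointer: never handed to the allocator.
    Rbp,
    /// Stack pointer: never handed to the allocator.
    Rsp,
}

impl Reg {
    /// True for registers that must not be reported as regalloc operands.
    pub fn is_unallocatable(self) -> bool {
        matches!(self, Reg::Rbp | Reg::Rsp)
    }
}

/// Simplified addressing mode: `base + simm32`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Amode {
    pub base: Reg,
    pub simm32: i32,
}

impl Amode {
    /// Report operands to the toy collector, skipping rbp/rsp.
    pub fn get_operands(&self, collector: &mut Vec<Reg>) {
        if !self.base.is_unallocatable() {
            collector.push(self.base);
        }
    }

    /// Apply allocations. An allocation is consumed only when an operand
    /// was reported in `get_operands`, so the two methods stay in sync.
    pub fn with_allocs(&self, allocs: &mut impl Iterator<Item = Reg>) -> Amode {
        let base = if self.base.is_unallocatable() {
            self.base
        } else {
            allocs.next().expect("one allocation per reported operand")
        };
        Amode { base, simm32: self.simm32 }
    }
}
```

The point of the sketch is the symmetry: both methods branch on the same predicate, so an `rbp`-based address never contributes an operand and never consumes an allocation.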
Yeah I actually like this better.
How are these instructions expected to behave in the face of inlining?
Hi @fitzgen, the For this is to be always correct, we instead have to read the return address from the stack slot where it was saved in the prologue. However, that doesn't happen in leaf functions - unless it is forced via the If so, we can read the return address from the slot at "initial stack pointer + 14*8". This could be done either via Also, I'm a bit confused about why that new (Or, in the alternative, even something like
We just special case getting operands from `Amode`s now.
We don't have inlining so it is a bit of a hypothetical question at the moment, but these instructions are about physical frames, so I imagine we would want to ignore logical frames that don't have a corresponding physical frame (i.e. they have been inlined) and just return information about the current physical frame regardless.
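To make "physical frame" concrete, here is a minimal sketch of walking a chain of physical frames. It assumes the conventional frame-pointer layout used on x86-64 (the word at `fp` holds the caller's saved frame pointer and the word at `fp + 8` holds the return address) and assumes a saved frame pointer of 0 marks the end of the chain. The `Stack` type and `walk_return_addresses` function are hypothetical names for illustration, not part of this PR. An inlined callee has no entry in such a chain, which is why it would simply not appear.

```rust
/// Toy stack memory holding 8-byte words; byte address = index * 8.
/// Hypothetical helper type for illustration only.
pub struct Stack {
    words: Vec<u64>,
}

impl Stack {
    pub fn new(words: Vec<u64>) -> Self {
        Stack { words }
    }

    /// Load the 8-byte word at the given (8-byte aligned) byte address.
    pub fn load(&self, addr: u64) -> u64 {
        self.words[(addr / 8) as usize]
    }
}

/// Walk physical frames starting at `fp`, collecting return addresses.
///
/// Assumed layout: `[fp]` is the caller's saved frame pointer and
/// `[fp + 8]` is the return address. A saved frame pointer of 0 ends
/// the walk (an assumption of this sketch).
pub fn walk_return_addresses(stack: &Stack, mut fp: u64) -> Vec<u64> {
    let mut return_addresses = Vec::new();
    while fp != 0 {
        return_addresses.push(stack.load(fp + 8));
        fp = stack.load(fp);
    }
    return_addresses
}
```

Only frames that physically pushed a frame record show up in the result; nothing in the walk can observe a logical frame that was folded into its caller.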
It is about getting an unallocatable physical register's value into a virtual register, and avoiding the pinned vregs that we want to remove support for from |
This came out of some discussion that @fitzgen and I had: I would like to push the backends gradually away from using the "pinned vregs", or vregs that represent physical registers directly, and instead use operand constraints wherever possible. This is because handling moves-from-pinned and moves-to-pinned-vregs introduces a bunch of special cases in regalloc, and leads to worse performance (it uses a whole instruction, the move, to communicate what should be just a constraint) and sometimes bugs (e.g. bytecodealliance/regalloc2#60). Just yesterday I was helping to find the cause for a weird So it's a bit forward-looking, but there is a reason! At some point later I (or someone) will go through and systematically update the remaining uses.
Okay I pushed a couple new commits:
A few tidbits on the `RbpOffset` -> `ImmReg` change (looks good overall though!).
LGTM. I've added a couple of inline comments, but those are really just cosmetic.
Oh, maybe one more thing comes to mind: it would be great if the affected ISLE patterns actually verified that
Then we'd get a compile-time error instead of random run-time crashes if the assumption was violated.
I see, thanks for the explanation!
The CLIF verifier does check this condition, so by the time we lower the CLIF, we know we have valid CLIF, and this check should be unnecessary. I wouldn't mind the redundancy if it were a simple one-line debug assertion, but since it would involve new external ISLE extractors, I don't think it is quite worth it.
Ah, good! I hadn't seen that.
I didn't see an answer to my question in #4573 (comment).
@fitzgen replied to your comment in this comment.
Missed that comment. Thanks @cfallin.
This is the initial part of #4535
TODO: I need to implement the s390x lowering for these new instructions.