Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Skip over shadow space for ARM64EC entry thunk variadic calls #80994

Merged
merged 1 commit into from
Feb 27, 2024

Conversation

bylaws
Copy link
Contributor

@bylaws bylaws commented Feb 7, 2024

When in an entry thunk the x64 SP is passed in x4 but this cannot be directly passed through since x64 varargs calls have a 32 byte shadow store at SP followed by the in-stack parameters. ARM64EC varargs calls on the other hand expect x4 to point to the first in-stack parameter.

CC: @efriedma-quic @cjacek @mstorsjo

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 7, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Billy Laws (bylaws)

Changes

When in an entry thunk the x64 SP is passed in x4 but this cannot be directly passed through since x64 varargs calls have a 32 byte shadow store at SP followed by the in-stack parameters. ARM64EC varargs calls on the other hand expect x4 to point to the first in-stack parameter.

CC: @efriedma-quic @cjacek @mstorsjo


Full diff: https://github.com/llvm/llvm-project/pull/80994.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+11-1)
  • (modified) llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll (+1-1)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 8573939b04389..b7e1c96c23728 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -8010,7 +8010,17 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 
   if (IsVarArg && Subtarget->isWindowsArm64EC()) {
     SDValue ParamPtr = StackPtr;
-    if (IsTailCall) {
+    if (MF.getFunction().getCallingConv() == CallingConv::ARM64EC_Thunk_X64) {
+      // When in an entry thunk the x64 SP is passed via x4. This cannot
+      // be directly passed through since x64 varargs calls have a 32 byte
+      // shadow store at SP followed by the in-stack parameters,
+      // Arm64EC varargs calls on the other hand expect x4
+      // to point to the first in-stack parameter.
+      Register VReg = MF.addLiveIn(AArch64::X4, &AArch64::GPR64RegClass);
+      SDValue Val = DAG.getCopyFromReg(Chain, DL, VReg, MVT::i64);
+      SDValue PtrOff = DAG.getIntPtrConstant(32, DL);
+      ParamPtr = DAG.getNode(ISD::ADD, DL, PtrVT, Val, PtrOff);
+    } else if (IsTailCall) {
       // Create a dummy object at the top of the stack that can be used to get
       // the SP after the epilogue
       int FI = MF.getFrameInfo().CreateFixedObject(1, FPDiff, true);
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index 0083818def151..bb9ba05f7a272 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -147,7 +147,7 @@ define void @has_varargs(...) nounwind {
 ; CHECK-NEXT:     add     x29, sp, #160
 ; CHECK-NEXT:     .seh_add_fp     160
 ; CHECK-NEXT:     .seh_endprologue
-; CHECK-NEXT:     mov     x4, sp
+; CHECK-NEXT:     add     x4, x4, #32
 ; CHECK-NEXT:     mov     x5, xzr
 ; CHECK-NEXT:     blr     x9
 ; CHECK-NEXT:     adrp    x8, __os_arm64x_dispatch_ret

@bylaws
Copy link
Contributor Author

bylaws commented Feb 7, 2024

/cherry-pick dff9bb5

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 7, 2024

/cherry-pick dff9bb5

Error: Command failed due to missing milestone.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I attempted to keep the meaning of IR inside the thunks as close as possible to "normal" IR: following the same semantics, just with a slightly exotic calling convention. So someone who doesn't understand the rules here can just ignore the calling convention stuff, and understand what the IR is doing.

Here, you're violating that rule: you're making a vararg function argument passing work the wrong way. Outside of musttail calls, vararg lists don't normally get forwarded from the caller to the callee. The vararg list of a call consists of arguments explicitly passed beyond the end of the fixed argument list.

I see two ways to solve this:

  • We try to add some way to model "vararg list forwarding" to LLVM IR. This doesn't currently exist, so a bit complicated to add.
  • We don't model the caller or the callee as "varargs"; instead, we pass x4 and x5 explicitly. Basically, add "inreg" markings (or something like that) to the arguments representing x4 and x5, then mess with CC_AArch64_Arm64EC_Thunk using CCIfInReg to pass those values in registers instead of memory.

@bylaws
Copy link
Contributor Author

bylaws commented Feb 13, 2024

In general, I attempted to keep the meaning of IR inside the thunks as close as possible to "normal" IR: following the same semantics, just with a slightly exotic calling convention. So someone who doesn't understand the rules here can just ignore the calling convention stuff, and understand what the IR is doing.

Here, you're violating that rule: you're making a vararg function argument passing work the wrong way. Outside of musttail calls, vararg lists don't normally get forwarded from the caller to the callee. The vararg list of a call consists of arguments explicitly passed beyond the end of the fixed argument list.

I see two ways to solve this:

  • We try to add some way to model "vararg list forwarding" to LLVM IR. This doesn't currently exist, so a bit complicated to add.
  • We don't model the caller or the callee as "varargs"; instead, we pass x4 and x5 explicitly. Basically, add "inreg" markings (or something like that) to the arguments representing x4 and x5, then mess with CC_AArch64_Arm64EC_Thunk using CCIfInReg to pass those values in registers instead of memory.

I'm conflicted on this, it would surely be the easiest but I don't think it makes much sense to break the 1-1 mapping between x86 args and entry thunk args, modelling x4 as an argument when it isn't. Using the regular CC to call a varargs CC call also feels quite counter-intuitive. The former seems like a better idea, but modelling both zeroing x5 and offsetting x4 in a generic way seems like it would need something beyond just basic forwarding. Did you have any concrete ideas for handling that?

@efriedma-quic
Copy link
Collaborator

but I don't think it makes much sense to break the 1-1 mapping between x86 args and entry thunk args, modelling x4 as an argument when it isn't. Using the regular CC to call a varargs CC call also feels quite counter-intuitive.

AArch64Arm64ECCallLowering::getThunkArgTypes already completely throws away the type of varargs functions. Doing further rewrites doesn't seem like it's making anything worse.

The former seems like a better idea, but modelling both zeroing x5 and offsetting x4 in a generic way seems like it would need something beyond just basic forwarding. Did you have any concrete ideas for handling that?

I guess you'd make the argument list completely empty (except for sret), mark the call with a "thunk" attribute like we do for musttail varargs forwarding, then check for the "thunk" marking in isel and add the arguments. I don't really like introducing a whole new kind of varargs forwarding just for this one case, though.

@bylaws
Copy link
Contributor Author

bylaws commented Feb 17, 2024

@efriedma-quic Had an attempt at doing it the first way you mentioned, how does this look?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@efriedma-quic
Copy link
Collaborator

It looks like the regression tests are failing on the buildbots?

@dpaoliello
Copy link
Contributor

It looks like the regression tests are failing on the buildbots?

arm64ec-entry-thunks.ll failed, I guess the baseline needs to be updated?

@bylaws
Copy link
Contributor Author

bylaws commented Feb 23, 2024

@efriedma-quic forgot to push my fix for that after I did it, should be fine now

@dpaoliello
Copy link
Contributor

@bylaws it's passing on Windows, but still failing when run on Linux - weird...

When in an entry thunk the x64 SP is passed in x4 but this cannot be directly
passed through since x64 varargs calls have a 32 byte shadow store at SP
followed by the in-stack parameters. ARM64EC varargs calls on the other hand
expect x4 to point to the first in-stack parameter.
@efriedma-quic efriedma-quic merged commit abc693f into llvm:main Feb 27, 2024
4 checks passed
dpaoliello pushed a commit to dpaoliello/llvm-project that referenced this pull request Feb 27, 2024
…ls (llvm#80994)

When in an entry thunk the x64 SP is passed in x4 but this cannot be
directly passed through since x64 varargs calls have a 32 byte shadow
store at SP followed by the in-stack parameters. ARM64EC varargs calls
on the other hand expect x4 to point to the first in-stack parameter.
dpaoliello pushed a commit to dpaoliello/llvm-project that referenced this pull request Mar 8, 2024
…ls (llvm#80994)

When in an entry thunk the x64 SP is passed in x4 but this cannot be
directly passed through since x64 varargs calls have a 32 byte shadow
store at SP followed by the in-stack parameters. ARM64EC varargs calls
on the other hand expect x4 to point to the first in-stack parameter.
dpaoliello pushed a commit to dpaoliello/llvm-project that referenced this pull request Mar 11, 2024
…ls (llvm#80994)

When in an entry thunk the x64 SP is passed in x4 but this cannot be
directly passed through since x64 varargs calls have a 32 byte shadow
store at SP followed by the in-stack parameters. ARM64EC varargs calls
on the other hand expect x4 to point to the first in-stack parameter.
@pointhex pointhex mentioned this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants