JIT: Allow forwarding field accesses off of implicit byrefs #80852
The JIT currently allows forwarding implicit byrefs at their last uses to calls, but only if the full implicit byref is used. This change allows the JIT to forward any such access off of an implicit byref parameter.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak

For example:

using System.Runtime.CompilerServices;

class Program
{
    public static void Main()
    {
        Foo(default);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Foo(S1 s1)
    {
        Bar(s1.B);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Bar(S2 s)
    {
    }

    private struct S1
    {
        public int A;
        public S2 B;
    }

    private struct S2
    {
        public int C, D, E, F;
    }
}

Codegen before:

; Assembly listing for method Program:Test(Program+S1)
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 3, 6 ) byref -> rcx single-def
; V01 OutArgs [V01 ] ( 1, 1 ) lclBlk (32) [rsp+00H] "OutgoingArgSpace"
; V02 tmp1 [V02 ] ( 2, 4 ) struct (16) [rsp+28H] do-not-enreg[XS] addr-exposed "by-value struct argument"
;
; Lcl frame size = 56
G_M4574_IG01: ;; offset=0000H
4883EC38 sub rsp, 56
C5F877 vzeroupper
;; size=7 bbWeight=1 PerfScore 1.25
G_M4574_IG02: ;; offset=0007H
C5F8104104 vmovups xmm0, xmmword ptr [rcx+04H]
C5F811442428 vmovups xmmword ptr [rsp+28H], xmm0
488D4C2428 lea rcx, [rsp+28H]
FF1553A41C00 call [Program:Bar(Program+S2)]
90 nop
;; size=23 bbWeight=1 PerfScore 8.75
G_M4574_IG03: ;; offset=001EH
4883C438 add rsp, 56
C3 ret
;; size=5 bbWeight=1 PerfScore 1.25
; Total bytes of code 35, prolog size 7, PerfScore 14.75, instruction count 9, allocated bytes for code 35 (MethodHash=a8e6ee21) for method Program:Test(Program+S1)
; ============================================================

Codegen after:

; Assembly listing for method Program:Test(Program+S1)
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 3, 6 ) byref -> rcx single-def
; V01 OutArgs [V01 ] ( 1, 1 ) lclBlk (32) [rsp+00H] "OutgoingArgSpace"
;
; Lcl frame size = 40
G_M4574_IG01: ;; offset=0000H
4883EC28 sub rsp, 40
;; size=4 bbWeight=1 PerfScore 0.25
G_M4574_IG02: ;; offset=0004H
4883C104 add rcx, 4
FF1532A11C00 call [Program:Bar(Program+S2)]
90 nop
;; size=11 bbWeight=1 PerfScore 3.50
G_M4574_IG03: ;; offset=000FH
4883C428 add rsp, 40
C3 ret
;; size=5 bbWeight=1 PerfScore 1.25
; Total bytes of code 20, prolog size 4, PerfScore 7.00, instruction count 6, allocated bytes for code 20 (MethodHash=a8e6ee21) for method Program:Test(Program+S1)
; ============================================================

(The latter would also be tailcalled without the NoInlining attribute.)
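The "last use" condition matters because the callee owns its by-value argument and is free to overwrite it. The following is a minimal sketch of that constraint, not code from this PR: `LastUseDemo`, `Mutate`, `NotLastUse`, and `LastUse` are hypothetical names, and which calls the JIT actually forwards is an assumption based on the description above. The printed values follow from C# by-value semantics regardless of what the JIT does.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class LastUseDemo
{
    public struct S2 { public int C, D, E, F; }
    public struct S1 { public int A; public S2 B; }

    // The callee may freely overwrite its by-value argument.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Mutate(S2 s)
    {
        s.C = 42;
        return s.C;
    }

    // s1.B is read again after the call, so the call is not the last use.
    // The caller's value must survive, which is why a copy is still needed here.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int NotLastUse(S1 s1)
    {
        Mutate(s1.B);
        return s1.B.C;
    }

    // s1 is dead after the call. This is the shape where (by assumption)
    // the address of the field could be passed along without a copy.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int LastUse(S1 s1)
    {
        return Mutate(s1.B);
    }

    public static void Main()
    {
        var s1 = new S1 { A = 1, B = new S2 { C = 7 } };
        Console.WriteLine(NotLastUse(s1)); // 7: the callee's write is not visible
        Console.WriteLine(LastUse(s1));    // 42: value returned by the callee
        Console.WriteLine(s1.B.C);         // 7: Main's own struct is untouched
    }
}
```

Comparing the disassembly of `NotLastUse` and `LastUse` would be one way to check where the temporary copy remains.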
cc @dotnet/jit-contrib

PTAL @AndyAyersMS

Small set of diffs. The main benefit is that it makes copy elision work consistently for both implicit byrefs and normal locals.
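The two source shapes being compared can be sketched side by side. This is an illustrative sketch only: `ElisionShapes`, `Sum`, `FromLocal`, and `FromParam` are hypothetical names, and the assumption (taken from the comment above, not verified here) is that the field access in `FromLocal` could already be passed without an extra copy, while `FromParam` is the case this change addresses.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ElisionShapes
{
    public struct S2 { public int C, D, E, F; }
    public struct S1 { public int A; public S2 B; }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Sum(S2 s) => s.C + s.D + s.E + s.F;

    // Shape 1: field of a normal local struct, passed at its last use.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FromLocal(int a)
    {
        S1 local = new S1 { A = a, B = new S2 { C = 1, D = 2, E = 3, F = 4 } };
        return Sum(local.B);
    }

    // Shape 2: field of a struct parameter. On ABIs where S1 is passed by
    // reference (for example Windows x64), s1 is an implicit byref.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FromParam(S1 s1)
    {
        return Sum(s1.B);
    }

    public static void Main()
    {
        Console.WriteLine(FromLocal(0)); // 10
        var s1 = new S1 { A = 0, B = new S2 { C = 1, D = 2, E = 3, F = 4 } };
        Console.WriteLine(FromParam(s1)); // 10
    }
}
```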
Ping @AndyAyersMS