Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT - Folding unwanted foreach with empty list. #61455

Open
ShreyasJejurkar opened this issue Nov 11, 2021 · 6 comments
Open

JIT - Folding unwanted foreach with empty list. #61455

ShreyasJejurkar opened this issue Nov 11, 2021 · 6 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@ShreyasJejurkar
Copy link
Contributor

ShreyasJejurkar commented Nov 11, 2021

I was playing with the .NET 6 JIT compiler with sharplab. Found the following interesting scenario, where we can do optimizations.

using System;
using System.Collections.Generic;

List<string> names = new();

foreach(var item in names)
    Console.WriteLine(item);

In the above scenario, can JIT detect that names list is an empty list and in program execution flow, no element is added to that list.? Although if anyone enumerated that list, nothing will happen and so can't JIT detect this behavior and fold foreach loop to nothing?

Right now, JIT emits assembly code for foreach loop, as we can see below in the link, which I think no use and can be avoided.

https://sharplab.io/#v2:EYLgtghglgdgNAFxBAzmAPgAQEwAYCwAUDgIxGa4AEmJALANxFEAyUKCAPDbgHyUwQwAUxSUAvPyEB3ABQBKRoSIAzAPYAnIRADGACxkA3COspQEQsKZj9BIuUUqPqJAJwyzFhUSA===

@EgorBo @AndyAyersMS

category:cq
theme:basic-cq
skill-level:expert
cost:large
impact:medium

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Nov 11, 2021
@ghost
Copy link

ghost commented Nov 11, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

I was playing with the JIT compiler with sharplab. Found the following interesting scenario, where we can do optimizations.

using System;
using System.Collections.Generic;

List<string> names = new();

foreach(var item in names)
    Console.WriteLine(item);

In the above scenario, can JIT detect that names list is an empty list and in program execution flow, no element is added to that list.? Although if anyone enumerated that list, nothing will happen and so can't JIT detect this behavior and fold foreach loop to nothing?

Right now, JIT emits assembly code for foreach loop, as we can see below in the link, which I think no use and can be avoided.

https://sharplab.io/#v2:EYLgtghglgdgNAFxBAzmAPgAQEwAYCwAUDgIxGa4AEmJALANxFEAyUKCAPDbgHyUwQwAUxSUAvPyEB3ABQBKRoSIAzAPYAnIRADGACxkA3COspQEQsKZj9BIuUUqPqJAJwyzFhUSA===

Author: ShreyasJejurkar
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@Tornhoof
Copy link
Contributor

Tornhoof commented Nov 11, 2021

This would only work if your code is side-effect free, this might be the case in your example, but think about a custom type where the enumeration (even an empty one) is not side-effect free, take a look at IEnumerator<T> it has Dispose and Reset, so the design even hints at non side-effect free enumeration. Then you can't skip it anymore.

@EgorBo
Copy link
Member

EgorBo commented Nov 11, 2021

I think we're pretty far from being able to fold this code completely, I am pretty sure it needs a better struct enregistration (List.Enumerator has a complex layout) and most likely escape analysis (to know that List object doesn't escape so we don't have to care about properly initializing it). Andy might have a different opinion.

@EgorBo
Copy link
Member

EgorBo commented Nov 11, 2021

think about a custom type where the enumeration (even an empty one) is not side-effect free, take a look at IEnumerator it has Dispose and Reset, so the design even hints at non side-effect free enumeration. Then you can't skip it anymore.

Right, but in this case it's fine - List<>'s enumerator should be side-effect free, it has empty Dispose

@EgorBo EgorBo added this to the Future milestone Nov 11, 2021
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Nov 11, 2021
@AndyAyersMS
Copy link
Member

Escape analysis here requires fieldwise analysis (list is stored into a field of the enumerator).

@hez2010
Copy link
Contributor

hez2010 commented Jul 19, 2024

With https://github.com/hez2010/runtime/tree/field-stackalloc we are able to remove all allocations when the list is a list of non-gc type. We still need the support for gc-type arrays.

using System;
using System.Collections.Generic;

List<int> numbers = new();

foreach(var item in numbers)
    Console.WriteLine(item);

Codegen:

; Assembly listing for method StackAllocationTest.TestClass:Test() (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; optimized using Dynamic PGO
; rsp based frame
; partially interruptible
; with Dynamic PGO: fgCalledCount is 136
; 2 inlinees with PGO data; 6 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;* V00 loc0         [V00    ] (  0,  0   )  struct (24) zero-ref    ld-addr-op <System.Collections.Generic.List`1+Enumerator[int]>
;  V01 OutArgs      [V01    ] (  1,  1   )  struct (32) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V02 tmp1         [V02    ] (  0,  0   )    long  ->  zero-ref    class-hnd exact "NewObj constructor temp" <System.Collections.Generic.List`1[int]>
;* V03 tmp2         [V03    ] (  0,  0   )  struct (24) zero-ref    ld-addr-op "NewObj constructor temp" <System.Collections.Generic.List`1+Enumerator[int]>
;* V04 tmp3         [V04    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
;* V05 tmp4         [V05    ] (  0,  0   )     ref  ->  zero-ref    class-hnd "Inline stloc first use temp" <System.Collections.Generic.List`1[int]>
;  V06 tmp5         [V06    ] (  4,  4   )  struct (24) [rsp+0x28]  do-not-enreg[XSF] must-init addr-exposed "stack allocated ref class temp" <System.Collections.Generic.List`1[int]>
;  V07 tmp6         [V07,T00] (  4,  3   )     ref  ->  rsi         single-def "field V00._list (fldOffset=0x0)" P-INDEP
;* V08 tmp7         [V08,T07] (  0,  0   )     int  ->  zero-ref    "field V00._index (fldOffset=0x8)" P-INDEP
;  V09 tmp8         [V09,T03] (  2,  2   )     int  ->  rbx         single-def "field V00._version (fldOffset=0xc)" P-INDEP
;  V10 tmp9         [V10,T10] (  2,  0   )     int  ->  rcx         "field V00._current (fldOffset=0x10)" P-INDEP
;  V11 tmp10        [V11,T02] (  2,  2   )     ref  ->  [rsp+0x20]  do-not-enreg[F] "field V03._list (fldOffset=0x0)" P-INDEP
;* V12 tmp11        [V12,T08] (  0,  0   )     int  ->  zero-ref    single-def "field V03._index (fldOffset=0x8)" P-INDEP
;  V13 tmp12        [V13,T04] (  2,  2   )     int  ->  rbx         single-def "field V03._version (fldOffset=0xc)" P-INDEP
;* V14 tmp13        [V14    ] (  0,  0   )     int  ->  zero-ref    single-def "field V03._current (fldOffset=0x10)" P-INDEP
;  V15 tmp14        [V15,T09] (  3,  0   )     ref  ->  rcx         "arr expr"
;  V16 cse0         [V16,T05] (  2,  2   )     int  ->  rcx         "CSE #01: aggressive"
;  V17 cse1         [V17,T06] (  2,  2   )     int  ->  rcx         "CSE #02: aggressive"
;  V18 rat0         [V18,T01] (  6,  2   )    long  ->  rdi         "Widened IV V08"
;
; Lcl frame size = 64

G_M57129_IG01:  ;; offset=0x0000
       push     rdi
       push     rsi
       push     rbx
       sub      rsp, 64
       vxorps   xmm4, xmm4, xmm4
       vmovdqu  xmmword ptr [rsp+0x28], xmm4
       xor      eax, eax
       mov      qword ptr [rsp+0x38], rax
                                                ;; size=24 bbWeight=1 PerfScore 6.83
G_M57129_IG02:  ;; offset=0x0018
       mov      rcx, 0x7FFEC2B7E4B0      ; System.Collections.Generic.List`1[int]
       mov      qword ptr [rsp+0x28], rcx
       mov      rcx, 0x1C70020E5A8      ; 'System.Int32[]'
       mov      gword ptr [rsp+0x30], rcx
       lea      rcx, [rsp+0x28]
       mov      qword ptr [rsp+0x20], rcx
       mov      ebx, dword ptr [rsp+0x3C]
       mov      rsi, gword ptr [rsp+0x20]
       xor      edi, edi
                                                ;; size=51 bbWeight=1 PerfScore 6.25
G_M57129_IG03:  ;; offset=0x004B
       mov      ecx, dword ptr [rsi+0x14]
       cmp      ebx, ecx
       jne      SHORT G_M57129_IG06
       mov      ecx, dword ptr [rsi+0x10]
       cmp      edi, ecx
       jb       SHORT G_M57129_IG05
                                                ;; size=14 bbWeight=1 PerfScore 6.50
G_M57129_IG04:  ;; offset=0x0059
       add      rsp, 64
       pop      rbx
       pop      rsi
       pop      rdi
       ret
                                                ;; size=8 bbWeight=1 PerfScore 2.75
G_M57129_IG05:  ;; offset=0x0061
       mov      rcx, gword ptr [rsi+0x08]
       cmp      edi, dword ptr [rcx+0x08]
       jae      SHORT G_M57129_IG07
       mov      ecx, dword ptr [rcx+4*rdi+0x10]
       inc      edi
       call     [System.Console:WriteLine(int)]
       jmp      SHORT G_M57129_IG03
                                                ;; size=23 bbWeight=0 PerfScore 0.00
G_M57129_IG06:  ;; offset=0x0078
       call     [System.ThrowHelper:ThrowInvalidOperationException_InvalidOperation_EnumFailedVersion()]
       int3
                                                ;; size=7 bbWeight=0 PerfScore 0.00
G_M57129_IG07:  ;; offset=0x007F
       call     CORINFO_HELP_RNGCHKFAIL
       int3
                                                ;; size=6 bbWeight=0 PerfScore 0.00

; Total bytes of code 133, prolog size 24, PerfScore 22.33, instruction count 39, allocated bytes for code 133 (MethodHash=1b6d20d6) for method StackAllocationTest.TestClass:Test() (Tier1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

5 participants