Lay out loop bodies contiguously #13314

JosephTremoulet · 2017-08-10T14:57:15Z

Rearrange basic blocks during loop identification so that loop bodies
are kept contiguous when possible. Blocks in the lexical range of the
loop which do not participate in the flow cycle (which typically
correspond to code associated with early exits using break or
return) are moved out below the loop when possible without breaking EH
region nesting. The target insertion point, when possible, is chosen to
be the first spot below the loop that will not break up fall-through.
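
As a rough illustration of the rearrangement described above, the following C++ sketch moves runs of non-cycle blocks out of a loop's lexical range. It is not the JIT's implementation: `Block`, `compactLoop`, and `ids` are hypothetical names, EH region nesting is ignored entirely, and a run is moved only if its last block does not fall through. A predecessor that used to fall into a moved run is assumed to have its branch fixed up separately, which is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified block: only what the layout decision needs.
struct Block {
    int  id;
    bool inCycle;      // participates in the loop's flow cycle
    bool fallsThrough; // control may fall off the end into the next block in layout
};

// Returns a new layout in which non-cycle blocks inside the lexical range
// [first, last] are moved below the loop when this sketch considers it safe.
std::vector<Block> compactLoop(const std::vector<Block>& layout,
                               std::size_t first, std::size_t last)
{
    assert(first <= last && last < layout.size());

    std::vector<Block> kept;  // blocks that stay in the loop's range
    std::vector<Block> moved; // blocks relocated below the loop

    std::size_t i = first;
    while (i <= last) {
        if (layout[i].inCycle) {
            kept.push_back(layout[i]);
            ++i;
            continue;
        }
        // Find the maximal run [i, j] of consecutive non-cycle blocks.
        std::size_t j = i;
        while (j + 1 <= last && !layout[j + 1].inCycle) {
            ++j;
        }
        // Simplification: a run whose last block falls through stays put,
        // since moving it would require adding a jump.
        std::vector<Block>& dest = layout[j].fallsThrough ? kept : moved;
        for (std::size_t k = i; k <= j; ++k) {
            dest.push_back(layout[k]);
        }
        i = j + 1;
    }

    // Insertion point: first spot below the loop where the preceding block
    // does not fall through, so no existing fall-through is broken up.
    std::size_t insertAt = last + 1;
    while (insertAt < layout.size() && layout[insertAt - 1].fallsThrough) {
        ++insertAt;
    }

    std::vector<Block> result;
    for (std::size_t k = 0; k < first; ++k) result.push_back(layout[k]);
    for (const Block& b : kept) result.push_back(b);
    for (std::size_t k = last + 1; k < insertAt; ++k) result.push_back(layout[k]);
    for (const Block& b : moved) result.push_back(b);
    for (std::size_t k = insertAt; k < layout.size(); ++k) result.push_back(layout[k]);
    return result;
}

// Helper for inspecting a layout by block id.
std::vector<int> ids(const std::vector<Block>& layout)
{
    std::vector<int> out;
    for (const Block& b : layout) out.push_back(b.id);
    return out;
}
```

For a small search loop laid out as blocks 0 to 5, where block 2 is an early `return` inside the lexical range 1..3 and block 3 falls through into the exit block 4, calling `compactLoop(layout, 1, 3)` yields the order 0, 1, 3, 4, 2, 5: the loop body becomes contiguous and the early-exit block lands after the loop's fall-through exit.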

Layout can significantly affect the performance of loops, particularly
small search loops, by avoiding the taken branch on the hot path,
improving the locality of the code fetched while iterating the loop, and
potentially aiding loop stream detection.

Resolves #9692.

JosephTremoulet · 2017-08-10T14:57:27Z

@briansull, @AndyAyersMS PTAL
/cc @dotnet/jit-contrib, @bbowyersmyth, @redknightlois

This change achieves the 30% improvement in the microbenchmark from #9692 (added here to the perf suite), as well as the 6% in fasta and 5% in TreeSort identified in #11192, plus 6% in fannkuch-redux, 5% in Ackermann, and 1% in Puzzle.

@redknightlois, it would be great to know if this fixes the issues in your code, or if that would require improved loop detection or discontiguous EH regions or something.

Throughput impact is negligible (instructions retired increased by 0.07% in a release System.Private.CoreLib crossgen).

Results from jit-diff:

Summary:
(Note: Lower is better)

Total bytes of diff: -6622 (0.00 % of base)
    diff is an improvement.

Total byte diff includes -500 bytes from reconciling methods
        Base had    1 unique methods,      500 unique bytes
        Diff had    0 unique methods,        0 unique bytes

Top file regressions by size (bytes):
         184 : System.Net.Security.dasm (0.13 % of base)
         148 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.01 % of base)
         118 : Microsoft.CSharp.dasm (0.04 % of base)
          47 : System.Linq.dasm (0.03 % of base)
          40 : JIT\Performance\CodeQuality\BenchF\MatInv4\MatInv4\MatInv4.dasm (0.99 % of base)

Top file improvements by size (bytes):
       -1710 : System.Private.CoreLib.dasm (-0.05 % of base)
        -719 : Microsoft.CodeAnalysis.CSharp.dasm (-0.03 % of base)
        -204 : JIT\jit64\opt\cse\staticFieldExpr1_ro_loop\staticFieldExpr1_ro_loop.dasm (-11.45 % of base)
        -187 : System.Private.Uri.dasm (-0.24 % of base)
        -172 : Interop\ArrayMarshalling\ByValArray\MarshalArrayByValTest\MarshalArrayByValTest.dasm (-1.05 % of base)

344 total files with size differences (299 improved, 45 regressed), 6187 unchanged.

Top method regressions by size (bytes):
         296 : System.Private.CoreLib.dasm - Type:GetEnumData(byref,byref):this
         196 : System.Net.Security.dasm - SSPIWrapper:EncryptDecryptHelper(int,ref,ref,ref,int):int
         110 : System.Private.CoreLib.dasm - GenericEqualityComparer`1:IndexOf(ref,ref,int,int):int:this
          89 : System.Private.CoreLib.dasm - EventProvider:WriteEvent(byref,long,long,long,ref):bool:this
          75 : Microsoft.CodeAnalysis.VisualBasic.dasm - SourceModuleSymbol:BindImports(struct):ref:this

Top method improvements by size (bytes):
        -500 : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:SkipBadTokensWithExpectedKind(ref,ref,char,byref):int:this
        -204 : JIT\jit64\opt\cse\staticFieldExpr1_ro_loop\staticFieldExpr1_ro_loop.dasm - Test_Main:Main():int
        -183 : System.Private.CoreLib.dasm - Dictionary`2:Remove(ref):bool:this (11 methods)
        -172 : Interop\ArrayMarshalling\ByValArray\MarshalArrayByValTest\MarshalArrayByValTest.dasm - Test:Equals(ref,ref):bool (11 methods)
        -129 : System.Private.CoreLib.dasm - Enum:TryParseEnum(ref,ref,bool,byref):bool

1632 total methods with size differences (1096 improved, 536 regressed), 313322 unchanged.

redknightlois · 2017-08-10T15:08:54Z

@JosephTremoulet I have the right benchmark for those :) ... We will be moving the codebase to 2.0 by next week (hopefully), so I should be able to try this then. Loop code flow has given us massive improvements, so even if you don't catch all versions it should be a net performance win for tight code.

stephentoub · 2017-08-10T15:28:22Z

@JosephTremoulet, if this fixes the problem, it'd be great to also undo the workarounds employed in a variety of places, e.g.

coreclr/src/mscorlib/src/System/String.Comparison.cs, line 47 in 57214f1:

    goto ReturnCharAMinusCharB; // TODO: Workaround for https://github.com/dotnet/coreclr/issues/9692

https://github.com/dotnet/corefx/blob/b769a73ecf278f92993a8b789aaaaedbc55a7dc3/src/System.Net.Http/src/System/Net/Http/Headers/HeaderUtilities.cs#L334
https://github.com/dotnet/corefx/blob/cab2165f92380e8424589909158a26ce79e18683/src/System.Memory/src/System/SpanHelpers.byte.cs#L158
(I think there are also some cases where a comment wasn't left, so it might be worth searching for gotos.)
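
For readers unfamiliar with the workaround shape being discussed, here is a minimal sketch transliterated to C++. It is not the code from the linked files: the function names and signatures are hypothetical, and only the label name is borrowed from the snippet quoted above. It shows the source-level pattern only (exit code placed lexically below the loop and reached via `goto`); a C++ compiler makes its own layout decisions, so no performance difference is implied here.

```cpp
#include <cassert>
#include <cstddef>

// Early exit written inline: the return sits lexically inside the loop body.
int compareInline(const char* a, const char* b, std::size_t n)
{
    assert(a != nullptr && b != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return a[i] - b[i];
        }
    }
    return 0;
}

// Workaround shape: the loop body only jumps out, and the exit code lives
// below the loop, so the loop's lexical range contains just the cycle.
int compareWithGoto(const char* a, const char* b, std::size_t n)
{
    assert(a != nullptr && b != nullptr);
    std::size_t i = 0;
    for (; i < n; ++i) {
        if (a[i] != b[i]) {
            goto ReturnCharAMinusCharB;
        }
    }
    return 0;

ReturnCharAMinusCharB:
    return a[i] - b[i];
}
```

Both functions compute the same result; the second simply keeps the early-exit code out of the loop's lexical range by hand, which is the manual rearrangement that would become unnecessary if the compiler moves such blocks itself.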

BruceForstall · 2017-08-10T16:14:23Z

@JosephTremoulet this change calls for full desktop testing plus tagging dotnet-bot to run a bunch of stress modes.

JosephTremoulet · 2017-08-10T18:22:23Z

@stephentoub sounds good, I'll verify and revert as the compiler change propagates

JosephTremoulet · 2017-08-10T18:22:36Z

@dotnet-bot help

dotnet-bot · 2017-08-10T18:22:46Z

Welcome to the dotnet/coreclr Repository

The following is a list of valid commands on this PR. To invoke a command, comment the indicated phrase on the PR

The following commands are valid for all PRs and repositories.

Comment Phrase	Action
@dotnet-bot test this please	Re-run all legs. Use sparingly
@dotnet-bot test ci please	Generates (but does not run) jobs based on changes to the groovy job definitions in this branch
@dotnet-bot help	Print this help message

The following jobs are launched by default for each PR against dotnet/coreclr:master.

The following optional jobs are available in PRs against dotnet/coreclr:master.

Comment Phrase	Job Launched
@dotnet-bot test Ubuntu arm64 Checked	Queues Ubuntu arm64 Checked
@dotnet-bot test Ubuntu arm64 Checked pri1r2r	Queues Ubuntu arm64 Cross Checked pri1r2r Build and Test
@dotnet-bot test Ubuntu arm64 Checked	Queues Ubuntu arm64 Cross Checked Build and Test
@dotnet-bot test Windows_NT arm64 Checked pri1r2r	Queues Windows_NT arm64 Cross Checked pri1r2r Build and Test
@dotnet-bot test Windows_NT arm64 Checked	Queues Windows_NT arm64 Cross Checked Build and Test
@dotnet-bot test Windows_NT arm64 Release pri1r2r	Queues Windows_NT arm64 Cross Release pri1r2r Build and Test
@dotnet-bot test Windows_NT arm64 Release	Queues Windows_NT arm64 Cross Release Build and Test
@dotnet-bot test Ubuntu arm64 Debug	Queues Ubuntu arm64 Debug
@dotnet-bot test Ubuntu arm64 Release	Queues Ubuntu arm64 Release
@dotnet-bot test Ubuntu arm64 Release pri1r2r	Queues Ubuntu arm64 Cross Release pri1r2r Build and Test
@dotnet-bot test Ubuntu arm64 Release	Queues Ubuntu arm64 Cross Release Build and Test
@dotnet-bot test Ubuntu16.04 arm Cross Checked Build	Queues Ubuntu16.04 arm Cross Checked Build
@dotnet-bot test Ubuntu arm Cross Checked Build	Queues Ubuntu arm Cross Checked Build
@dotnet-bot test Windows_NT arm Checked pri1r2r	Queues Windows_NT arm Cross Checked pri1r2r Build and Test
@dotnet-bot test Ubuntu arm Cross Debug Build	Queues Ubuntu arm Cross Debug Build
@dotnet-bot test Windows_NT arm Debug	Queues Windows_NT arm Cross Debug Build
@dotnet-bot test Ubuntu16.04 arm Cross Release Build	Queues Ubuntu16.04 arm Cross Release Build
@dotnet-bot test Windows_NT arm Release pri1r2r	Queues Windows_NT arm Cross Release pri1r2r Build and Test
@dotnet-bot test Windows_NT arm Release	Queues Windows_NT arm Cross Release Build and Test
@dotnet-bot test Tizen armel Cross Checked Build	Queues Tizen armel Cross Checked Build
@dotnet-bot test Debian8.4	Queues Debian8.4 x64 Checked Build
@dotnet-bot test Fedora24	Queues Fedora24 x64 Checked Build
@dotnet-bot test OpenSUSE42.1	Queues OpenSUSE42.1 x64 Checked Build
@dotnet-bot test RHEL7.2	Queues RHEL7.2 x64 Checked Build
@dotnet-bot test Ubuntu16.04 x64	Queues Ubuntu16.04 x64 Checked Build
@dotnet-bot test Ubuntu16.10	Queues Ubuntu16.10 x64 Checked Build
@dotnet-bot test Debian8.4	Queues Debian8.4 x64 Debug Build
@dotnet-bot test Fedora24	Queues Fedora24 x64 Debug Build
@dotnet-bot test OpenSUSE42.1	Queues OpenSUSE42.1 x64 Debug Build
@dotnet-bot test RHEL7.2	Queues RHEL7.2 x64 Debug Build
@dotnet-bot test Ubuntu16.04 x64	Queues Ubuntu16.04 x64 Debug Build
@dotnet-bot test Ubuntu16.10	Queues Ubuntu16.10 x64 Debug Build
@dotnet-bot test Ubuntu x64 Checked illink	Queues Ubuntu x64 Checked via ILLink
@dotnet-bot test Windows_NT x64 Checked illink	Queues Windows_NT x64 Checked via ILLink
@dotnet-bot test Ubuntu x64 Debug illink	Queues Ubuntu x64 Debug via ILLink
@dotnet-bot test Windows_NT x64 Debug illink	Queues Windows_NT x64 Debug via ILLink
@dotnet-bot test Ubuntu x64 Release illink	Queues Ubuntu x64 Release via ILLink
@dotnet-bot test Windows_NT x64 Release illink	Queues Windows_NT x64 Release via ILLink
@dotnet-bot test Windows_NT x86 Checked illink	Queues Windows_NT x86 Checked via ILLink
@dotnet-bot test Windows_NT x86 Debug illink	Queues Windows_NT x86 Debug via ILLink
@dotnet-bot test Windows_NT x86 Release illink	Queues Windows_NT x86 Release via ILLink
@dotnet-bot test Ubuntu arm64 Checked gcstress0x3	Queues Ubuntu arm64 Cross Checked gcstress0x3 Build and Test
@dotnet-bot test Ubuntu arm64 Checked gcstress0xc	Queues Ubuntu arm64 Cross Checked gcstress0xc Build and Test
@dotnet-bot test Windows_NT arm64 Checked gcstress0x3	Queues Windows_NT arm64 Cross Checked gcstress0x3 Build and Test
@dotnet-bot test Windows_NT arm64 Checked gcstress0xc	Queues Windows_NT arm64 Cross Checked gcstress0xc Build and Test
@dotnet-bot test Windows_NT arm Checked gcstress0x3	Queues Windows_NT arm Cross Checked gcstress0x3 Build and Test
@dotnet-bot test Windows_NT arm Checked gcstress0xc	Queues Windows_NT arm Cross Checked gcstress0xc Build and Test
@dotnet-bot test CentOS7.1 forcerelocs	Queues CentOS7.1 x64 Checked Build and Test (Jit - ForceRelocs=1)
@dotnet-bot test CentOS7.1 gcstress0x3	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0x3)
@dotnet-bot test CentOS7.1 gcstress0xc	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC)
@dotnet-bot test CentOS7.1 gcstress0xc_jitstress1	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC JitStress=1)
@dotnet-bot test CentOS7.1 gcstress0xc_jitstress2	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC JitStress=2)
@dotnet-bot test CentOS7.1 gcstress0xc_minopts_heapverify1	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC JITMinOpts=1 HeapVerify=1)
@dotnet-bot test CentOS7.1 gcstress0xc_zapdisable	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0)
@dotnet-bot test CentOS7.1 gcstress0xc_zapdisable_heapverify1	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 HeapVerify=1)
@dotnet-bot test CentOS7.1 gcstress0xc_zapdisable_jitstress2	Queues CentOS7.1 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 JitStress=2)
@dotnet-bot test CentOS7.1 heapverify1	Queues CentOS7.1 x64 Checked Build and Test (Jit - HeapVerify=1)
@dotnet-bot test CentOS7.1 jitsse2only	Queues CentOS7.1 x64 Checked Build and Test (Jit - EnableAVX=0 EnableSSE3_4=0)
@dotnet-bot test CentOS7.1 jitstress1	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=1)
@dotnet-bot test CentOS7.1 jitstress2	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs0x1000	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x1000)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs0x10	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x10)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs0x80	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x80)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs1	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=1)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs2	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=2)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs3	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=3)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs4	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=4)
@dotnet-bot test CentOS7.1 jitstress2_jitstressregs8	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=8)
@dotnet-bot test CentOS7.1 jitstressregs0x1000	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=0x1000)
@dotnet-bot test CentOS7.1 jitstressregs0x10	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=0x10)
@dotnet-bot test CentOS7.1 jitstressregs0x80	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=0x80)
@dotnet-bot test CentOS7.1 jitstressregs1	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=1)
@dotnet-bot test CentOS7.1 jitstressregs2	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=2)
@dotnet-bot test CentOS7.1 jitstressregs3	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=3)
@dotnet-bot test CentOS7.1 jitstressregs4	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=4)
@dotnet-bot test CentOS7.1 jitstressregs8	Queues CentOS7.1 x64 Checked Build and Test (Jit - JitStressRegs=8)
@dotnet-bot test CentOS7.1 minopts	Queues CentOS7.1 x64 Checked Build and Test (Jit - JITMinOpts=1)
@dotnet-bot test CentOS7.1 Checked r2r_jitforcerelocs	Queues CentOS7.1 x64 Checked jitforcerelocs R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitminopts	Queues CentOS7.1 x64 Checked jitminopts R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstress1	Queues CentOS7.1 x64 Checked jitstress1 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstress2	Queues CentOS7.1 x64 Checked jitstress2 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs0x1000	Queues CentOS7.1 x64 Checked jitstressregs0x1000 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs0x10	Queues CentOS7.1 x64 Checked jitstressregs0x10 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs0x80	Queues CentOS7.1 x64 Checked jitstressregs0x80 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs1	Queues CentOS7.1 x64 Checked jitstressregs1 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs2	Queues CentOS7.1 x64 Checked jitstressregs2 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs3	Queues CentOS7.1 x64 Checked jitstressregs3 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs4	Queues CentOS7.1 x64 Checked jitstressregs4 R2R Build & Test
@dotnet-bot test CentOS7.1 Checked r2r_jitstressregs8	Queues CentOS7.1 x64 Checked jitstressregs8 R2R Build & Test
@dotnet-bot test CentOS7.1 tailcallstress	Queues CentOS7.1 x64 Checked Build and Test (Jit - TailcallStress=1)
@dotnet-bot test CentOS7.1 zapdisable	Queues CentOS7.1 x64 Checked Build and Test (Jit - ZapDisable=1 ReadyToRun=0)
@dotnet-bot test OSX10.12 forcerelocs	Queues OSX10.12 x64 Checked Build and Test (Jit - ForceRelocs=1)
@dotnet-bot test OSX10.12 gcstress0x3	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0x3)
@dotnet-bot test OSX10.12 gcstress0xc	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC)
@dotnet-bot test OSX10.12 gcstress0xc_jitstress1	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC JitStress=1)
@dotnet-bot test OSX10.12 gcstress0xc_jitstress2	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC JitStress=2)
@dotnet-bot test OSX10.12 gcstress0xc_minopts_heapverify1	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC JITMinOpts=1 HeapVerify=1)
@dotnet-bot test OSX10.12 gcstress0xc_zapdisable	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0)
@dotnet-bot test OSX10.12 gcstress0xc_zapdisable_heapverify1	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 HeapVerify=1)
@dotnet-bot test OSX10.12 gcstress0xc_zapdisable_jitstress2	Queues OSX10.12 x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 JitStress=2)
@dotnet-bot test OSX10.12 heapverify1	Queues OSX10.12 x64 Checked Build and Test (Jit - HeapVerify=1)
@dotnet-bot test OSX10.12 jitsse2only	Queues OSX10.12 x64 Checked Build and Test (Jit - EnableAVX=0 EnableSSE3_4=0)
@dotnet-bot test OSX10.12 jitstress1	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=1)
@dotnet-bot test OSX10.12 jitstress2	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs0x1000	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x1000)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs0x10	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x10)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs0x80	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x80)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs1	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=1)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs2	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=2)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs3	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=3)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs4	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=4)
@dotnet-bot test OSX10.12 jitstress2_jitstressregs8	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=8)
@dotnet-bot test OSX10.12 jitstressregs0x1000	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=0x1000)
@dotnet-bot test OSX10.12 jitstressregs0x10	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=0x10)
@dotnet-bot test OSX10.12 jitstressregs0x80	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=0x80)
@dotnet-bot test OSX10.12 jitstressregs1	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=1)
@dotnet-bot test OSX10.12 jitstressregs2	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=2)
@dotnet-bot test OSX10.12 jitstressregs3	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=3)
@dotnet-bot test OSX10.12 jitstressregs4	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=4)
@dotnet-bot test OSX10.12 jitstressregs8	Queues OSX10.12 x64 Checked Build and Test (Jit - JitStressRegs=8)
@dotnet-bot test OSX10.12 minopts	Queues OSX10.12 x64 Checked Build and Test (Jit - JITMinOpts=1)
@dotnet-bot test OSX10.12 Checked r2r_jitforcerelocs	Queues OSX10.12 x64 Checked jitforcerelocs R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitminopts	Queues OSX10.12 x64 Checked jitminopts R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstress1	Queues OSX10.12 x64 Checked jitstress1 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstress2	Queues OSX10.12 x64 Checked jitstress2 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs0x1000	Queues OSX10.12 x64 Checked jitstressregs0x1000 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs0x10	Queues OSX10.12 x64 Checked jitstressregs0x10 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs0x80	Queues OSX10.12 x64 Checked jitstressregs0x80 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs1	Queues OSX10.12 x64 Checked jitstressregs1 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs2	Queues OSX10.12 x64 Checked jitstressregs2 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs3	Queues OSX10.12 x64 Checked jitstressregs3 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs4	Queues OSX10.12 x64 Checked jitstressregs4 R2R Build & Test
@dotnet-bot test OSX10.12 Checked r2r_jitstressregs8	Queues OSX10.12 x64 Checked jitstressregs8 R2R Build & Test
@dotnet-bot test OSX10.12 tailcallstress	Queues OSX10.12 x64 Checked Build and Test (Jit - TailcallStress=1)
@dotnet-bot test OSX10.12 zapdisable	Queues OSX10.12 x64 Checked Build and Test (Jit - ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Ubuntu x64 corefx_baseline	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx)
@dotnet-bot test Ubuntu x64 corefx_jitstress1	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStress=1)
@dotnet-bot test Ubuntu x64 corefx_jitstress2	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStress=2)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs0x1000	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x1000)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs0x10	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x10)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs0x80	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x80)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs1	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=1)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs2	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=2)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs3	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=3)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs4	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=4)
@dotnet-bot test Ubuntu x64 corefx_jitstressregs8	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JitStressRegs=8)
@dotnet-bot test Ubuntu x64 corefx_minopts	Queues Ubuntu x64 Checked Build and Test (Jit - CoreFx JITMinOpts=1)
@dotnet-bot test Ubuntu forcerelocs	Queues Ubuntu x64 Checked Build and Test (Jit - ForceRelocs=1)
@dotnet-bot test Ubuntu gcstress0x3	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0x3)
@dotnet-bot test Ubuntu gcstress0xc	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC)
@dotnet-bot test Ubuntu gcstress0xc_jitstress1	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC JitStress=1)
@dotnet-bot test Ubuntu gcstress0xc_jitstress2	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC JitStress=2)
@dotnet-bot test Ubuntu gcstress0xc_minopts_heapverify1	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC JITMinOpts=1 HeapVerify=1)
@dotnet-bot test Ubuntu gcstress0xc_zapdisable	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Ubuntu gcstress0xc_zapdisable_heapverify1	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 HeapVerify=1)
@dotnet-bot test Ubuntu gcstress0xc_zapdisable_jitstress2	Queues Ubuntu x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 JitStress=2)
@dotnet-bot test Ubuntu heapverify1	Queues Ubuntu x64 Checked Build and Test (Jit - HeapVerify=1)
@dotnet-bot test Ubuntu jitsse2only	Queues Ubuntu x64 Checked Build and Test (Jit - EnableAVX=0 EnableSSE3_4=0)
@dotnet-bot test Ubuntu jitstress1	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=1)
@dotnet-bot test Ubuntu jitstress2	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2)
@dotnet-bot test Ubuntu jitstress2_jitstressregs0x1000	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x1000)
@dotnet-bot test Ubuntu jitstress2_jitstressregs0x10	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x10)
@dotnet-bot test Ubuntu jitstress2_jitstressregs0x80	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x80)
@dotnet-bot test Ubuntu jitstress2_jitstressregs1	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=1)
@dotnet-bot test Ubuntu jitstress2_jitstressregs2	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=2)
@dotnet-bot test Ubuntu jitstress2_jitstressregs3	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=3)
@dotnet-bot test Ubuntu jitstress2_jitstressregs4	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=4)
@dotnet-bot test Ubuntu jitstress2_jitstressregs8	Queues Ubuntu x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=8)
@dotnet-bot test Ubuntu jitstressregs0x1000	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=0x1000)
@dotnet-bot test Ubuntu jitstressregs0x10	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=0x10)
@dotnet-bot test Ubuntu jitstressregs0x80	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=0x80)
@dotnet-bot test Ubuntu jitstressregs1	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=1)
@dotnet-bot test Ubuntu jitstressregs2	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=2)
@dotnet-bot test Ubuntu jitstressregs3	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=3)
@dotnet-bot test Ubuntu jitstressregs4	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=4)
@dotnet-bot test Ubuntu jitstressregs8	Queues Ubuntu x64 Checked Build and Test (Jit - JitStressRegs=8)
@dotnet-bot test Ubuntu minopts	Queues Ubuntu x64 Checked Build and Test (Jit - JITMinOpts=1)
@dotnet-bot test Ubuntu Checked r2r_jitforcerelocs	Queues Ubuntu x64 Checked jitforcerelocs R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitminopts	Queues Ubuntu x64 Checked jitminopts R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstress1	Queues Ubuntu x64 Checked jitstress1 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstress2	Queues Ubuntu x64 Checked jitstress2 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs0x1000	Queues Ubuntu x64 Checked jitstressregs0x1000 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs0x10	Queues Ubuntu x64 Checked jitstressregs0x10 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs0x80	Queues Ubuntu x64 Checked jitstressregs0x80 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs1	Queues Ubuntu x64 Checked jitstressregs1 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs2	Queues Ubuntu x64 Checked jitstressregs2 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs3	Queues Ubuntu x64 Checked jitstressregs3 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs4	Queues Ubuntu x64 Checked jitstressregs4 R2R Build & Test
@dotnet-bot test Ubuntu Checked r2r_jitstressregs8	Queues Ubuntu x64 Checked jitstressregs8 R2R Build & Test
@dotnet-bot test Ubuntu tailcallstress	Queues Ubuntu x64 Checked Build and Test (Jit - TailcallStress=1)
@dotnet-bot test Ubuntu zapdisable	Queues Ubuntu x64 Checked Build and Test (Jit - ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Windows_NT x64 corefx_baseline	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx)
@dotnet-bot test Windows_NT x64 corefx_jitstress1	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStress=1)
@dotnet-bot test Windows_NT x64 corefx_jitstress2	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStress=2)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs0x1000	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x1000)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs0x10	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x10)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs0x80	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=0x80)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs1	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=1)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs2	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=2)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs3	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=3)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs4	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=4)
@dotnet-bot test Windows_NT x64 corefx_jitstressregs8	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JitStressRegs=8)
@dotnet-bot test Windows_NT x64 corefx_minopts	Queues Windows_NT x64 Checked Build and Test (Jit - CoreFx JITMinOpts=1)
@dotnet-bot test Windows_NT forcerelocs	Queues Windows_NT x64 Checked Build and Test (Jit - ForceRelocs=1)
@dotnet-bot test Windows_NT gcstress0x3	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0x3)
@dotnet-bot test Windows_NT gcstress0xc_jitstress1	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC JitStress=1)
@dotnet-bot test Windows_NT gcstress0xc_jitstress2	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC JitStress=2)
@dotnet-bot test Windows_NT gcstress0xc_minopts_heapverify1	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC JITMinOpts=1 HeapVerify=1)
@dotnet-bot test Windows_NT gcstress0xc	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC)
@dotnet-bot test Windows_NT gcstress0xc_zapdisable_heapverify1	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 HeapVerify=1)
@dotnet-bot test Windows_NT gcstress0xc_zapdisable_jitstress2	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 JitStress=2)
@dotnet-bot test Windows_NT gcstress0xc_zapdisable	Queues Windows_NT x64 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Windows_NT heapverify1	Queues Windows_NT x64 Checked Build and Test (Jit - HeapVerify=1)
@dotnet-bot test Windows_NT jitsse2only	Queues Windows_NT x64 Checked Build and Test (Jit - EnableAVX=0 EnableSSE3_4=0)
@dotnet-bot test Windows_NT jitstress1	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=1)
@dotnet-bot test Windows_NT jitstress2_jitstressregs0x1000	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x1000)
@dotnet-bot test Windows_NT jitstress2_jitstressregs0x10	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x10)
@dotnet-bot test Windows_NT jitstress2_jitstressregs0x80	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x80)
@dotnet-bot test Windows_NT jitstress2_jitstressregs1	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=1)
@dotnet-bot test Windows_NT jitstress2_jitstressregs2	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=2)
@dotnet-bot test Windows_NT jitstress2_jitstressregs3	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=3)
@dotnet-bot test Windows_NT jitstress2_jitstressregs4	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=4)
@dotnet-bot test Windows_NT jitstress2_jitstressregs8	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2 JitStressRegs=8)
@dotnet-bot test Windows_NT jitstress2	Queues Windows_NT x64 Checked Build and Test (Jit - JitStress=2)
@dotnet-bot test Windows_NT jitstressregs0x1000	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=0x1000)
@dotnet-bot test Windows_NT jitstressregs0x10	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=0x10)
@dotnet-bot test Windows_NT jitstressregs0x80	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=0x80)
@dotnet-bot test Windows_NT jitstressregs1	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=1)
@dotnet-bot test Windows_NT jitstressregs2	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=2)
@dotnet-bot test Windows_NT jitstressregs3	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=3)
@dotnet-bot test Windows_NT jitstressregs4	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=4)
@dotnet-bot test Windows_NT jitstressregs8	Queues Windows_NT x64 Checked Build and Test (Jit - JitStressRegs=8)
@dotnet-bot test Windows_NT minopts	Queues Windows_NT x64 Checked Build and Test (Jit - JITMinOpts=1)
@dotnet-bot test Windows_NT Checked r2r_jitforcerelocs	Queues Windows_NT x64 Checked jitforcerelocs R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitminopts	Queues Windows_NT x64 Checked jitminopts R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstress1	Queues Windows_NT x64 Checked jitstress1 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstress2	Queues Windows_NT x64 Checked jitstress2 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs0x1000	Queues Windows_NT x64 Checked jitstressregs0x1000 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs0x10	Queues Windows_NT x64 Checked jitstressregs0x10 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs0x80	Queues Windows_NT x64 Checked jitstressregs0x80 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs1	Queues Windows_NT x64 Checked jitstressregs1 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs2	Queues Windows_NT x64 Checked jitstressregs2 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs3	Queues Windows_NT x64 Checked jitstressregs3 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs4	Queues Windows_NT x64 Checked jitstressregs4 R2R Build & Test
@dotnet-bot test Windows_NT Checked r2r_jitstressregs8	Queues Windows_NT x64 Checked jitstressregs8 R2R Build & Test
@dotnet-bot test Windows_NT tailcallstress	Queues Windows_NT x64 Checked Build and Test (Jit - TailcallStress=1)
@dotnet-bot test Windows_NT zapdisable	Queues Windows_NT x64 Checked Build and Test (Jit - ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Windows_NT x86 corefx_baseline	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx)
@dotnet-bot test Windows_NT x86 corefx_jitstress1	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStress=1)
@dotnet-bot test Windows_NT x86 corefx_jitstress2	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStress=2)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs0x1000	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=0x1000)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs0x10	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=0x10)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs0x80	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=0x80)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs1	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=1)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs2	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=2)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs3	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=3)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs4	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=4)
@dotnet-bot test Windows_NT x86 corefx_jitstressregs8	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JitStressRegs=8)
@dotnet-bot test Windows_NT x86 corefx_minopts	Queues Windows_NT x86 Checked Build and Test (Jit - CoreFx JITMinOpts=1)
@dotnet-bot test Windows_NT x86 Checked forcerelocs	Queues Windows_NT x86 Checked Build and Test (Jit - ForceRelocs=1)
@dotnet-bot test Windows_NT x86 Checked gcstress0x3	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0x3)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_jitstress1	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC JitStress=1)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_jitstress2	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC JitStress=2)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_minopts_heapverify1	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC JITMinOpts=1 HeapVerify=1)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_heapverify1	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 HeapVerify=1)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0 JitStress=2)
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable	Queues Windows_NT x86 Checked Build and Test (Jit - GCStress=0xC ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Windows_NT x86 Checked heapverify1	Queues Windows_NT x86 Checked Build and Test (Jit - HeapVerify=1)
@dotnet-bot test Windows_NT x86 Checked jitsse2only	Queues Windows_NT x86 Checked Build and Test (Jit - EnableAVX=0 EnableSSE3_4=0)
@dotnet-bot test Windows_NT x86 Checked jitstress1	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=1)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x1000	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x1000)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x10	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x10)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x80	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=0x80)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs1	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=1)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs2	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=2)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs3	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=3)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs4	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=4)
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs8	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2 JitStressRegs=8)
@dotnet-bot test Windows_NT x86 Checked jitstress2	Queues Windows_NT x86 Checked Build and Test (Jit - JitStress=2)
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x1000	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=0x1000)
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x10	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=0x10)
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x80	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=0x80)
@dotnet-bot test Windows_NT x86 Checked jitstressregs1	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=1)
@dotnet-bot test Windows_NT x86 Checked jitstressregs2	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=2)
@dotnet-bot test Windows_NT x86 Checked jitstressregs3	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=3)
@dotnet-bot test Windows_NT x86 Checked jitstressregs4	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=4)
@dotnet-bot test Windows_NT x86 Checked jitstressregs8	Queues Windows_NT x86 Checked Build and Test (Jit - JitStressRegs=8)
@dotnet-bot test Windows_NT x86 Checked minopts	Queues Windows_NT x86 Checked Build and Test (Jit - JITMinOpts=1)
@dotnet-bot test Windows_NT x86 Checked r2r_jitforcerelocs	Queues Windows_NT x86 Checked jitforcerelocs R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitminopts	Queues Windows_NT x86 Checked jitminopts R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstress1	Queues Windows_NT x86 Checked jitstress1 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstress2	Queues Windows_NT x86 Checked jitstress2 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs0x1000	Queues Windows_NT x86 Checked jitstressregs0x1000 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs0x10	Queues Windows_NT x86 Checked jitstressregs0x10 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs0x80	Queues Windows_NT x86 Checked jitstressregs0x80 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs1	Queues Windows_NT x86 Checked jitstressregs1 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs2	Queues Windows_NT x86 Checked jitstressregs2 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs3	Queues Windows_NT x86 Checked jitstressregs3 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs4	Queues Windows_NT x86 Checked jitstressregs4 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked r2r_jitstressregs8	Queues Windows_NT x86 Checked jitstressregs8 R2R Build & Test
@dotnet-bot test Windows_NT x86 Checked tailcallstress	Queues Windows_NT x86 Checked Build and Test (Jit - TailcallStress=1)
@dotnet-bot test Windows_NT x86 Checked zapdisable	Queues Windows_NT x86 Checked Build and Test (Jit - ZapDisable=1 ReadyToRun=0)
@dotnet-bot test Debian8.4	Queues Debian8.4 x64 Release Build
@dotnet-bot test Fedora24	Queues Fedora24 x64 Release Build
@dotnet-bot test OpenSUSE42.1	Queues OpenSUSE42.1 x64 Release Build
@dotnet-bot test RHEL7.2	Queues RHEL7.2 x64 Release Build
@dotnet-bot test Ubuntu16.04 x64	Queues Ubuntu16.04 x64 Release Build
@dotnet-bot test Ubuntu16.10	Queues Ubuntu16.10 x64 Release Build
@dotnet-bot test CentOS7.1 Checked gcstress15_pri1r2r	Queues CentOS7.1 x64 Checked GCStress 15 R2R pri1 Build & Test
@dotnet-bot test CentOS7.1 Checked pri1r2r	Queues CentOS7.1 x64 Checked R2R pri1 Build & Test
@dotnet-bot test CentOS7.1 Checked r2r	Queues CentOS7.1 x64 Checked R2R pri0 Build & Test
@dotnet-bot test OSX10.12 Checked gc_reliability_framework	Queues OSX10.12 x64 Checked GC Reliability Framework
@dotnet-bot test OSX10.12 Checked gcstress15_pri1r2r	Queues OSX10.12 x64 Checked GCStress 15 R2R pri1 Build & Test
@dotnet-bot test OSX10.12 jitdiff	Queues OSX10.12 x64 Checked Jit Diff Build and Test
@dotnet-bot test OSX10.12 Checked pri1r2r	Queues OSX10.12 x64 Checked R2R pri1 Build & Test
@dotnet-bot test OSX10.12 Checked r2r	Queues OSX10.12 x64 Checked R2R pri0 Build & Test
@dotnet-bot test OSX10.12 Checked standalone_gc	Queues OSX10.12 x64 Checked Standalone GC
@dotnet-bot test Ubuntu Checked gc_reliability_framework	Queues Ubuntu x64 Checked GC Reliability Framework
@dotnet-bot test Ubuntu Checked gcstress15_pri1r2r	Queues Ubuntu x64 Checked GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Ubuntu jitdiff	Queues Ubuntu x64 Checked Jit Diff Build and Test
@dotnet-bot test Ubuntu Checked pri1r2r	Queues Ubuntu x64 Checked R2R pri1 Build & Test
@dotnet-bot test Ubuntu Checked r2r	Queues Ubuntu x64 Checked R2R pri0 Build & Test
@dotnet-bot test Ubuntu Checked standalone_gc	Queues Ubuntu x64 Checked Standalone GC
@dotnet-bot test Windows_NT Checked gc_reliability_framework	Queues Windows_NT x64 Checked GC Reliability Framework
@dotnet-bot test Windows_NT Checked gcstress15_pri1r2r	Queues Windows_NT x64 Checked GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Windows_NT jitdiff	Queues Windows_NT x64 Checked Jit Diff Build and Test
@dotnet-bot test Windows_NT Checked pri1r2r	Queues Windows_NT x64 Checked R2R pri1 Build & Test
@dotnet-bot test Windows_NT Checked r2r	Queues Windows_NT x64 Checked R2R pri0 Build & Test
@dotnet-bot test Windows_NT Checked standalone_gc	Queues Windows_NT x64 Checked Standalone GC
@dotnet-bot test CentOS7.1 Release gcstress15_pri1r2r	Queues CentOS7.1 x64 Release GCStress 15 R2R pri1 Build & Test
@dotnet-bot test CentOS7.1 Release pri1r2r	Queues CentOS7.1 x64 Release R2R pri1 Build & Test
@dotnet-bot test CentOS7.1 Release r2r	Queues CentOS7.1 x64 Release R2R pri0 Build & Test
@dotnet-bot test Debian8.4 pri1	Queues Debian8.4 x64 Release Pri 1 Build & Test
@dotnet-bot test OSX10.12 Release gc_reliability_framework	Queues OSX10.12 x64 Release GC Reliability Framework
@dotnet-bot test OSX10.12 Release gcsimulator	Queues OSX10.12 x64 Release GC Simulator
@dotnet-bot test OSX10.12 Release gcstress15_pri1r2r	Queues OSX10.12 x64 Release GCStress 15 R2R pri1 Build & Test
@dotnet-bot test OSX10.12 ilrt	Queues OSX10.12 x64 Release IL RoundTrip Build and Test
@dotnet-bot test OSX10.12 Release longgc	Queues OSX10.12 x64 Release Long-Running GC Build & Test
@dotnet-bot test OSX10.12 pri1	Queues OSX10.12 x64 Release Priority 1 Build and Test
@dotnet-bot test OSX10.12 Release pri1r2r	Queues OSX10.12 x64 Release R2R pri1 Build & Test
@dotnet-bot test OSX10.12 Release r2r	Queues OSX10.12 x64 Release R2R pri0 Build & Test
@dotnet-bot test OSX10.12 Release standalone_gc	Queues OSX10.12 x64 Release Standalone GC
@dotnet-bot test RHEL7.2 pri1	Queues RHEL7.2 x64 Release Pri 1 Build & Test
@dotnet-bot test Ubuntu Release gc_reliability_framework	Queues Ubuntu x64 Release GC Reliability Framework
@dotnet-bot test Ubuntu Release gcsimulator	Queues Ubuntu x64 Release GC Simulator
@dotnet-bot test Ubuntu Release gcstress15_pri1r2r	Queues Ubuntu x64 Release GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Ubuntu ilrt	Queues Ubuntu x64 Release IL RoundTrip Build and Test
@dotnet-bot test Ubuntu Release longgc	Queues Ubuntu x64 Release Long-Running GC Build & Test
@dotnet-bot test Ubuntu pri1	Queues Ubuntu x64 Release Priority 1 Build and Test
@dotnet-bot test Ubuntu Release pri1r2r	Queues Ubuntu x64 Release R2R pri1 Build & Test
@dotnet-bot test Ubuntu Release r2r	Queues Ubuntu x64 Release R2R pri0 Build & Test
@dotnet-bot test Ubuntu Release standalone_gc	Queues Ubuntu x64 Release Standalone GC
@dotnet-bot test Windows_NT Release gc_reliability_framework	Queues Windows_NT x64 Release GC Reliability Framework
@dotnet-bot test Windows_NT Release gcsimulator	Queues Windows_NT x64 Release GC Simulator
@dotnet-bot test Windows_NT Release gcstress15_pri1r2r	Queues Windows_NT x64 Release GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Windows_NT ilrt	Queues Windows_NT x64 Release IL RoundTrip Build and Test
@dotnet-bot test Windows_NT Release longgc	Queues Windows_NT x64 Release Long-Running GC Build & Test
@dotnet-bot test Windows_NT Release pri1r2r	Queues Windows_NT x64 Release R2R pri1 Build & Test
@dotnet-bot test Windows_NT Release r2r	Queues Windows_NT x64 Release R2R pri0 Build & Test
@dotnet-bot test Windows_NT Release standalone_gc	Queues Windows_NT x64 Release Standalone GC
@dotnet-bot test Ubuntu x86 Checked	Queues Ubuntu x86 Checked Build
@dotnet-bot test Windows_NT x86 Checked gcstress15_pri1r2r	Queues Windows_NT x86 Checked GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Ubuntu x86 Debug	Queues Ubuntu x86 Debug Build
@dotnet-bot test Ubuntu x86 Release	Queues Ubuntu x86 Release Build
@dotnet-bot test Windows_NT x86 Release gcstress15_pri1r2r	Queues Windows_NT x86 Release GCStress 15 R2R pri1 Build & Test
@dotnet-bot test Windows_NT x86 Release	Queues Windows_NT x86 Release Build and Test
@dotnet-bot test Windows_NT x86 legacy_backend Checked	Queues Windows_NT x86 legacy_backend Checked Build and Test

Have a nice day!

JosephTremoulet · 2017-08-10T18:29:49Z

@dotnet-bot test Windows_NT x86 Release gcstress15_pri1r2r test Windows_NT x86 corefx_jitstressregs3 test Windows_NT x86 Checked jitstress2 test Windows_NT Checked r2r_jitstressregs8 test Windows_NT jitstress1 test Windows_NT gcstress0xc test Ubuntu jitstress2_jitstressregs1

@BruceForstall feel free to add if I missed any important ones

AndyAyersMS

See inline comments...

AndyAyersMS · 2017-08-10T18:55:19Z

src/jit/optimizer.cpp

-                        }
-                    }
+                    // The "back-edge" we identified isn't actually part of the flow cycle containing ENTRY
+                    goto NO_LOOP;


Trying to make sure I follow here -- we have a lexical "backedge" from BOTTOM to TOP, and a branch from HEAD which is before TOP to ENTRY which sits between TOP and BOTTOM (inclusive), and there is a cycle at ENTRY that does not involve BOTTOM.

So it must be the case that ENTRY != TOP and presumably we may rediscover this loop later on as HEAD moves down and we find another backedge candidate contained within this one.

Do we need a similar check for TOP? Seems like we want to ensure that TOP and BOTTOM define the lexical extent of the loop so that the range-inclusive checks later work out.

Also in pathological cases where we do bail out here, we might trace the in-loop blocks from ENTRY multiple times. Given that we are working from outer-inner the in-loop block set may be large and so we may be repeating something that is potentially costly. Is there some way to instrument and see how likely or frequent this may be? If it is frequent, could we cache the loop sets when we bail out like this?

Or could we avoid the walks and detect this case early? Say if ENTRY doesn't dominate both BOTTOM and TOP then bail out, because BOTTOM->TOP can't be an in-loop edge?

You're following correctly. Our loop detection algorithm (both before and after this change) already walks all inclusive blocks of each loop and is hence quadratic, regardless of this bail-out. We cap the number of loops we track at 16, which I thought was what prevented this from running away; I've just now discovered that the code didn't actually bail out but instead kept looking for loops that it was doomed to discard, so I've added a bail-out in the update I just pushed.

Regarding "Do we need a similar check for TOP? Seems like we want to ensure that TOP and BOTTOM define the lexical extent of the loop so that the range-inclusive checks later work out" -- yes, we did, thanks. I was thinking that finding BOTTOM in the set was sufficient because it has TOP as a successor, but I had that backwards because the walk is visiting predecessors -- so finding TOP is sufficient because it has BOTTOM as a predecessor, and I've pushed an update accordingly.
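To make the predecessor-walk argument concrete, here is a minimal sketch, not the code in optimizer.cpp. The `PredMap` type and both function names are made up for illustration, blocks are plain integers, and the dominance check discussed above is left out. It only shows why testing for TOP in the visited set is enough: BOTTOM is a predecessor of TOP, so a walk that reaches TOP also reaches BOTTOM.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for the flow graph: block number -> predecessor block numbers.
using PredMap = std::map<int, std::vector<int>>;

// Walk predecessors starting at ENTRY, staying inside the lexical range [top..bottom].
// Simplified: the dominance check discussed in the review is omitted.
std::set<int> collectCycleBlocks(const PredMap& preds, int top, int entry, int bottom)
{
    assert(top <= entry && entry <= bottom);
    std::set<int>    loopBlocks{entry};
    std::vector<int> worklist{entry};
    while (!worklist.empty())
    {
        int block = worklist.back();
        worklist.pop_back();
        auto it = preds.find(block);
        if (it == preds.end())
        {
            continue; // no recorded predecessors
        }
        for (int pred : it->second)
        {
            if (pred < top || pred > bottom)
            {
                continue; // flow from outside the range, e.g. from HEAD
            }
            if (loopBlocks.insert(pred).second)
            {
                worklist.push_back(pred); // first visit: walk its predecessors too
            }
        }
    }
    return loopBlocks;
}

// BOTTOM is a predecessor of TOP, so if the walk reaches TOP it also reaches BOTTOM.
bool backEdgeIsInCycle(const PredMap& preds, int top, int entry, int bottom)
{
    return collectCycleBlocks(preds, top, entry, bottom).count(top) != 0;
}
```

With HEAD=1 jumping to ENTRY=3, a self-loop on 3, and a lexical back-edge 4->2 where block 2 only jumps out to 5, the walk from ENTRY never reaches TOP=2, which is the bail-out case described above.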

AndyAyersMS · 2017-08-10T18:58:35Z

src/jit/optimizer.cpp

+                                    if ((destNum >= top->bbNum) && (destNum <= bottom->bbNum) &&
+                                        !loopBlocks.isMember(destNum))
+                                    {
+                                        // Reversing this branch out of block `next` could confuse this algorithm, so


This bit confused me -- we're not going to change anything about the way next branches if we insert blocks before it. So should these tests be looking at moveAfter (in which case -- assuming BBJ_NONE is not possible -- only the BBJ_COND check makes sense)?

Here we're not happy that moveAfter splits up fall-through, so we're checking to see if next is a viable candidate to become our moveAfter -- in which case we would be inserting blocks after it. So I believe the code is correct but hard to follow. I've added a comment up where next is defined to try to clarify a bit, let me know if you think something else would help.

Ah, you have an invariant that moveAfter is always a viable (if not optimal) place to move the blocks.

Maybe the other break cases confused me -- presumably you have a range of candidate blocks to insert after that starts with BOTTOM and extends all the way to the end of the EH region that contains the blocks you'd like to move.

So for these "reversing would confuse" cases why wouldn't you keep going instead of breaking the search there and accepting the current moveAfter? And if just after BOTTOM is a viable point then presumably you could just search from there to the end for a non-fallthrough and if you find one, use it, otherwise just insert after BOTTOM.

Also note your search loop doesn't have to worry about walking off the end of the method since the last block is guaranteed to be non-fallthrough. So if you are not in an EH region you might just move the blocks to the end of the method and skip the searching altogether.

Those branches would still be reversed if we moved the blocks anywhere after next.

The clause where we don't search into a different EH region is one where we could later find another viable candidate (if the blocks after the loop fall through into a new try and then back out of it), but I'm not sure it's worth the complexity to keep searching for those cases.

Regarding "if you find one, use it, otherwise just insert after BOTTOM", were you just sketching out how an algorithm that kept walking would work, or do you have a reason to think that BOTTOM is a better insertion point than the last legal block?

Those branches would still be reversed if we moved the blocks anywhere after next.

That's the bit I'm missing. If the code can find a fallthrough, then the "fixup" for the in loop blocks is something it can already handle, so I don't see yet how skipping past a problematic next to some non-fallthrough spot below it wouldn't work. Maybe you just need to be a bit more explicit on what exactly about reversal leads to confusion.

Regarding "if you find one, use it, otherwise just insert after BOTTOM", were you just sketching out how an algorithm that kept walking would work, or do you have a reason to think that BOTTOM is a better insertion point than the last legal block?

Just thinking how I would have started writing this bit. I probably would have tried to leverage fgFindInsertPoint to do the actual searching, and looking at it now, it seems like there might be logic there you need here, eg a BBJ_CALLFINALLY does not fall through but you can't safely move new blocks after it.

As far as a "best" insertion point -- not creating a new block is preferable from a TP standpoint. Other than that I would say try and avoid creating new lexically backwards branches. Insert after BOTTOM would guarantee this, though you could get the same guarantee if you moved the run of blocks before any post-loop block they reach, and if there is just one such block, maybe try and move the run just before it.

I suppose it would also be nice to avoid moving the blocks into another loop where you'll probably just end up moving them again later on, but that might be tricky to pull off or maybe doesn't happen with any frequency. You could perhaps use BBF_BACKWARD_JUMP as a screen.

None of this is all that important, if what you have now is correctly moving blocks. If so I would not worry as much about trying to make it optimal, unless you already see issues that come from particular placement choices.

Maybe you just need to be a bit more explicit on what exactly about reversal leads to confusion.

The algorithm expects that if it can find an edge from a block with higher number to a block with lower number, that it has found a lexically backward edge and therefore that walking bbNext pointers from the successor will find the predecessor. It also assumes, when walking the loop blocks, that any block it visits with number between TOP and BOTTOM is lexically between the two.
Performing any motion that reverses the lexical direction of a flow edge risks breaking those invariants.
(and the motion we do perform can mean that blocks lexically between TOP and BOTTOM in a subsequent loop have numbers less than TOP's number, but I believe that can only cause conservatism [we'll treat flow into or out of that region as exits and side-entries when we maybe didn't need to], not correctness issues)
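A tiny sketch of the first invariant, using a made-up two-field `Block` in place of BasicBlock and hypothetical helper names. It shows how moving a block below a later block that branches to it leaves the numbers saying "backward" while the bbNext walk no longer agrees:

```cpp
#include <cassert>

struct Block // hypothetical miniature of BasicBlock
{
    int    bbNum;
    Block* bbNext;
};

// True if walking bbNext pointers from `to` reaches `from`,
// i.e. the edge from->to really is lexically backward.
bool reachesByNext(const Block* to, const Block* from)
{
    assert(to != nullptr && from != nullptr);
    for (const Block* b = to; b != nullptr; b = b->bbNext)
    {
        if (b == from)
        {
            return true;
        }
    }
    return false;
}

// The shortcut the loop finder relies on: decreasing numbers mean a backward edge.
bool looksBackwardByNumber(const Block* from, const Block* to)
{
    return from->bbNum >= to->bbNum;
}
```

Take the layout B1, B2, B3, B4 with an edge B4->B2. Both functions agree it is backward. If B2 is then moved after B4 without renumbering, `looksBackwardByNumber` still returns true but `reachesByNext(B2, B4)` returns false.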

avoid creating new lexically backwards branches

I think that's a correctness constraint, and that's what this code is trying to do.

Insert after BOTTOM would guarantee this, though ...

I started with always inserting just after bottom, but the diffs showed lots of cases where that led to ugly layout; in particular, given a small method with a search loop that has an early return and the normal loop exit falls through to the other return, it unnecessarily puts an unconditional jump on the normal exit path.
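For reference, the source shape in question looks roughly like the following. This is an illustrative C++ function with a made-up name, not one of the methods from the diffs, and it says nothing about how any particular compiler lays it out:

```cpp
#include <cassert>
#include <cstddef>

// Returns the index of `key` in `data`, or -1 if it is absent.
int indexOf(const int* data, std::size_t count, int key)
{
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (data[i] == key)
        {
            // Early exit: lexically inside the loop, but not part of the flow cycle.
            return static_cast<int>(i);
        }
    }
    // Normal exit: the loop falls through to this return.
    return -1;
}
```

Placing the early-return block directly after the loop's last block would separate the loop from the `return -1` it falls through to, which is where the extra unconditional jump on the normal exit path comes from.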

you could get the same guarantee if you moved the run of blocks before any post-loop block they reach, and if there is just one such block, maybe try and move the run just before it.

That's really what this code is trying to do. And since I can't guarantee correctness if I move past any such block, the search stops at the first such block and we move the run just before it.

I probably would have tried to leverage fgFindInsertPoint to do the actual searching, and looking at it now, it seems like there might be logic there you need here, eg a BBJ_CALLFINALLY does not fall through but you can't safely move new blocks after it.

Thanks, I didn't know about fgFindInsertPoint or that constraint; I'll have to take a look and update...

The other thought I've had: defer the actual motion to a post-pass, and just identify the set of blocks that do not belong and can be moved out. I suppose you might also need to simulate moving them...

I looked through fgFindInsertPoint and think the Call/Always pair check is the only thing to bring over, so have pushed an update that includes it.

defer the actual motion to a post-pass, and just identify the set of blocks that do not belong and can be moved out. I suppose you might also need to simulate moving them...

I considered that, but wanted to first see if it was tenable to do the in-place transform to avoid the extra heap traffic, and I'm happy with the result. And I agree, the "simulate moving them" bookkeeping might end up just as hairy.

AndyAyersMS · 2017-08-10T19:03:12Z

src/jit/optimizer.cpp

+                        // This code is lexically between TOP and BOTTOM, but it does not
+                        // participate in the flow cycle.  Check for a run of consecutive
+                        // such blocks.
+                        BasicBlock* lastExitBlock = block;


lastExitBlock sounds a lot like a name for a block that is in a loop. Maybe lastNonLoopBlock?

AndyAyersMS · 2017-08-10T19:16:05Z

src/jit/optimizer.cpp

+                BasicBlock* moveAfter = nullptr;
+                for (BasicBlock* previous = top->bbPrev; previous != bottom;)
+                {
+                    BasicBlock* block = previous->bbNext;


I wonder if this would be clearer if it was conceptually split up as follows (with appropriate bits of state passed in and out):

    for block in range [top..bottom]
        previous = block->bbPrev;
        if block is in loop:
            CountLoopExit(...);
            next = block->bbNext
        if block is not in loop:
            next = MoveBlockRangeOutOfLoopOrTolerateInLoop(...) // ret null if fails

I agree the function is more monolithic than ideal, so I'll see if I can find an appropriate way to refactor, but at first blush I don't think it would end up reading better, given the amount of state that would need to get passed in and particularly that would need to get passed out, not to mention I'd need to pull the LoopBlockSet definition out of line.

I wonder if this would be clearer if it was conceptually split up...

I agree the function is more monolithic than ideal... the amount of state that would need to get passed in and particularly that would need to get passed out,..

Of course, I could wrap that state up in a class and pass it around implicitly. @AndyAyersMS, let me know if you think JosephTremoulet@0861de1 is an improvement and I can clean it up and push it to the PR branch.

Thanks for considering this -- I think it is an improvement and really helps clarify what is going on.

AndyAyersMS · 2017-08-10T19:17:09Z

src/jit/optimizer.cpp

+
+                        if (moveAfter->bbFallsThrough())
+                        {
+                            // We've just inserted blocks between moveAfter and moveBefore which it was supposed


Would be clearer to me if moveBefore was actually a thing here. It took me a while to figure out that lastExitBlock->bbNext was the block that moveAfter falls through to originally.

Good point, defined moveBefore in the update.

briansull · 2017-08-10T19:46:30Z

src/jit/optimizer.cpp

-                            exitPoint = loopBlock->bbJumpDest;
+                        if (blockNum > oldBlockMaxNum)
+                        {
+                            BlockSetOps::AddElemD(comp, newBlocksInLoop, blockNum - oldBlockMaxNum);


You may want to add a comment explaining that newBlocksInLoop will only cover one extra block per existing block, and is represented using (blockNum - oldBlockMaxNum)
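As a rough illustration of that representation, here is a simplified sketch. The class name, `std::vector<bool>` storage, and method names are assumptions for the example; the PR itself uses BlockSetOps. Only the offset indexing and the "one extra block per existing block" capacity are taken from the comment above:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified loop block set that can also hold blocks
// created after the original block numbering was fixed.
class LoopBlockSet
{
    unsigned          oldBlockMaxNum; // highest block number before any new blocks were added
    std::vector<bool> oldBlocks;      // indexed by blockNum, for 1..oldBlockMaxNum
    std::vector<bool> newBlocks;      // indexed by (blockNum - oldBlockMaxNum)

public:
    explicit LoopBlockSet(unsigned maxNum)
        : oldBlockMaxNum(maxNum), oldBlocks(maxNum + 1, false), newBlocks(maxNum + 1, false)
    {
    }

    void insert(unsigned blockNum)
    {
        // Room for at most one new block per pre-existing block.
        assert(blockNum >= 1 && blockNum <= 2 * oldBlockMaxNum);
        if (blockNum > oldBlockMaxNum)
        {
            newBlocks[blockNum - oldBlockMaxNum] = true;
        }
        else
        {
            oldBlocks[blockNum] = true;
        }
    }

    bool isMember(unsigned blockNum) const
    {
        if (blockNum == 0 || blockNum > 2 * oldBlockMaxNum)
        {
            return false;
        }
        return (blockNum > oldBlockMaxNum) ? newBlocks[blockNum - oldBlockMaxNum] : oldBlocks[blockNum];
    }
};
```

With `oldBlockMaxNum` of 10, block 12 lands in slot 2 of the second set, and block numbers above 20 cannot be represented at all.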

briansull · 2017-08-10T19:50:56Z

src/jit/optimizer.cpp

                {
-                    if (predEntry->flBlock->bbNum >= top->bbNum && predEntry->flBlock->bbNum <= bottom->bbNum)
+                    BasicBlock* block = worklist.back();
+                    worklist.pop_back();


Can the worklist contain blocks that are (blockNum > oldBlockMaxNum) ?
If so, fgDominate won't work and you will have to use positionNum []

It can; good catch, thanks. Updated, but rather than using positionNum (since fgDominate takes the blocks), I just defer the check until processing the predecessor.

briansull · 2017-08-10T19:56:27Z

src/jit/optimizer.cpp

@@ -1483,12 +1483,17 @@ void Compiler::optFindNaturalLoops()

     */

+    bool          mod = false;
    BasicBlock*   head;
    BasicBlock*   top;
    BasicBlock*   bottom;
    BasicBlock*   entry;
    BasicBlock*   exit;


We may want to rename this local: exit as it appears to be colored as if it is a keyword.

Done. I separated lastExit and onlyExit while at it, which makes it read more clearly I think.

briansull · 2017-08-10T19:58:35Z

tests/src/JIT/Regression/JitBlue/GitHub_9692/GitHub_9692.cs

+        }
+
+        // Can't move out of loop without crossing try region boundary; should leave in loop
+        // Lopo should still be recognized, and invariant multiplication hoisted out of it.


spelling: Lopo

JosephTremoulet · 2017-08-11T21:09:06Z

I pushed an update to fix an issue found in desktop testing where creating a goto-next caused an assertion failure. If desktop and normal CI testing are green with the fix, I'll squash the updates and re-trigger stress runs.

AndyAyersMS · 2017-08-14T14:49:43Z

src/jit/optimizer.cpp

+                                continue;
+                            }
+
+                            // There are multiple entries to this loop, don't consider it.


This "side entry" could also be from a disjoint part of the loop. It might be interesting someday to work through relaxing this condition. Something like: let the pred closure happen, since it seems like the new dominates check should prevent wandering outside the loop, and see how many disjoint bodies there are out there.

If these are common, we may now have the machinery to compact them too.

JosephTremoulet · 2017-08-15T15:44:09Z

Updated to include refactoring.

JosephTremoulet · 2017-08-15T17:13:24Z

@dotnet-bot re-test OSX10.12 x64 Checked Build and Test (infrastructure failure)

JosephTremoulet · 2017-08-16T03:49:16Z

@dotnet-bot test Windows_NT x86 Release gcstress15_pri1r2r test Windows_NT x86 corefx_jitstressregs3 test Windows_NT x86 Checked jitstress2 test Windows_NT Checked r2r_jitstressregs8 test Windows_NT jitstress1 test Windows_NT gcstress0xc test Ubuntu jitstress2_jitstressregs1

briansull · 2017-08-16T17:22:37Z

src/jit/optimizer.cpp

+//             but not the outer loop. ???)
+//   TOP     - the target of the backward edge from BOTTOM. In most cases FIRST and TOP are the same.
+//   BOTTOM  - the lexically last block in the loop (i.e. the block from which we jump to the top)
+//   EXIT    - the predecessor of loop's unique exit edge, if it has a unique exit edge; else nullptr


Should this be changed to LastExit?

briansull · 2017-08-16T17:24:48Z

src/jit/compiler.h

+                       BasicBlock*   top,
+                       BasicBlock*   entry,
+                       BasicBlock*   bottom,
+                       BasicBlock*   exit,


Should this be changed to lastexit?

briansull · 2017-08-16T17:25:28Z

src/jit/optimizer.cpp

 */

-void Compiler::optRecordLoop(BasicBlock*   head,
+bool Compiler::optRecordLoop(BasicBlock*   head,


exit => lastexit?

It shouldn't be lastExit, because that is just something we track while identifying the loop and then throw away if there were multiple exits. The code that checks for multiple exits has a local called onlyExit that it populates iff lastExit was unique. So if we want to rename this parameter, the comments describing loop structure, and the lpExit field in the loop table, we could change it to onlyExit or perhaps better uniqueExit, but that seems unnecessarily verbose to me...
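The distinction between the two names can be sketched as follows. This is an illustrative reduction, not the JIT's implementation: `ExitInfo`, `noteExit`, `kNoBlock`, and the use of integer block ids are assumptions; only the names `lastExit` and `onlyExit` come from the comment above.

```cpp
// Sentinel meaning "no block"; stands in for nullptr in this toy model.
const int kNoBlock = -1;

// Hypothetical bookkeeping kept while a loop is being identified.
struct ExitInfo
{
    int exitCount = 0;        // number of exit edges seen so far
    int lastExit  = kNoBlock; // source block of the most recent exit edge
};

// Record one exit edge whose source is block `blockId`.
void noteExit(ExitInfo& info, int blockId)
{
    info.exitCount++;
    info.lastExit = blockId;
}

// The value worth recording for the loop: the exit block if it is unique,
// otherwise "none". lastExit alone cannot tell these cases apart.
int onlyExit(const ExitInfo& info)
{
    return (info.exitCount == 1) ? info.lastExit : kNoBlock;
}
```

In this model `lastExit` is always set once any exit has been seen, so only the filtered `onlyExit` value matches the "unique exit, else nullptr" wording of the EXIT comment.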

JosephTremoulet · 2017-08-16T21:09:40Z

I pushed a fix for an issue discovered by stress testing.

BruceForstall · 2017-08-16T21:39:43Z

src/jit/optimizer.cpp

+//   HEAD    - the basic block that flows into the loop ENTRY block (Currently MUST be lexically before entry).
+//             Not part of the looping of the loop.
+//   FIRST   - the lexically first basic block (in bbNext order) within this loop.  (May be part of a nested loop,
+//             but not the outer loop. ???)


???)

You copied the "???" from before, but since you've been working on this code: do you know the answer?

I don't understand the question. Is the question whether first may be part of "a nested loop", and also whether it may be part of "the outer loop"? If so, "nested" and "outer" with respect to what? Why is it "a nested loop" vs. "the outer loop"? If we're talking about the full loop nest forest, then there aren't any restrictions on how deeply nested a loop with a "first" is. If we're talking about whether a given loop's "first" may be included in any "outer" loops enclosing the given one, then yes of course it is a member of all of them. If we're talking about whether a given loop's "first" may be included in any "nested" loops enclosed by the given one, then:

- as far as LoopSearch is concerned, there's nothing conceptually barring that

- there's code called after LoopSearch, but still within optFindNaturalLoops, that canonicalizes loops; its comments say the point is to have no two loops share top, but then the implementation talks about sharing first

- in the current implementation, first and top are always the same, and the search algorithm in LoopSearch will never find two loops that share one

So I don't know what was supposed to be said here. Should I just remove the whole parenthetical?

Should I just remove the whole parenthetical?

I'd be ok with that.

Perhaps a discussion somewhere (here?) about the relationship between all the concepts here (FIRST/TOP/etc.) and nested loops. e.g., can a FIRST be the FIRST of multiple loops?

Clarified comments in 73f1167

Thanks for prompting me to take a closer look at this. I had misread and thought the old code wouldn't identify loops that shared a TOP or ENTRY or FIRST -- on closer read, it allowed it as long as entry's predecessor list had an inner-loop predecessor prior to having an outer-loop predecessor (oddly). So I've fixed my code to recognize nested loops that share TOP/FIRST/ENTRY in 13bcb53 (before refactoring) and 1091e89 + a9c20d2 (after refactoring), without the bizarro restriction on predecessor list.

Yeah, this is the whole "keep the pred list sorted" thing that came up over in #13322. If loop recognition no longer needs this then maybe we can drop the whole thing.

BruceForstall · 2017-08-16T21:41:18Z

src/jit/optimizer.cpp

+//        v
+//      head
+//        |
+//        |    top/beg <--+


Should "beg" in "top/beg" be "first"? Should there be an example where TOP and FIRST are not the same?

Probably should be "first". We do not currently have any examples where TOP and FIRST are not the same.

Fixed in 73f1167
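As a rough illustration of how the HEAD/FIRST/TOP/ENTRY/BOTTOM/EXIT roles relate to the layout goal of this change, the sketch below models blocks by their lexical index and checks whether a loop body is contiguous. `LoopDesc` and `isContiguous` are hypothetical names, not the loop table's actual declarations, and the set of cycle blocks is assumed to be known in advance.

```cpp
#include <cassert>
#include <set>

// Hypothetical loop descriptor; each field is a block's lexical index.
struct LoopDesc
{
    int head;   // flows into ENTRY; not part of the cycle
    int first;  // lexically first block of the loop
    int top;    // target of the back edge from BOTTOM
    int entry;  // block through which the loop is entered
    int bottom; // lexically last block of the loop
    int exit;   // source of the unique exit edge, or -1 if none
};

// A loop is laid out contiguously when every block in the lexical
// range [first, bottom] participates in the flow cycle.
bool isContiguous(const LoopDesc& loop, const std::set<int>& cycleBlocks)
{
    assert(loop.first <= loop.bottom);
    for (int i = loop.first; i <= loop.bottom; ++i)
    {
        if (cycleBlocks.count(i) == 0)
        {
            return false; // a non-cycle block sits inside the range
        }
    }
    return true;
}
```

For example, with `first = 1` and `bottom = 3`, cycle blocks `{1, 2, 3}` are contiguous, while `{1, 3}` leaves block 2 (say, an early-exit path) inside the range, which is the kind of block this change tries to move below the loop.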

JosephTremoulet · 2017-08-18T10:18:35Z

@dotnet-bot
test Windows_NT x86 corefx_jitstressregs3
test Windows_NT x86 Checked jitstress2
test Windows_NT Checked r2r_jitstressregs8
test Windows_NT jitstress1
test Ubuntu jitstress2_jitstressregs1

JosephTremoulet · 2017-08-18T12:55:07Z

Stress failure matches baseline

JosephTremoulet · 2017-08-18T14:30:40Z

Memstats reports 0.1% more allocated bytes in release System.Private.CoreLib crossgen:

diff --git "a/memstats-base.txt" "b/memstats-diff.txt"
index baa2dee93..9650c0d99 100644
--- "a/memstats-base.txt"
+++ "b/memstats-diff.txt"
@@ -1,133 +1,133 @@
 
 D:\Source\coreclr>D:\Source\coreclr\bin\Product\Windows_NT.x64.Release\crossgen.exe /Platform_Assemblies_Paths D:\Source\coreclr\bin\Product\Windows_NT.x64.Release\IL /out D:\Source\coreclr\bin\Product\Windows_NT.x64.Release\System.Private.CoreLib.dll D:\Source\coreclr\bin\Product\Windows_NT.x64.Release\IL\System.Private.CoreLib.dll 
 
 All allocations:
 For     26291 methods:
-  count:           20198079 (avg     768 per method)
+  count:           20269450 (avg     770 per method)
-  alloc size :   1813175957 (avg   68965 per method)
+  alloc size :   1815665942 (avg   69060 per method)
   max alloc  :        86912
 
-  allocateMemory   :   3336175616 (avg  126894 per method)
+  allocateMemory   :   3338403840 (avg  126978 per method)
-  nraUsed    :   2546966432 (avg   96875 per method)
+  nraUsed    :   2549453232 (avg   96970 per method)
 
 Alloc'd bytes by kind:
                   kind |       size |     pct
   ---------------------+------------+--------
-         AssertionProp |  170463080 |   9.40%
+         AssertionProp |  170463080 |   9.39%
-               ASTNode |  332156679 |  18.32%
+               ASTNode |  332244447 |  18.30%
-              InstDesc |   49788288 |   2.75%
+              InstDesc |   49833836 |   2.74%
               ImpStack |     498576 |   0.03%
-            BasicBlock |   59676128 |   3.29%
+            BasicBlock |   59727608 |   3.29%
-             fgArgInfo |   10293192 |   0.57%
+             fgArgInfo |   10293752 |   0.57%
-       fgArgInfoPtrArr |    1470456 |   0.08%
+       fgArgInfoPtrArr |    1470536 |   0.08%
-              FlowList |    7445776 |   0.41%
+              FlowList |    8109200 |   0.45%
-     TreeStatementList |    1820768 |   0.10%
+     TreeStatementList |    1823072 |   0.10%
-               SiScope |    8998360 |   0.50%
+               SiScope |    8832344 |   0.49%
         FlatFPStateX87 |          0 |   0.00%
-       DominatorMemory |    5762000 |   0.32%
+       DominatorMemory |    6364728 |   0.35%
-                  LSRA |   88207680 |   4.86%
+                  LSRA |   88227684 |   4.86%
-         LSRA_Interval |   55517616 |   3.06%
+         LSRA_Interval |   55520168 |   3.06%
-      LSRA_RefPosition |  171685080 |   9.47%
+      LSRA_RefPosition |  171686480 |   9.46%
           Reachability |     504992 |   0.03%
-                   SSA |   43350424 |   2.39%
+                   SSA |   43346104 |   2.39%
-           ValueNumber |  338346518 |  18.66%
+           ValueNumber |  338356155 |  18.64%
-              LvaTable |   82502848 |   4.55%
+              LvaTable |   82502872 |   4.54%
-            UnwindInfo |     191968 |   0.01%
+            UnwindInfo |     191744 |   0.01%
-                hashBv |    5633656 |   0.31%
+                hashBv |    5633568 |   0.31%
-                bitset |   28625144 |   1.58%
+                bitset |   28714696 |   1.58%
           FixedBitVect |       1056 |   0.00%
-          AsIAllocator |  150184316 |   8.28%
+          AsIAllocator |  150910712 |   8.31%
         IndirAssignMap |      85328 |   0.00%
          FieldSeqStore |    4782488 |   0.26%
     ZeroOffsetFieldMap |    1283880 |   0.07%
-          ArrayInfoMap |    2489776 |   0.14%
+          ArrayInfoMap |    2490016 |   0.14%
-          MemoryPhiArg |     945952 |   0.05%
+          MemoryPhiArg |     946720 |   0.05%
-                   CSE |   38697824 |   2.13%
+                   CSE |   38698800 |   2.13%
-                    GC |   65404271 |   3.61%
+                    GC |   65393663 |   3.60%
                 CorSig |    6770920 |   0.37%
-              Inlining |   36225376 |   2.00%
+              Inlining |   36172960 |   1.99%
-            ArrayStack |   13125504 |   0.72%
+            ArrayStack |   13500672 |   0.74%
-             DebugInfo |    9907920 |   0.55%
+             DebugInfo |    9911400 |   0.55%
              DebugOnly |          0 |   0.00%
                Codegen |          0 |   0.00%
-               LoopOpt |      67200 |   0.00%
+               LoopOpt |      67440 |   0.00%
-             LoopHoist |    1586920 |   0.09%
+             LoopHoist |    1621912 |   0.09%
-               Unknown |   18677997 |   1.03%
+               Unknown |   18682333 |   1.03%
 
 
 Largest method:
-count:      78538, size:    4085578, max =      59992
+count:      80075, size:    4146298, max =      59992
-allocateMemory:    4259840, nraUsed:    4153976
+allocateMemory:    4325376, nraUsed:    4214760
 
 Alloc'd bytes by kind:
                   kind |       size |     pct
   ---------------------+------------+--------
          AssertionProp |       6460 |   0.16%
-               ASTNode |    1033792 |  25.30%
+               ASTNode |    1035328 |  24.97%
-              InstDesc |      68804 |   1.68%
+              InstDesc |      68804 |   1.66%
               ImpStack |        192 |   0.00%
-            BasicBlock |      88272 |   2.16%
+            BasicBlock |      90384 |   2.18%
-             fgArgInfo |      22064 |   0.54%
+             fgArgInfo |      22064 |   0.53%
        fgArgInfoPtrArr |       3152 |   0.08%
-              FlowList |      62640 |   1.53%
+              FlowList |      73232 |   1.77%
-     TreeStatementList |      10528 |   0.26%
+     TreeStatementList |      10528 |   0.25%
                SiScope |       8960 |   0.22%
         FlatFPStateX87 |          0 |   0.00%
-       DominatorMemory |      41120 |   1.01%
+       DominatorMemory |      48048 |   1.16%
-                  LSRA |     114708 |   2.81%
+                  LSRA |     117876 |   2.84%
-         LSRA_Interval |     141944 |   3.47%
+         LSRA_Interval |     141944 |   3.42%
-      LSRA_RefPosition |     446656 |  10.93%
+      LSRA_RefPosition |     446656 |  10.77%
           Reachability |         32 |   0.00%
-                   SSA |      55236 |   1.35%
+                   SSA |      55428 |   1.34%
-           ValueNumber |     342067 |   8.37%
+           ValueNumber |     342491 |   8.26%
-              LvaTable |      82776 |   2.03%
+              LvaTable |      82776 |   2.00%
             UnwindInfo |         32 |   0.00%
                 hashBv |       5400 |   0.13%
-                bitset |     635680 |  15.56%
+                bitset |     644168 |  15.54%
           FixedBitVect |          0 |   0.00%
-          AsIAllocator |     660700 |  16.17%
+          AsIAllocator |     685644 |  16.54%
         IndirAssignMap |          0 |   0.00%
          FieldSeqStore |        440 |   0.01%
     ZeroOffsetFieldMap |         64 |   0.00%
-          ArrayInfoMap |      15800 |   0.39%
+          ArrayInfoMap |      15800 |   0.38%
-          MemoryPhiArg |       3184 |   0.08%
+          MemoryPhiArg |       3216 |   0.08%
                    CSE |      18528 |   0.45%
-                    GC |      96796 |   2.37%
+                    GC |      96796 |   2.33%
                 CorSig |       8528 |   0.21%
-              Inlining |      25680 |   0.63%
+              Inlining |      25680 |   0.62%
-            ArrayStack |      40448 |   0.99%
+            ArrayStack |      42624 |   1.03%
              DebugInfo |       9576 |   0.23%
              DebugOnly |          0 |   0.00%
                Codegen |          0 |   0.00%
-               LoopOpt |       4320 |   0.11%
+               LoopOpt |       4320 |   0.10%
              LoopHoist |       7928 |   0.19%
-               Unknown |      23071 |   0.56%
+               Unknown |      23199 |   0.56%
 
 
 ---------------------------------------------------
 Distribution of total memory allocated per method (in KB):
      <=         20 ===>       0 count (  0% of total)
      21 ..      50 ===>       0 count (  0% of total)
      51 ..      75 ===>   11123 count ( 42% of total)
      76 ..     100 ===>       0 count ( 42% of total)
-    101 ..     150 ===>   11091 count ( 84% of total)
+    101 ..     150 ===>   11079 count ( 84% of total)
-    151 ..     250 ===>    2400 count ( 93% of total)
+    151 ..     250 ===>    2407 count ( 93% of total)
-    251 ..     500 ===>    1455 count ( 99% of total)
+    251 ..     500 ===>    1459 count ( 99% of total)
     501 ..    1000 ===>     151 count ( 99% of total)
-   1001 ..    5000 ===>      71 count (100% of total)
+   1001 ..    5000 ===>      72 count (100% of total)
 
 ---------------------------------------------------
 Distribution of total memory used      per method (in KB):
      <=         20 ===>       0 count (  0% of total)
      21 ..      50 ===>    6051 count ( 23% of total)
      51 ..      75 ===>    8470 count ( 55% of total)
-     76 ..     100 ===>    4944 count ( 74% of total)
+     76 ..     100 ===>    4937 count ( 74% of total)
-    101 ..     150 ===>    3999 count ( 89% of total)
+    101 ..     150 ===>    3997 count ( 89% of total)
-    151 ..     250 ===>    1964 count ( 96% of total)
+    151 ..     250 ===>    1971 count ( 96% of total)
-    251 ..     500 ===>     683 count ( 99% of total)
+    251 ..     500 ===>     675 count ( 99% of total)
-    501 ..    1000 ===>     109 count ( 99% of total)
+    501 ..    1000 ===>     119 count ( 99% of total)
    1001 ..    5000 ===>      71 count (100% of total)
 Microsoft (R) CoreCLR Native Image Generator - Version 4.5.22220.0
 Copyright (c) Microsoft Corporation.  All rights reserved.
 
 Native image D:\Source\coreclr\bin\Product\Windows_NT.x64.Release\System.Private.CoreLib.dll generated successfully.
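The headline figure can be reproduced from the "alloc size" totals in the listing above. The helper below is only a worked-arithmetic sketch; `percentIncrease` is a hypothetical name and not part of any memstats tooling.

```cpp
#include <cassert>

// Relative growth from `base` to `diff`, expressed in percent.
double percentIncrease(double base, double diff)
{
    assert(base > 0.0);
    return (diff - base) / base * 100.0;
}
```

Calling `percentIncrease(1813175957.0, 1815665942.0)` on the two "alloc size" totals gives roughly 0.14%, consistent with the 0.1% quoted. Applied to individual rows, the same formula shows the growth is concentrated in a few kinds: FlowList rises by about 8.9% (7445776 to 8109200) and DominatorMemory by about 10.5% (5762000 to 6364728), while most other kinds move by well under 1%.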

JosephTremoulet · 2017-08-18T15:29:31Z

I've verified there are no SuperPMI desktop asm diffs for the refactoring (99edc2b).

BruceForstall · 2017-08-18T15:33:42Z

Awesome. Thanks.

Remove some `goto`s that were added to work around #9692 (poor code layout for loop exit paths) -- the JIT's layout decisions were improved in dotnet#13314, and these particular `goto`s are no longer needed; crossgen of System.Private.CoreLib now produces the same machine code with or without this change. Part of #13466.

Remove some `goto`s that were added to work around dotnet/coreclr#9692 (poor code layout for loop exit paths) -- the JIT's layout decisions were improved in dotnet/coreclr#13314, and these particular `goto`s are no longer needed; the same machine code is generated with or without this change. Some `goto`s previously tagged as workarounds for dotnet/coreclr#9692 are still relevant for keeping codesize down pending dotnet/coreclr#13549; update their comments accordingly. Part of #23395.

JosephTremoulet · 2017-08-29T21:58:22Z

@redknightlois, this has made it into preview feeds as of Microsoft.NETCore.App 2.1.0-preview2-25628-01 -- with `<TargetFramework>netcoreapp2.1</TargetFramework>` and `<RuntimeFrameworkVersion>2.1.0-preview2-25628-01</RuntimeFrameworkVersion>` in your .csproj and `<add key="dotnet core" value="https://dotnet.myget.org/F/dotnet-core/api/v3/index.json" />` in your NuGet.config, you can try it out.

Remove some `goto`s that were added to work around undesirable jit layout (#9692, fixed in dotnet#13314) and epilog factoring (improved in dotnet#13792 and dotnet#13903), which are no longer needed. Resolves #13466.

JosephTremoulet requested review from AndyAyersMS and briansull August 10, 2017 14:57

dnfclas added the cla-already-signed label Aug 10, 2017

JosephTremoulet force-pushed the loops branch from c3c9be5 to 2c598e5 Compare August 10, 2017 18:20

dotnet deleted a comment from dotnet-bot Aug 10, 2017

AndyAyersMS reviewed Aug 10, 2017

briansull reviewed Aug 10, 2017

JosephTremoulet force-pushed the loops branch from d6fd2b0 to 00ea5af Compare August 10, 2017 21:04

AndyAyersMS reviewed Aug 14, 2017

stephentoub mentioned this pull request Aug 14, 2017

HttpClient: several HeaderDescriptor-related improvements dotnet/corefx#23186

Merged

JosephTremoulet force-pushed the loops branch from 567bc81 to 2f9be5a Compare August 15, 2017 15:40

JosephTremoulet force-pushed the loops branch from 2f9be5a to e0fcbc9 Compare August 15, 2017 16:07

JosephTremoulet force-pushed the loops branch from e0fcbc9 to 9d8c083 Compare August 16, 2017 03:47

briansull reviewed Aug 16, 2017

briansull approved these changes Aug 16, 2017

BruceForstall reviewed Aug 16, 2017

JosephTremoulet added 2 commits August 17, 2017 19:36

Refactor loop identification into a class

99edc2b

This is mainly done to increase readability, as `optFindNaturalLoops` had grown excessively large. It also facilitates re-using code to fix up fallthrough, and skipping past CallFinally/BBJ_ALWAYS pairs rather than aborting once they're found.

Add perf test

8127531

JosephTremoulet force-pushed the loops branch from 29ce6e1 to 8127531 Compare August 17, 2017 23:40

JosephTremoulet merged commit 46bfc27 into dotnet:master Aug 18, 2017

JosephTremoulet deleted the loops branch August 18, 2017 16:25

JosephTremoulet mentioned this pull request Aug 21, 2017

Undo a few JIT layout workarounds #13505

Merged

JosephTremoulet mentioned this pull request Aug 22, 2017

Maintain sorted preds part 1 #13322

Closed

JosephTremoulet mentioned this pull request Aug 23, 2017

Undo a few JIT layout workarounds dotnet/corefx#23510

Merged

karelz modified the milestone: 2.1.0 Aug 28, 2017

benaadams mentioned this pull request Sep 1, 2017

Discussion: C# Break nested loop dotnet/csharplang#869

Closed

briansull mentioned this pull request Sep 12, 2017

Update CoreSetup to preview1-25719-04 (master) dotnet/cli#7606

Merged

JosephTremoulet mentioned this pull request Sep 13, 2017

Undo more JIT layout workarounds #13961

Merged

Gnbrkm41 mentioned this pull request Nov 26, 2018

break and continue inhancements dotnet/csharplang#2024

Closed

This was referenced Jan 31, 2020

Undo "goto return" works-around dotnet/runtime#8770

Closed

Undo "goto return" works-around dotnet/runtime#23252

Closed
