Improvements for null check folding. #1735

Merged
merged 1 commit into from
Jan 17, 2020

Member

@erozenfeld erozenfeld commented Jan 14, 2020

optFoldNullChecks attempts to remove GT_NULLCHECK nodes that are
post-dominated by indirections on the same variable. These changes
implement a number of improvements.

  1. Recognize more patterns.
    Before these changes only the following pattern was recognized:

    x = comma(nullcheck(y), add(y, const1))

    followed by

    indir(add(x, const2))

    where const1 + const2 is sufficiently small.

    With these changes the following patterns are recognized:

    nullcheck(x)
    or
    x = comma(nullcheck(y), add(y, const1))

    followed by

    indir(x)
    or
    indir(add(x, const2))

    where const1 + const2 is sufficiently small.

  2. Indirections now include GT_ARR_LENGTH nodes.

  3. Morph has an optimization
    ((x+icon1)+icon2) => (x+(icon1+icon2))
    These changes generalize it to handle commas:
    (comma(y, x+icon1)+icon2) => comma(y, x+(icon1+icon2))

    That exposes more trees to null check folding.

  4. Fix a bug in flow transformations that could lose the BBF_HAS_NULLCHECK flag
    on some basic blocks, which led to missed opportunities for null check folding.

  5. Make safety checks in optCanMoveNullCheckPastTree
    (for trees between the GT_NULLCHECK and the indirection) both more correct
    and less conservative. For example, we were not allowing any assignments
    if we were inside a try region; however, assignments to compiler temps are safe since
    they won't be visible in handlers.

  6. Increase the maximum number of trees we check between GT_NULLCHECK and
    the indirection from 25 to 50.

  7. Refactor the code and move pattern recognition and safety checks to
    helper methods.
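
The patterns listed in item 1 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the JIT's code: `Node`, `Oper`, `findNullCheckToFold`, the two maps, and the `kMaxFoldOffset` value are all hypothetical stand-ins, and redefinitions of locals between the null check and the indirection are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

enum class Oper { LclVar, IntCon, Add, Comma, NullCheck, Indir };

// Toy expression node; a stand-in for the JIT's GenTree.
struct Node
{
    Oper    oper   = Oper::LclVar;
    Node*   op1    = nullptr;
    Node*   op2    = nullptr;
    int     lclNum = -1; // meaningful for LclVar only
    int64_t value  = 0;  // meaningful for IntCon only
};

Node lclVar(int n) { Node r; r.lclNum = n; return r; }
Node intCon(int64_t v) { Node r; r.oper = Oper::IntCon; r.value = v; return r; }
Node makeOp(Oper o, Node* a, Node* b = nullptr) { Node r; r.oper = o; r.op1 = a; r.op2 = b; return r; }

// Placeholder for "sufficiently small"; the real limit is an assumption here.
constexpr int64_t kMaxFoldOffset = 4096;

bool isAddOfConst(const Node* n)
{
    return n->oper == Oper::Add && n->op2->oper == Oper::IntCon;
}

// Returns the null check that `indir` makes redundant, or nullptr.
//   nullChecks: local -> earlier standalone nullcheck(local)
//   defs:       local -> value assigned to that local earlier
Node* findNullCheckToFold(const Node* indir,
                          const std::map<int, Node*>& nullChecks,
                          const std::map<int, Node*>& defs)
{
    assert(indir->oper == Oper::Indir);
    const Node* addr   = indir->op1;
    int64_t     offset = 0;

    if (isAddOfConst(addr)) // indir(add(x, const2))
    {
        offset += addr->op2->value;
        addr = addr->op1;
    }
    if (addr->oper != Oper::LclVar) return nullptr;

    Node* nullCheck = nullptr;
    auto  nc        = nullChecks.find(addr->lclNum);
    if (nc != nullChecks.end())
    {
        nullCheck = nc->second; // nullcheck(x)
    }
    else
    {
        auto def = defs.find(addr->lclNum);
        if (def == defs.end()) return nullptr;
        const Node* comma = def->second; // x = comma(nullcheck(y), add(y, const1))
        if (comma->oper != Oper::Comma || comma->op1->oper != Oper::NullCheck) return nullptr;
        const Node* checked = comma->op1->op1;
        const Node* add     = comma->op2;
        if (!isAddOfConst(add) || add->op1->oper != Oper::LclVar ||
            checked->oper != Oper::LclVar || add->op1->lclNum != checked->lclNum)
        {
            return nullptr;
        }
        offset += add->op2->value;
        nullCheck = comma->op1;
    }
    // Decide once, on the combined offset const1 + const2.
    return (offset >= 0 && offset <= kMaxFoldOffset) ? nullCheck : nullptr;
}
```

In this sketch, `x = comma(nullcheck(y), add(y, 8))` followed by `indir(add(x, 16))` yields the null check as a folding candidate, while the same shape with a very large `const2` does not.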

This addresses all relevant examples from https://github.com/dotnet/coreclr/issues/23903 .
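
Items 5 and 6 together bound the search: every tree between the null check and the indirection has to pass a safety check, and the walk gives up past a fixed budget. A minimal sketch of that shape follows; `Tree`, `safeToCross`, and `canFoldAcross` are hypothetical names, and whether the real limit is inclusive or exclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy summary of one tree between the null check and the indirection.
struct Tree
{
    bool safeToCross = true; // outcome of a per-tree safety check (hypothetical)
};

// Returns true when the null check may be folded into a later indirection:
// the two must be at most `maxTrees` apart and every tree in between must be safe.
bool canFoldAcross(const std::vector<Tree>& between, std::size_t maxTrees)
{
    if (between.size() > maxTrees)
    {
        return false; // too far apart; give up rather than keep scanning
    }
    for (const Tree& tree : between)
    {
        if (!tree.safeToCross)
        {
            return false;
        }
    }
    return true;
}
```

With 30 safe trees in between, this sketch rejects the fold under a budget of 25 and accepts it under a budget of 50.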

@erozenfeld
Member Author

Framework diffs:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -6966 (-0.01% of base)
    diff is an improvement.
Top file regressions (bytes):
           6 : System.Net.Mail.dasm (0.00% of base)
Top file improvements (bytes):
       -1307 : System.Text.RegularExpressions.dasm (-0.50% of base)
       -1247 : System.Private.CoreLib.dasm (-0.03% of base)
        -462 : System.Text.Json.dasm (-0.11% of base)
        -365 : System.Private.Xml.dasm (-0.01% of base)
        -266 : Microsoft.CodeAnalysis.dasm (-0.02% of base)
        -260 : System.Reflection.Metadata.dasm (-0.06% of base)
        -200 : System.Reflection.MetadataLoadContext.dasm (-0.11% of base)
        -183 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.01% of base)
        -146 : System.Net.Quic.dasm (-0.27% of base)
        -139 : System.Security.Cryptography.Pkcs.dasm (-0.03% of base)
        -130 : Microsoft.CodeAnalysis.CSharp.dasm (-0.00% of base)
        -106 : System.Collections.dasm (-0.02% of base)
         -88 : System.Linq.dasm (-0.01% of base)
         -80 : System.Numerics.Tensors.dasm (-0.03% of base)
         -72 : System.Memory.dasm (-0.03% of base)
         -71 : System.Net.Security.dasm (-0.04% of base)
         -70 : System.Transactions.Local.dasm (-0.06% of base)
         -66 : System.IO.FileSystem.dasm (-0.06% of base)
         -66 : System.Security.Cryptography.Cng.dasm (-0.04% of base)
         -60 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.00% of base)
124 total files with Code Size differences (123 improved, 1 regressed), 41 unchanged.
Top method regressions (bytes):
          21 ( 0.92% of base) : System.Data.SqlClient.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Http.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.HttpListener.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Mail.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Security.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          12 ( 0.36% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseAttributeValueSlow(int,ushort,NodeData):this
           8 ( 3.45% of base) : System.Private.CoreLib.dasm - Object:MemberwiseClone():Object:this (2 methods)
           7 ( 1.02% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:HandleEntityReference(bool,int,byref):int:this
           5 ( 0.69% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseDoctypeDecl():bool:this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__2:MoveNext():this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__6:MoveNext():this
           1 ( 0.55% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEntityReference():this
Top method improvements (bytes):
       -1011 (-6.18% of base) : System.Text.RegularExpressions.dasm - RegexWriter:EmitFragment(int,RegexNode,int):this (3 methods)
        -262 (-5.38% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:get_Count():int:this (7 methods)
        -215 (-3.13% of base) : System.Private.CoreLib.dasm - Number:NumberToStringFormat(byref,byref,ReadOnlySpan`1,NumberFormatInfo)
        -121 (-3.27% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseXmlDeclaration(bool):bool:this
         -97 (-5.13% of base) : System.Private.CoreLib.dasm - ConcurrentQueueSegment`1:TryDequeue(byref):bool:this (7 methods)
         -91 (-5.62% of base) : System.Private.CoreLib.dasm - ConcurrentQueueSegment`1:TryPeek(byref,bool):bool:this (7 methods)
         -84 (-2.05% of base) : System.Text.RegularExpressions.dasm - RegexWriter:RegexCodeFromRegexTree(RegexTree):RegexCode:this (3 methods)
         -56 (-3.33% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:SnapForObservation(byref,byref,byref,byref):this (7 methods)
         -53 (-3.61% of base) : System.Net.Security.dasm - NegotiateStreamPal:Encrypt(SafeDeleteContext,ref,int,int,bool,bool,byref,int):int
         -51 (-2.68% of base) : System.Reflection.MetadataLoadContext.dasm - EcmaAssembly:ComputeAssemblyReferences():ref:this
         -48 (-0.38% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - RegisteredTraceEventParser:GetManifestForRegisteredProvider(Guid):String
         -42 (-3.75% of base) : System.Net.Quic.dasm - ResettableCompletionSource`1:System.Threading.Tasks.Sources.IValueTaskSource.GetResult(short):this (7 methods)
         -42 (-3.87% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:GetCount(ConcurrentQueueSegment`1,int,ConcurrentQueueSegment`1,int):long (7 methods)
         -39 (-4.64% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEndElement():this
         -36 (-0.38% of base) : System.Collections.dasm - Enumerator:System.Collections.IEnumerator.get_Current():Object:this (63 methods)
         -36 (-7.45% of base) : System.Text.RegularExpressions.dasm - RegexWriter:Emit(int,int):this (3 methods)
         -34 (-2.64% of base) : Microsoft.CodeAnalysis.dasm - MetadataReaderExtensions:GetReferencedAssembliesOrThrow(MetadataReader):ImmutableArray`1
         -29 (-1.83% of base) : System.Private.CoreLib.dasm - Guid:TryParseExactX(ReadOnlySpan`1,byref):bool
         -28 (-2.05% of base) : System.Private.CoreLib.dasm - MemberInfoCache`1:Insert(byref,String,int):this (2 methods)
         -28 (-0.62% of base) : System.Private.CoreLib.dasm - <Enumerate>d__26:MoveNext():bool:this (7 methods)
Top method regressions (percentages):
           8 ( 3.45% of base) : System.Private.CoreLib.dasm - Object:MemberwiseClone():Object:this (2 methods)
           7 ( 1.02% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:HandleEntityReference(bool,int,byref):int:this
          21 ( 0.92% of base) : System.Data.SqlClient.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Http.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.HttpListener.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Mail.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          21 ( 0.91% of base) : System.Net.Security.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
           5 ( 0.69% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseDoctypeDecl():bool:this
           1 ( 0.55% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEntityReference():this
          12 ( 0.36% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseAttributeValueSlow(int,ushort,NodeData):this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__2:MoveNext():this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__6:MoveNext():this
Top method improvements (percentages):
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CompleteAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CreateBlockingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CreateAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CompleteAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CreateBlockingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CreateAbortingClone(InternalTransaction):this
          -2 (-33.33% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:GetMethodTable(Object):long
          -6 (-25.00% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:OnNewLine(int):this
          -2 (-22.22% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:GetElementSize(Array):ushort
          -2 (-20.00% of base) : Microsoft.CodeAnalysis.dasm - ILBuilder:get_InstructionsEmitted():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceThreads:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceCallStacks:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceCodeAddresses:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceMethods:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceModuleFiles:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - CopyStackSource:get_SampleIndexLimit():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - MutableTraceEventStackSource:get_SampleIndexLimit():int:this
          -4 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceProcesses:get_Count():int:this (2 methods)
          -2 (-20.00% of base) : System.Console.dasm - ValueStringBuilder:get_Capacity():int:this
          -2 (-20.00% of base) : System.Data.SqlClient.dasm - ValueStringBuilder:get_Capacity():int:this
1462 total methods with Code Size differences (1450 improved, 12 regressed), 243018 unchanged.

@erozenfeld
Member Author

Benchmark diffs:

PMI CodeSize Diffs for benchstones and benchmarks game in f:\runtime\artifacts\tests\coreclr\Windows_NT.x64.Release for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -2 (-0.00% of base)
    diff is an improvement.
Top file improvements (bytes):
          -2 : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm (-0.06% of base)
1 total files with Code Size differences (1 improved, 0 regressed), 81 unchanged.
Top method improvements (bytes):
          -2 (-0.47% of base) : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm - SeekUnroll:Main(ref):int
Top method improvements (percentages):
          -2 (-0.47% of base) : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm - SeekUnroll:Main(ref):int
1 total methods with Code Size differences (1 improved, 0 regressed), 1892 unchanged.

@erozenfeld
Member Author

I measured throughput with pin on crossgen of System.Private.CoreLib. It shows a 0.02% regression, which is close to the noise level of the measurements.

@erozenfeld
Member Author

Framework regressions are caused by different CSE and/or register allocation decisions after the morph add optimization and/or null check removal. In some cases the number of instructions stays the same but the code size increases because, e.g., we generate lea rdx, bword ptr [rsi+488] instead of lea rdx, bword ptr [r13+88].
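
The size difference between those two `lea` forms comes from x64 displacement encoding: a displacement that fits in a signed 8-bit value takes one byte, anything larger takes four. A small sketch of that rule (the helper name `displacementBytes` is made up for illustration, and special cases are simplified as noted in the comment):

```cpp
#include <cassert>
#include <cstdint>

// Bytes used by the displacement field of an x64 memory operand [base + disp].
// Simplification: a zero displacement is treated as omitted, which does not
// hold for rbp/r13 bases (they always need at least an 8-bit displacement).
int displacementBytes(int32_t disp)
{
    if (disp == 0)
    {
        return 0;
    }
    return (disp >= INT8_MIN && disp <= INT8_MAX) ? 1 : 4;
}
```

Under that rule `[r13+88]` needs a 1-byte displacement and `[rsi+488]` a 4-byte one, so the instruction count is unchanged but the second form is three bytes longer.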

@erozenfeld
Member Author

@dotnet/jit-contrib PTAL

@sandreenko added the area-CodeGen-coreclr and optimization labels Jan 15, 2020
@erozenfeld
Member Author

erozenfeld commented Jan 15, 2020

Example of an improvement diff:

-cmp      dword ptr [rsi], esi
-lea      rdx, bword ptr [rsi+216]
-add      rdx, 16
+lea      rdx, bword ptr [rsi+232]
mov      ebx, dword ptr [rdx]
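
The merged `lea` in this diff is consistent with the generalized morph rule from the description, `(comma(y, x+icon1)+icon2) => comma(y, x+(icon1+icon2))`, with 216 and 16 combining into 232. Here is a minimal sketch of that rewrite on a toy tree; `Node`, `Oper`, and `foldAddOfAdd` are hypothetical names, only a single comma is handled, and constant overflow is not modeled.

```cpp
#include <cassert>
#include <cstdint>

enum class Oper { Leaf, IntCon, Add, Comma };

// Toy expression node; a stand-in for the JIT's GenTree.
struct Node
{
    Oper    oper  = Oper::Leaf;
    Node*   op1   = nullptr;
    Node*   op2   = nullptr;
    int64_t value = 0; // meaningful for IntCon only
};

// Rewrites in place and returns the new root:
//   add(add(x, c1), c2)           => add(x, c1+c2)
//   add(comma(y, add(x, c1)), c2) => comma(y, add(x, c1+c2))
// Any other shape is returned unchanged.
Node* foldAddOfAdd(Node* tree)
{
    if (tree->oper != Oper::Add || tree->op2->oper != Oper::IntCon)
    {
        return tree;
    }
    Node* comma = nullptr;
    Node* inner = tree->op1;
    if (inner->oper == Oper::Comma)
    {
        comma = inner;      // remember the comma so it becomes the new root
        inner = comma->op2; // the value the comma produces
    }
    if (inner->oper != Oper::Add || inner->op2->oper != Oper::IntCon)
    {
        return tree;
    }
    inner->op2->value += tree->op2->value; // icon1 + icon2
    return (comma != nullptr) ? comma : inner;
}
```

Hoisting the comma to the root is what leaves a plain `x + const` under it, which is the shape the null check folding patterns look for.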

Member

@AndyAyersMS AndyAyersMS left a comment

Looks good overall.

There are a few places (box opts is one, there may be others) where we emit nullchecks as unconsumed indirs. I wonder if it is worth looking for these -- or trying once more to fix up box opts to use GT_NULLCHECK.

@@ -465,7 +465,7 @@ struct BasicBlock : private LIR::Range

#define BBF_COMPACT_UPD \
(BBF_CHANGED | BBF_GC_SAFE_POINT | BBF_HAS_JMP | BBF_NEEDS_GCPOLL | BBF_HAS_IDX_LEN | BBF_BACKWARD_JUMP | \
-BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ)
+BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ | BBF_HAS_NULLCHECK)
Member

Curious how you found these missing propagation bits... did you experiment with temporarily making optEarlyPropFor... always return true? (if not, may be worth a try).

Member Author

No, I just saw some null checks not getting removed in some diffs and tracked that down to the missing flags. Good idea to try with optDoEarlyPropFor... always returning true. Will try tomorrow.

Member Author

@erozenfeld erozenfeld Jan 16, 2020

I added missing OMF_HAS_NULLCHECK and BBF_HAS_NULLCHECK in a couple of places but they didn't result in any new diffs. Changing optDoEarlyPropFor... to always return true does result in diffs but they don't seem to be related to nullchecks. I will follow up outside of this PR.

Member

Apparently it was me who was forgetting to add these....
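
To make the flag issue in this thread concrete: when two blocks are compacted, only the flags in the update mask carry over to the surviving block, so a flag missing from the mask is silently dropped. A minimal sketch, assuming the merge is a masked OR; the flag values, `Block`, `compact`, and `mayFoldNullChecks` are hypothetical and do not reflect the real BBF_* constants or JIT functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real BBF_* constants differ.
constexpr uint32_t BBF_HAS_NEWOBJ    = 0x1;
constexpr uint32_t BBF_HAS_NULLCHECK = 0x2;

// Update masks before and after the change (other flags omitted for brevity).
constexpr uint32_t COMPACT_UPD_OLD = BBF_HAS_NEWOBJ;
constexpr uint32_t COMPACT_UPD_NEW = BBF_HAS_NEWOBJ | BBF_HAS_NULLCHECK;

struct Block
{
    uint32_t flags = 0;
};

// Merge `next` into `block`, keeping only the flags selected by `updMask`.
void compact(Block& block, const Block& next, uint32_t updMask)
{
    block.flags |= (next.flags & updMask);
}

// A pass that only visits blocks carrying the flag skips everything else.
bool mayFoldNullChecks(const Block& block)
{
    return (block.flags & BBF_HAS_NULLCHECK) != 0;
}
```

With the old mask, a block that absorbed a null check from its successor would not be visited; with the new mask it is.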


GenTree* Compiler::optFindNullCheckToFold(GenTree* tree, LocalNumberToNullCheckTreeMap* nullCheckMap)
{
assert(tree->OperIsIndirOrArrLength());
Member

Wonder if this would read better if it was converted to early return style.

Member Author

Will change tomorrow.

Member Author

Done.

{
// Check if we have the pattern above and find the nullcheck node if we do.
offsetValue += addr->gtGetOp2()->AsIntConCommon()->IconValue();
addr = addr->gtGetOp1();
Member

Maybe early exit here for large offsets?

Member Author

Will change tomorrow.

Member Author

I decided not to do that: large offsets are rare and it's better to check at the end when we have the full offset.

//
// Return Value:
-// True if GT_NULLCHECK can be folded into a node that is after tree is execution order,
+// True if nullcheck may be folded into a node that is after tree is execution order,
Member

typo "tree is execution"

Member Author

Fixed.

// Assignments to user locals are disallowed because they may be live in the handler.
result = (lhs->OperGet() == GT_LCL_VAR) && lvaTable[lhs->AsLclVarCommon()->GetLclNum()].lvIsTemp &&
((tree->gtGetOp2()->gtFlags & GTF_ASG) == 0);
}
Member

Don't we know which locals are live into handlers by this point?

Member Author

Good point. I pushed a change to optCanMoveNullCheckPastTree that checks for lvLiveInOutOfHndlr and also fixes some issues with checking assignments. With this change I get additional improvements in frameworks with no new regressions:

Total bytes of diff: -7560 (-0.02% of base)
1538 total methods with Code Size differences (1526 improved, 12 regressed), 242942 unchanged.

and in benchmarks:

Total bytes of diff: -14 (-0.00% of base)
5 total methods with Code Size differences (5 improved, 0 regressed), 1888 unchanged.

Member Author

Since lvLiveInOutOfHndlr is debug-only, I switched to checking lvVolatileHint, which, despite the obscure name, checks exactly what's needed: vars live in handlers.
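
The rule this thread converges on for code inside a try region can be sketched as follows. Folding the null check into a later indirection means the exception is raised after the assignment has run, so the assignment is only harmless if no handler can observe it. All names here (`LocalInfo`, `liveInHandler`, `Assignment`, `isAssignmentSafeInsideTry`) are hypothetical; `liveInHandler` stands in for what the thread says lvVolatileHint tracks, and the rule for code outside a try region is not modeled.

```cpp
#include <cassert>
#include <vector>

struct LocalInfo
{
    bool liveInHandler = false; // stand-in for "var is live in a handler"
};

struct Assignment
{
    bool     targetIsLocal         = false; // left-hand side is a plain local variable
    unsigned lclNum                = 0;     // index into the locals table when targetIsLocal
    bool     rhsContainsAssignment = false; // right-hand side has a nested assignment
};

// Inside a try region, allow moving a null check past an assignment only when
// it writes a local that no handler can see and hides no further assignments.
bool isAssignmentSafeInsideTry(const Assignment& asg, const std::vector<LocalInfo>& locals)
{
    if (!asg.targetIsLocal || asg.rhsContainsAssignment)
    {
        return false;
    }
    assert(asg.lclNum < locals.size());
    return !locals[asg.lclNum].liveInHandler;
}
```

Compared with the earlier temp-only check, keying on handler liveness also admits user locals that are never read in a handler.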

@VSadov
Member

VSadov commented Jan 16, 2020

The following generates a redundant null check
(this is related to #1068):

private static object? JIT_ChkCastClassSpecial(void* toTypeHnd, object obj)
        {
            MethodTable* mt = RuntimeHelpers.GetMethodTable(obj);

Just curious: will this change help with that? (I think it might, but I'm not completely sure.)

@erozenfeld
Member Author

@VSadov can you paste the disassembly with the redundant null check you see in your example?

@VSadov
Member

VSadov commented Jan 16, 2020

--- c:\TypeSystem\runtime\src\coreclr\src\System.Private.CoreLib\src\System\Runtime\CompilerServices\CastHelpers.cs 
            MethodTable* mt = RuntimeHelpers.GetMethodTable(obj);
00007FFF3036D660  cmp         dword ptr [rdx],edx  
00007FFF3036D662  mov         rax,qword ptr [rdx]  
            Debug.Assert(mt != toTypeHnd, "The check for the trivial cases should be inlined by the JIT");

            for (; ; )
            {
                mt = mt->BaseMethodTable;
00007FFF3036D665  mov         rax,qword ptr [rax+10h]  
                if (mt == toTypeHnd)
00007FFF3036D669  cmp         rax,rcx  
00007FFF3036D66C  je          00007FFF3036D6A2  
                    goto done;

@erozenfeld
Member Author

Yes, I verified that the nullcheck in RuntimeHelpers:GetMethodTable(Object) is eliminated with these changes so it should be eliminated in the caller that inlines it as well.

@erozenfeld
Member Author

> Looks good overall.
>
> There are a few places (box opts is one, there may be others) where we emit nullchecks as unconsumed indirs. I wonder if it is worth looking for these -- or trying once more to fix up box opts to use GT_NULLCHECK.

I tried changing indirs to nullchecks in box opts and got some diffs that include regressions. There is some special logic that applies to indirs but doesn't apply to nullchecks. I will follow up outside of this PR.

@erozenfeld
Member Author

@AndyAyersMS I pushed several commits that address your feedback. PTAL.

Current framework diff numbers:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -7560 (-0.02% of base)
    diff is an improvement.
Top file regressions (bytes):
           1 : System.Net.Mail.dasm (0.00% of base)
Top file improvements (bytes):
       -1313 : System.Text.RegularExpressions.dasm (-0.50% of base)
       -1280 : System.Private.CoreLib.dasm (-0.03% of base)
        -464 : System.Text.Json.dasm (-0.11% of base)
        -369 : System.Private.Xml.dasm (-0.01% of base)
        -326 : System.Reflection.Metadata.dasm (-0.08% of base)
        -320 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.01% of base)
        -270 : Microsoft.CodeAnalysis.dasm (-0.02% of base)
        -204 : System.Reflection.MetadataLoadContext.dasm (-0.11% of base)
        -199 : System.Threading.Tasks.Dataflow.dasm (-0.02% of base)
        -146 : System.Net.Quic.dasm (-0.27% of base)
        -144 : System.Security.Cryptography.Pkcs.dasm (-0.04% of base)
        -141 : Microsoft.CodeAnalysis.CSharp.dasm (-0.00% of base)
        -106 : System.Collections.dasm (-0.02% of base)
         -88 : System.Linq.dasm (-0.01% of base)
         -81 : System.Linq.Parallel.dasm (-0.00% of base)
         -80 : System.Numerics.Tensors.dasm (-0.03% of base)
         -74 : System.Net.Security.dasm (-0.05% of base)
         -72 : System.Memory.dasm (-0.03% of base)
         -72 : System.Transactions.Local.dasm (-0.06% of base)
         -66 : System.Security.Cryptography.Cng.dasm (-0.04% of base)
125 total files with Code Size differences (124 improved, 1 regressed), 40 unchanged.
Top method regressions (bytes):
          18 ( 0.79% of base) : System.Data.SqlClient.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Http.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.HttpListener.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Mail.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Security.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          12 ( 0.36% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseAttributeValueSlow(int,ushort,NodeData):this
           8 ( 3.45% of base) : System.Private.CoreLib.dasm - Object:MemberwiseClone():Object:this (2 methods)
           7 ( 1.02% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:HandleEntityReference(bool,int,byref):int:this
           5 ( 0.69% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseDoctypeDecl():bool:this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__2:MoveNext():this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__6:MoveNext():this
           1 ( 0.55% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEntityReference():this
Top method improvements (bytes):
       -1011 (-6.18% of base) : System.Text.RegularExpressions.dasm - RegexWriter:EmitFragment(int,RegexNode,int):this (3 methods)
        -262 (-5.38% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:get_Count():int:this (7 methods)
        -215 (-3.13% of base) : System.Private.CoreLib.dasm - Number:NumberToStringFormat(byref,byref,ReadOnlySpan`1,NumberFormatInfo)
        -121 (-3.27% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseXmlDeclaration(bool):bool:this
         -97 (-5.13% of base) : System.Private.CoreLib.dasm - ConcurrentQueueSegment`1:TryDequeue(byref):bool:this (7 methods)
         -91 (-5.62% of base) : System.Private.CoreLib.dasm - ConcurrentQueueSegment`1:TryPeek(byref,bool):bool:this (7 methods)
         -84 (-2.05% of base) : System.Text.RegularExpressions.dasm - RegexWriter:RegexCodeFromRegexTree(RegexTree):RegexCode:this (3 methods)
         -83 (-2.96% of base) : System.Threading.Tasks.Dataflow.dasm - JoinBlockTarget`1:ConsumeReservedMessage():bool:this (7 methods)
         -56 (-3.33% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:SnapForObservation(byref,byref,byref,byref):this (7 methods)
         -53 (-3.61% of base) : System.Net.Security.dasm - NegotiateStreamPal:Encrypt(SafeDeleteContext,ref,int,int,bool,bool,byref,int):int
         -51 (-2.68% of base) : System.Reflection.MetadataLoadContext.dasm - EcmaAssembly:ComputeAssemblyReferences():ref:this
         -48 (-0.38% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - RegisteredTraceEventParser:GetManifestForRegisteredProvider(Guid):String
         -42 (-3.75% of base) : System.Net.Quic.dasm - ResettableCompletionSource`1:System.Threading.Tasks.Sources.IValueTaskSource.GetResult(short):this (7 methods)
         -42 (-3.87% of base) : System.Private.CoreLib.dasm - ConcurrentQueue`1:GetCount(ConcurrentQueueSegment`1,int,ConcurrentQueueSegment`1,int):long (7 methods)
         -42 (-0.93% of base) : System.Private.CoreLib.dasm - <Enumerate>d__26:MoveNext():bool:this (7 methods)
         -39 (-4.64% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEndElement():this
         -36 (-0.38% of base) : System.Collections.dasm - Enumerator:System.Collections.IEnumerator.get_Current():Object:this (63 methods)
         -36 (-7.45% of base) : System.Text.RegularExpressions.dasm - RegexWriter:Emit(int,int):this (3 methods)
         -34 (-2.64% of base) : Microsoft.CodeAnalysis.dasm - MetadataReaderExtensions:GetReferencedAssembliesOrThrow(MetadataReader):ImmutableArray`1
         -29 (-1.83% of base) : System.Private.CoreLib.dasm - Guid:TryParseExactX(ReadOnlySpan`1,byref):bool
Top method regressions (percentages):
           8 ( 3.45% of base) : System.Private.CoreLib.dasm - Object:MemberwiseClone():Object:this (2 methods)
           7 ( 1.02% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:HandleEntityReference(bool,int,byref):int:this
          18 ( 0.79% of base) : System.Data.SqlClient.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Http.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.HttpListener.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Mail.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
          18 ( 0.78% of base) : System.Net.Security.dasm - SafeDeleteContext:AcceptSecurityContext(byref,byref,int,int,ReadOnlySpan`1,byref,byref):int
           5 ( 0.69% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseDoctypeDecl():bool:this
           1 ( 0.55% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseEntityReference():this
          12 ( 0.36% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ParseAttributeValueSlow(int,ushort,NodeData):this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__2:MoveNext():this
           3 ( 0.21% of base) : System.Net.Http.dasm - <ReadAsync>d__6:MoveNext():this
Top method improvements (percentages):
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CompleteAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CreateBlockingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - EnlistableStates:CreateAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CompleteAbortingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CreateBlockingClone(InternalTransaction):this
          -6 (-37.50% of base) : System.Transactions.Local.dasm - TransactionStatePromotedNonMSDTCBase:CreateAbortingClone(InternalTransaction):this
          -2 (-33.33% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:GetMethodTable(Object):long
          -6 (-25.00% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:OnNewLine(int):this
          -2 (-22.22% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:GetElementSize(Array):ushort
          -2 (-20.00% of base) : Microsoft.CodeAnalysis.dasm - ILBuilder:get_InstructionsEmitted():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceThreads:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceCallStacks:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceCodeAddresses:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceMethods:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceModuleFiles:get_Count():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - CopyStackSource:get_SampleIndexLimit():int:this
          -2 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - MutableTraceEventStackSource:get_SampleIndexLimit():int:this
          -4 (-20.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceProcesses:get_Count():int:this (2 methods)
          -2 (-20.00% of base) : System.Console.dasm - ValueStringBuilder:get_Capacity():int:this
          -2 (-20.00% of base) : System.Data.SqlClient.dasm - ValueStringBuilder:get_Capacity():int:this
1538 total methods with Code Size differences (1526 improved, 12 regressed), 242942 unchanged.

Benchmark diffs:

PMI CodeSize Diffs for benchstones and benchmarks game in f:\runtime\artifacts\tests\coreclr\Windows_NT.x64.Release for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -14 (-0.00% of base)
    diff is an improvement.
Top file improvements (bytes):
          -4 : BenchmarksGame\binarytrees\binarytrees-2\binarytrees-2.dasm (-0.33% of base)
          -4 : BenchmarksGame\binarytrees\binarytrees-5\binarytrees-5.dasm (-0.14% of base)
          -4 : Benchstones\BenchI\NDhrystone\NDhrystone\NDhrystone.dasm (-0.12% of base)
          -2 : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm (-0.06% of base)
4 total files with Code Size differences (4 improved, 0 regressed), 78 unchanged.
Top method improvements (bytes):
          -4 (-2.67% of base) : BenchmarksGame\binarytrees\binarytrees-2\binarytrees-2.dasm - TreeNode:itemCheck():int:this
          -4 (-2.67% of base) : BenchmarksGame\binarytrees\binarytrees-5\binarytrees-5.dasm - TreeNode:CountNodes():int:this
          -2 (-0.47% of base) : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm - SeekUnroll:Main(ref):int
          -2 (-0.78% of base) : Benchstones\BenchI\NDhrystone\NDhrystone\NDhrystone.dasm - NDhrystone:Proc1(byref)
          -2 (-1.69% of base) : Benchstones\BenchI\NDhrystone\NDhrystone\NDhrystone.dasm - NDhrystone:Proc3(byref)
Top method improvements (percentages):
          -4 (-2.67% of base) : BenchmarksGame\binarytrees\binarytrees-2\binarytrees-2.dasm - TreeNode:itemCheck():int:this
          -4 (-2.67% of base) : BenchmarksGame\binarytrees\binarytrees-5\binarytrees-5.dasm - TreeNode:CountNodes():int:this
          -2 (-1.69% of base) : Benchstones\BenchI\NDhrystone\NDhrystone\NDhrystone.dasm - NDhrystone:Proc3(byref)
          -2 (-0.78% of base) : Benchstones\BenchI\NDhrystone\NDhrystone\NDhrystone.dasm - NDhrystone:Proc1(byref)
          -2 (-0.47% of base) : SIMD\SeekUnroll\SeekUnroll\SeekUnroll.dasm - SeekUnroll:Main(ref):int
5 total methods with Code Size differences (5 improved, 0 regressed), 1888 unchanged.

@AndyAyersMS AndyAyersMS left a comment

Thanks for the updates. This looks good.

@@ -465,7 +465,7 @@ struct BasicBlock : private LIR::Range

 #define BBF_COMPACT_UPD \
 (BBF_CHANGED | BBF_GC_SAFE_POINT | BBF_HAS_JMP | BBF_NEEDS_GCPOLL | BBF_HAS_IDX_LEN | BBF_BACKWARD_JUMP | \
-BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ)
+BBF_HAS_NEWARRAY | BBF_HAS_NEWOBJ | BBF_HAS_NULLCHECK)

Apparently it was me who was forgetting to add these....
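
To illustrate why the missing mask entry mattered, here is a toy model, not the JIT's actual code. It assumes that when two blocks are merged, only the flags named in a mask are carried over to the surviving block, so a flag absent from the mask is silently dropped. The names `ToyBlock`, `compactBlocks`, `mayContainNullCheck`, and the `FLAG_*` values are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for illustration; the real JIT defines many more.
enum BlockFlags : uint32_t
{
    FLAG_HAS_NEWOBJ    = 1u << 0,
    FLAG_HAS_NULLCHECK = 1u << 1,
};

// Toy stand-in for a basic block: only the flag word is modeled.
struct ToyBlock
{
    uint32_t flags;
};

// Merge `next` into `block`, carrying over only the flags listed in `mask`.
// Any flag set on `next` but missing from `mask` is lost by the merge.
void compactBlocks(ToyBlock& block, const ToyBlock& next, uint32_t mask)
{
    assert(&block != &next); // a block is never merged with itself
    block.flags |= (next.flags & mask);
}

// A later pass might use the flag to skip blocks that cannot contain a null check.
bool mayContainNullCheck(const ToyBlock& block)
{
    return (block.flags & FLAG_HAS_NULLCHECK) != 0;
}
```

In this model, merging with a mask that omits `FLAG_HAS_NULLCHECK` leaves the surviving block looking as if it has no null checks, so a pass that filters on the flag would never examine it. Adding the flag to the mask keeps the information across the merge.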

@erozenfeld erozenfeld force-pushed the NullCheckFolding branch 2 times, most recently from d7cf712 to 3f5b5f7 Compare January 17, 2020 02:12
@erozenfeld

#1741 added another place in importer where we create a GT_NULLCHECK node. I pushed a change to set OMF_HAS_NULLCHECK and BBF_HAS_NULLCHECK there. No new diffs.

optFoldNullChecks attempts to remove GT_NULLCHECK nodes that are
post-dominated by indirections on the same variable. These changes
implement a number of improvements.

1. Recognize more patterns.
Before these changes only the following pattern was recognized:
x = comma(nullcheck(y), add(y, const1))

followed by

indir(add(x, const2))

where const1 + const2 is sufficiently small.

With these changes the following patterns are recognized:

nullcheck(x)
or
x = comma(nullcheck(y), add(y, const1))

followed by

indir(x)
or
indir(add(x, const2))

where const1 + const2 is sufficiently small.

2. Indirections now include GT_ARR_LENGTH nodes.

3. Morph has an optimization
((x+icon1)+icon2) => (x+(icon1+icon2))
These changes generalize it to handle commas:
(comma(y, x+icon1)+icon2) => comma(y, x+(icon1+icon2))

That exposes more trees to null check folding.

4. Fix a bug in flow transformations that could lose BBF_HAS_NULLCHECK flag
on some basic blocks, which led to missing opportunities for null check folding.

5. Make safety checks in optCanMoveNullCheckPastTree
(for trees between the nullcheck and the indirection) both more correct
and less conservative. For example, we were not allowing any assignments
if we were inside try; however, assignments to compiler temps are safe since
they won't be visible in handlers.

6. Increase the maximum number of trees we check between GT_NULLCHECK and
the indirection from 25 to 50.

7. Refactor the code and move pattern recognition and safety checks to
helper methods.

8. Add missing BBF_HAS_NULLCHECK and OMF_HAS_NULLCHECK when we create GT_NULLCHECK nodes.
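
The patterns in item 1 can be sketched on a toy tree model. This is a simplified illustration under stated assumptions, not the code in `optFoldNullChecks`: the `Node` type, the `make` helper, the function names, and the `maxOffset` parameter (standing in for "sufficiently small") are all hypothetical, and the safety checks on intervening trees from item 5 are omitted.

```cpp
#include <cassert>
#include <memory>
#include <utility>

enum class Kind { Local, Const, Add, Comma, NullCheck, Indir };

// Toy tree node: `value` is the local number for Local, the constant for Const.
struct Node
{
    Kind                  kind;
    int                   value;
    std::shared_ptr<Node> op1;
    std::shared_ptr<Node> op2;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(Kind k, int v = 0, NodePtr a = nullptr, NodePtr b = nullptr)
{
    return std::make_shared<Node>(Node{k, v, std::move(a), std::move(b)});
}

// Matches `lcl` (offset 0) or `add(lcl, const)`; reports the local and offset.
bool matchLocalPlusOffset(const NodePtr& addr, int& lclNum, long& offset)
{
    assert(addr != nullptr);
    if (addr->kind == Kind::Local)
    {
        lclNum = addr->value;
        offset = 0;
        return true;
    }
    if (addr->kind == Kind::Add && addr->op1->kind == Kind::Local && addr->op2->kind == Kind::Const)
    {
        lclNum = addr->op1->value;
        offset = addr->op2->value;
        return true;
    }
    return false;
}

// Pattern: nullcheck(x) followed by indir(x) or indir(add(x, const2)).
bool foldsStandalone(const NodePtr& check, const NodePtr& indir, long maxOffset)
{
    if (check->kind != Kind::NullCheck || check->op1->kind != Kind::Local || indir->kind != Kind::Indir)
        return false;
    int  base;
    long const2;
    return matchLocalPlusOffset(indir->op1, base, const2) && base == check->op1->value &&
           const2 >= 0 && const2 <= maxOffset;
}

// Pattern: x = comma(nullcheck(y), add(y, const1)) followed by
// indir(x) or indir(add(x, const2)), with const1 + const2 small enough.
bool foldsThroughComma(int xLcl, const NodePtr& value, const NodePtr& indir, long maxOffset)
{
    if (value->kind != Kind::Comma || value->op1->kind != Kind::NullCheck ||
        value->op1->op1->kind != Kind::Local || indir->kind != Kind::Indir)
        return false;
    int  y, base;
    long const1, const2;
    if (!matchLocalPlusOffset(value->op2, y, const1) || y != value->op1->op1->value)
        return false; // the comma must compute an address from the checked local
    if (!matchLocalPlusOffset(indir->op1, base, const2) || base != xLcl)
        return false; // the indirection must go through x
    long total = const1 + const2;
    return total >= 0 && total <= maxOffset;
}
```

The offset bound is what makes the fold sound in this model: if the combined offset is small, a null base still faults on the indirection itself, so the explicit check adds nothing. With a large offset the access could land on a mapped page, and the check has to stay.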
nullCheckTree->gtFlags |= GTF_ORDER_SIDEEFF;
nullCheckTree->gtFlags |= GTF_IND_NONFAULTING;

if (nullCheckParent != nullptr)

I pushed a change that added this if to deal with cases when the GT_NULLCHECK is the root node of the statement.

@erozenfeld

The only failing tests are eventpipe tests that are failing in other PRs as well. #1794 is attempting to fix them. The crossgen-comparison arm job was cancelled after an hour of waiting.

@erozenfeld erozenfeld merged commit c44526d into dotnet:master Jan 17, 2020

VSadov commented Jan 17, 2020

After this change I see the following:

--- c:\TypeSystem\runtime\src\coreclr\src\System.Private.CoreLib\src\System\Runtime\CompilerServices\CastHelpers.cs 
            MethodTable* mt = RuntimeHelpers.GetMethodTable(obj);
00007FFE6EDC72E0  mov         rax,qword ptr [rdx]  
            Debug.Assert(mt != toTypeHnd, "The check for the trivial cases should be inlined by the JIT");

            for (; ; )
            {
                mt = mt->BaseMethodTable;
00007FFE6EDC72E3  mov         rax,qword ptr [rax+10h]  
                if (mt == toTypeHnd)
00007FFE6EDC72E7  cmp         rax,rcx  
00007FFE6EDC72EA  je          00007FFE6EDC7320  
                    goto done;

The redundant null check is gone. Thanks!!

sdmaclea added a commit to sdmaclea/runtime that referenced this pull request Jan 29, 2020
This file was added accidentally as part of dotnet#1735
@sdmaclea sdmaclea mentioned this pull request Jan 29, 2020
sdmaclea added a commit that referenced this pull request Jan 29, 2020
This file was added accidentally as part of #1735
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization