This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fixes for tracking struct field sequences #23932

Merged: 3 commits merged into dotnet:master on Apr 23, 2019

@briansull briansull commented Apr 12, 2019

Fixes for Zero Offset field sequence tracking

  • A GT_LCL_VAR may have a zeroOffset field
  • Add an assert to prevent building field sequences with duplicates
  • Fix fgMorphField when we have a zero offset field

Improve fgAddFieldSeqForZeroOffset

  • Add JitDump info
  • Handle GT_LCL_FLD

@briansull
Author

@dotnet-build-bot Test Ubuntu arm Cross Checked crossgen_comparison Build and Test
@dotnet-build-bot Test Ubuntu arm Cross Release crossgen_comparison Build and Test
@dotnet-build-bot Test Windows_NT arm Cross Checked Innerloop Build and Test
@dotnet-build-bot Test Windows_NT arm64 Cross Checked Innerloop Build and Test

@briansull
Author

/azp run coreclr-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@briansull
Author

@dotnet-build-bot Test Windows_NT arm64 Cross Checked Innerloop Build and Test

@briansull
Author

@dotnet/jit-contrib PTAL

@briansull
Author

@CarolEidt

@briansull - can you characterize the kind of code that led to this issue? It seems counter-intuitive that a GT_LCL_VAR would have an associated offset fieldSeq, even a zero-offset one.

Member

@AndyAyersMS AndyAyersMS left a comment

Would appreciate a bit more background here.

  • Was this a regression?
  • If so, do we know what caused it?
  • Does it cause bad codegen or just asserts?
  • Does this cause diffs on Core?
  • If no diff, can we get test cases or describe the situations where we get asserts?

@briansull
Author

briansull commented Apr 16, 2019

This sequence requires a zero-field annotation on a GT_LCL_VAR during morph before it can reach the final GT_LCL_FLD with a two-field sequence:

fgMorphTree BB14, stmt 32 (before)
               [001076] ----G-------              /--*  FIELD     long   m_constArray
               [001072] ----G-------              |  \--*  ADDR      byref 
               [001073] ----G-------              |     \--*  FIELD     struct blob
               [001074] ------------              |        \--*  ADDR      byref 
               [001075] ------------              |           \--*  LCL_VAR   struct V18 loc11        
               [000152] -AC---------              *  ASG       long  
               [000151] D------N----              \--*  LCL_VAR   long  (AX) V24 loc17        


fgMorphTree BB14, stmt 32 (after)
               [001075] -----+------              /--*  LCL_FLD   long   V18 loc11        [+8] Fseq[blob, m_constArray]
               [000152] -A--G+------              *  ASG       long  
               [000151] D---G+-N----              \--*  LCL_VAR   long  (AX) V24 loc17        
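
For readers following the dump: the [+8] on the final LCL_FLD is the sum of the two field offsets in Fseq[blob, m_constArray], with blob contributing zero. Below is a minimal C++ sketch of that arithmetic. The struct layouts are assumptions chosen only to reproduce the offsets in the dump (the real managed layouts are not shown in this thread), and ConstArraySketch, LocalSketch, m_length, and m_other are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layouts, chosen only so the offsets match the dump above.
struct ConstArraySketch
{
    std::int64_t m_length;     // assumed: occupies bytes 0..7
    std::int64_t m_constArray; // assumed: lands at offset 8
};

struct LocalSketch
{
    ConstArraySketch blob;    // first field, so its offset is 0
    std::int64_t     m_other; // made-up trailing field
};

// Offset of each field within its immediate parent.
constexpr std::size_t kBlobOffset       = offsetof(LocalSketch, blob);
constexpr std::size_t kConstArrayOffset = offsetof(ConstArraySketch, m_constArray);

// Offset that a LCL_FLD-style access to blob.m_constArray would carry.
constexpr std::size_t combinedOffset()
{
    return kBlobOffset + kConstArrayOffset;
}

static_assert(kBlobOffset == 0, "blob is a zero-offset field");
```

Because blob adds nothing to the offset, there is no constant node on which to hang its field handle, which is why the annotation has to live somewhere else until the outer field is folded in.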

@CarolEidt

> This sequence requires a zero-field annotation on a GT_LCL_VAR during morph

Where is the zero-field annotation (on which node)? And what IL sequence (or prior transform) causes it?

@briansull
Author

@AndyAyersMS
With ValueNumbering, I believe that there are potential correctness issues when we drop a fieldSequence annotation. I have fixed several issues in this area for 3.0 (one example is #21512). These are correctness issues because we are tracking memory stores into struct fields. I have improved the JitDumps for this area (see #23876) so that we can now always see when a node has a zero-field annotation. With this change I also added an important new assert that prevents us from incorrectly adding a duplicate field sequence (like struct.Fld1.Fld1). There were two places in the current code that were doing that.
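
As an illustration of the duplicate check described above, here is a minimal C++ sketch. It is not the JIT's field sequence store: FieldSeqSketch, wouldDuplicate, and appendFieldSeq are hypothetical names, and field handles are modelled as plain strings. The assumption is that a duplicate shows up as the last field of the left sequence being repeated as the first field of the right one, as in struct.Fld1.Fld1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a field sequence is an ordered list of field names.
using FieldSeqSketch = std::vector<std::string>;

// True when appending b to a would repeat a field back to back (x.Fld1.Fld1).
bool wouldDuplicate(const FieldSeqSketch& a, const FieldSeqSketch& b)
{
    return !a.empty() && !b.empty() && a.back() == b.front();
}

// Concatenate two sequences, asserting that no duplicate is introduced.
FieldSeqSketch appendFieldSeq(const FieldSeqSketch& a, const FieldSeqSketch& b)
{
    assert(!wouldDuplicate(a, b) && "should never add a duplicate field");
    FieldSeqSketch result = a;
    result.insert(result.end(), b.begin(), b.end());
    return result;
}
```

Appending [m_constArray] to [blob] passes the check and yields the two-field sequence from the dump; appending [Fld1] to [Fld1] trips the assert in a debug build.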

Since I know that there are potential issues, I want to fix them for 3.0. I also saw this in my work on desktop bug 837706, in the PMI run with JitStress=2 and JitStressRegs, where we were hitting an assert.

I have posted the diffs that I got on Core (see comment #3 above). Almost all were improvements.

@briansull
Author

briansull commented Apr 16, 2019

@CarolEidt
We first morph the subtree:
(the field blob is at a zero offset, so we don't create an add node)

               [001073] ----G-------              |     \--*  FIELD     struct blob
               [001074] ------------              |        \--*  ADDR      byref 
               [001075] ------------              |           \--*  LCL_VAR   struct V18 loc11        

into
               [001075] ------------              |     \--*  LCL_VAR   struct V18 loc11        Fseq[blob]

@briansull
Author

briansull commented Apr 17, 2019

The tree was produced from this IL during inlining:

Expanding INLINE_CANDIDATE in statement [000149] in BB14:
               [000149] ------------              *  STMT      void  (IL 0x0B1...0x0BD)
               [000148] I-C-G-------              \--*  CALL      long   System.Reflection.ConstArray.get_Signature (exactContextHnd=0x00007FF7FF58BDA9)
               [000147] ------------ this in rcx     \--*  ADDR      byref 
               [000146] ------------                    \--*  FIELD     struct blob
               [000145] ------------                       \--*  ADDR      byref 
               [000144] ------------                          \--*  LCL_VAR   struct V18 loc11  

*************** In impImport() for System.Reflection.ConstArray:get_Signature():long:this

impImportBlockPending for BB70

Importing BB70 (PC=000) of 'System.Reflection.ConstArray:get_Signature():long:this'
    [ 0]   0 (0x000) ldarg.0
    [ 1]   1 (0x001) ldfld 04001C85
    [ 1]   6 (0x006) ret

    Inlinee Return expression (before normalization)  =>
               [001076] ----G-------              *  FIELD     long   m_constArray
               [001072] ----G-------              \--*  ADDR      byref 
               [001073] ----G-------                 \--*  FIELD     struct blob
               [001074] ------------                    \--*  ADDR      byref 
               [001075] ------------                       \--*  LCL_VAR   struct V18 loc11        

@AndyAyersMS
Member

> I believe that there are potential correctness issue when we drop a fieldSequence

I am willing to believe this too, but it would be good to understand exactly how it leads to bugs. Presumably the diffs you are showing above are from better codegen, not from fixing buggy codegen?

Does the desktop assert case lead to bad codegen in release?

How do we know we're still not missing cases? The field sequence and zero offset field map maintenance seems easy to overlook.

Finally, is this related in any way to #22900?

@briansull
Author

Original AsmDiffs were incorrect.

Here are the actual AsmDiffs for this change:

Summary:
(Lower is better)
Total bytes of diff: -48 (0.00% of base)
    diff is an improvement.
Top file regressions by size (bytes):
           4 : System.Private.Xml.dasm (0.00% of base)
           3 : System.Private.DataContractSerialization.dasm (0.00% of base)
Top file improvements by size (bytes):
         -55 : System.Data.Common.dasm (0.00% of base)
3 total files with size differences (1 improved, 2 regressed), 126 unchanged.
Top method regressions by size (bytes):
           4 ( 0.37% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ReadData():int:this
           3 ( 0.22% of base) : System.Private.DataContractSerialization.dasm - XmlCanonicalWriter:WriteStartAttribute(ref,int,int,ref,int,int):this
Top method improvements by size (bytes):
         -11 (-3.51% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(ubyte):int:this
         -11 (-3.51% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(short):int:this
         -11 (-3.53% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(int):int:this
         -11 (-3.38% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(double):int:this
         -11 (-3.50% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(long):int:this
Top method regressions by size (percentage):
           4 ( 0.37% of base) : System.Private.Xml.dasm - XmlTextReaderImpl:ReadData():int:this
           3 ( 0.22% of base) : System.Private.DataContractSerialization.dasm - XmlCanonicalWriter:WriteStartAttribute(ref,int,int,ref,int,int):this
Top method improvements by size (percentage):
         -11 (-3.53% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(int):int:this
         -11 (-3.51% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(ubyte):int:this
         -11 (-3.51% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(short):int:this
         -11 (-3.50% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(long):int:this
         -11 (-3.38% of base) : System.Data.Common.dasm - RBTree`1:GetNewNode(double):int:this
7 total methods with size differences (5 improved, 2 regressed), 185969 unchanged.
Completed analysis in 23.29s

@briansull
Author

@dotnet-build-bot Test Windows_NT arm Cross Checked Innerloop Build and Test
@dotnet-build-bot Test Windows_NT arm64 Cross Checked Innerloop Build and Test

@briansull
Author

@dotnet/jit-contrib PTAL

@CarolEidt

@briansull - could you respond to the questions that @AndyAyersMS has asked?

@briansull
Author

@AndyAyersMS

> I am willing to believe this too, but it would be good to understand exactly how it leads to bugs.

Yes. This is a change that improves our ability to track what is happening when we record the zero-offset fields. It adds an important assert so that we don't create incorrect duplicate field sequences (which we are currently doing).

> Presumably the diffs you are showing above are from better codegen, not from fixing buggy codegen?

My original set of diffs was against an incorrect baseline. There are now very few actual diffs for this change. The few diffs that we do see involve making CSEs using struct address calculations with zero-offset fields.

> Does the desktop assert case lead to bad codegen in release?

It resulted in an assert with the checked compiler.

> How do we know we're still not missing cases? The field sequence and zero offset field map maintenance seems easy to overlook.

Yes, I agree, and that is the primary purpose of this change. With this change we now print out in the JitDump every modification involving the Zero Offset tracking, making it much easier to follow what is happening here.

> Finally, is this related in any way to #22900?

No, this isn't directly related to that issue, although this change will make debugging that issue and other similar issues much easier going forward.

> How do we know we're still not missing cases?

I believe that with this refactoring we will not be dropping the Zero Field sequence information.
In particular, with this last round of changes I made all of the calls to GetZeroOffsetFieldMap()->Set() use the method fgAddFieldSeqForZeroOffset(), which ensures that we behave consistently.
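
To make the "single entry point" idea concrete, here is a rough C++ sketch of routing every zero-offset annotation through one helper. It is an assumption-laden model, not the actual fgAddFieldSeqForZeroOffset: Node, NodeKind, FieldSeq, and ZeroOffsetTracker are hypothetical stand-ins, nodes are keyed by an integer id, and field handles are plain strings. The sketch assumes that a local-field node stores the sequence itself while any other node is recorded in a side map, appending to an existing entry rather than overwriting it.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for JIT types.
enum class NodeKind { LclVar, LclFld, Other };

using FieldSeq = std::vector<std::string>;

struct Node
{
    int      id;
    NodeKind kind;
    FieldSeq ownFieldSeq; // only used for LclFld nodes in this sketch
};

class ZeroOffsetTracker
{
public:
    bool verbose = false; // when true, log every modification

    // Single entry point: all zero-offset annotations go through here.
    void addFieldSeqForZeroOffset(Node& node, const FieldSeq& zeroFieldSeq)
    {
        if (node.kind == NodeKind::LclFld)
        {
            // A local-field node can carry the sequence directly.
            append(node.ownFieldSeq, zeroFieldSeq);
        }
        else
        {
            // Other nodes have no slot for it, so record it in the side map,
            // extending any sequence already recorded for this node.
            append(m_map[node.id], zeroFieldSeq);
        }
        if (verbose)
        {
            std::printf("zero-offset fields added to node %d: %zu\n", node.id, zeroFieldSeq.size());
        }
    }

    // Returns the side-map entry for a node, or nullptr if there is none.
    const FieldSeq* lookup(const Node& node) const
    {
        auto it = m_map.find(node.id);
        return it == m_map.end() ? nullptr : &it->second;
    }

private:
    static void append(FieldSeq& dst, const FieldSeq& src)
    {
        dst.insert(dst.end(), src.begin(), src.end());
    }

    std::map<int, FieldSeq> m_map;
};
```

Keeping the map private means callers cannot write to it directly, so the logging and the per-node-kind handling cannot be bypassed by accident.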


@CarolEidt CarolEidt left a comment

I still find this a bit confusing in a couple of places.

src/jit/morph.cpp (outdated)
{
temp->ChangeOper(GT_LCL_FLD); // Note that this makes the gtFieldSeq "NotAField"...
assert(temp->OperGet() == GT_LCL_VAR);
temp->ChangeOper(GT_LCL_FLD); // Note that this typically makes the gtFieldSeq "NotAField"...

In what case would this not be NotAField - is this the zero-offset case?

Author

@briansull briansull Apr 19, 2019

Yes, the code that I added with this change will set this: (see the compiler.hpp diff)

                // Set the zeroFieldSeq in the GT_LCL_FLD node
                gtLclFld.gtFieldSeq = zeroFieldSeq;

Might be worth adding that, e.g.

// Note that this typically makes the gtFieldSeq "NotAField", unless we have a zero-offset FieldSeq

or something like that.

I still think it would be good to explain why it sometimes doesn't make it NotAField

Author

Sure, I will add a comment explaining that case.

#ifdef DEBUG
if (verbose)
{
printf("\nBefore calling fgAddFieldSeqForZeroOffset:\n");

This seems redundant because fgAddFieldSeqForZeroOffset also does a "before" dump.

Author

Actually, this will print the whole tree (and it occurs after the call to fgMorphSmpOp).
It is helpful to see the final version of the tree to verify that the field sequences are correct and that none of them went missing during the call to fgMorphSmpOp.

Author

@briansull briansull Apr 20, 2019

If you want, I could change this to always print the final result of Compiler::fgMorphField.
That might be better, since the morphing of a GT_FIELD produces a new, more complex tree.

That makes sense to me

Author

Carol, after my last change the assert can now be changed to simply check that the type is BYREF or I_IMPL.

// We expect 'addr' to be an address at this point.
assert(addr->TypeGet() == TYP_BYREF || addr->TypeGet() == TYP_I_IMPL);

That's great!

 - A GT_LCL_VAR may have a zeroOffset field
 - Add an assert to prevent building field sequences with duplicates
 - Fix fgMorphField when we have a zero offset field
Improve fgAddFieldSeqForZeroOffset
 - Add JitDump info
 - Handle GT_LCL_FLD

Changing the sign of an int constant also removes any field sequence information.

Added method header comment for fgAddFieldSeqForZeroOffset

Changed when we call fgAddFieldSeqForZeroOffset to be before the call to fgMorphSmpOp.

Prefer calling fgAddFieldSeqForZeroOffset() to GetZeroOffsetFieldMap()->Set()
@briansull
Author

@dotnet-build-bot Test Windows_NT arm Cross Checked Innerloop Build and Test
@dotnet-build-bot Test Windows_NT arm64 Cross Checked Innerloop Build and Test

@briansull
Author

ping @dotnet/jit-contrib PTAL


@CarolEidt CarolEidt left a comment

LGTM - thanks for the extra comment

temp->ChangeOper(GT_LCL_FLD); // Note that this typically makes the gtFieldSeq "NotAField"...
temp->AsLclFld()->gtLclOffs = (unsigned short)ival1;
temp->ChangeOper(GT_LCL_FLD); // Note that this typically makes the gtFieldSeq "NotAField",
// unless there is a zero filed offset associated with 'temp'.

The formatting here is weird, but it's not a big deal I guess.

@briansull briansull merged commit e7ecfec into dotnet:master Apr 23, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Fixes for tracking struct field sequences

Commit migrated from dotnet/coreclr@e7ecfec