-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix regression test 46239 with Crossgen2 and improve runtime logging #51416
Conversation
@@ -304,17 +305,19 @@ protected virtual void FinalizeRuntimeSpecificStaticFieldLayout(TypeSystemContex | |||
{ | |||
} | |||
|
|||
protected static ComputedInstanceFieldLayout ComputeExplicitFieldLayout(MetadataType type, int numInstanceFields) | |||
protected virtual bool IsBlittableOrManagedSequential(TypeDesc type) => false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the default here should be true. Blittable types have a requirement that they are handled in the same way on all runtimes as they describe native data structures, and by defaulting to false, this leads for a way for the algorithm to misbehave on Native AOT.
I'd prefer to see this either be default true, or make it an abstract method and force the NativeAOT compiler to deal with this when it gets there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MichalStrehovsky could you comment here on what you'd like to see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let's just move IsManagedSequentialType
to here and then this doesn't even need to be virtual and can just do the same thing?
We didn't have trouble with this in .NET Native (where this is missing) so I don't think it matters, but we ported over so many CoreCLR weirdness's to this algorithm over time that I no longer understand how layout is done and would prefer all bugs in it to be also surfaced in crossgen2 (and not have NativeAOT specific bugs that we need to troubleshoot there).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MichalStrehovsky - Thanks for your feedback. I'm trying to modify the change based on your suggestion but I'm hitting a layering problem - IsBlittable is part of MarshalUtils and that's not in ILCompiler.TypeSystem.ReadyToRun and it seems to me that moving it over brings various bits of interop code into ILCompiler.TypeSystem.ReadyToRun which I suspect to be undesirable as my current understanding is that NativeAOT doesn't use exactly identical interop as Crossgen2 / CoreCLR. What do you think would be the ideal way to resolve this discrepancy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MarshalUtils are part of ILCompiler.TypeSystem on the NativeAOT side, so I'm not opposed to moving it (and things it depends on) into ILCompiler.TypeSystem.ReadyToRun as well. The split is rather arbitrary and interop is getting tangled into the type system the same way as is on CoreCLR. It was a nice dream to have a separation of concerns there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The split is rather arbitrary and interop is getting tangled into the type system the same way as is on CoreCLR. It was a nice dream to have a separation of concerns there.
Would there be a way to adjust the runtime side of the things to avoid this mess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, for now I think I have at least managed to make something like an inventory of the various runtime behaviors by implementing them in Crossgen2. In theory we may be able to simplify some of the stuff, I however suspect that some of the subtle distinctions amount to tiny memory use optimizations on the runtime side i.e. something that's very tricky to undo as it may incur working set regressions in arbitrary scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however suspect that some of the subtle distinctions amount to tiny memory use optimizations on the runtime side
I doubt that it would show up on the radar. The behavior of the field layout algorithm is completely accidental in number of cases, and in fact there are many known situations where it is inefficient for no good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're probably the biggest expert in this field so I have no desire to doubt your assessment. My only point is that further independent development of runtime vs. Crossgen2 side of this equation is giving me goosebumps; my current expectation is that we'll ultimately go down the path you yourself suggested in the past - having a way to communicate the field layouts from the compiler to the runtime, that will give us room for experimentation with potential packing optimizations and rid us of the burdensome need to keep both codebases in 100% sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than my comment on the IsBlittableOrManagedSequential, I'm pretty happy with this.
It looks like this is causing every jitstress job flavor to fail: |
Any chance you'll get this in soon? there are a lot of test jobs failing due to test49826. |
@@ -2257,7 +2249,7 @@ private bool NeedsTypeLayoutCheck(TypeDesc type) | |||
|
|||
private bool HasLayoutMetadata(TypeDesc type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one place where this method is used looks like dead/unrechable code.
This method can only return true when type.IsValueType
is true, but this case is handled in EncodeFieldBaseOffset
earlier via the else if (pMT.IsValueType)
check. It means that this method won't ever return true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice cleanup, thanks Jan for pointing that out, deleted in 9th commit.
{ | ||
PreventRecursiveFieldInlinesOutsideVersionBubble(field, callerMethod); | ||
|
||
// We won't try to be smart for classes with layout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the old crossgen has a special case for this with this comment that kicks in for non-value types (different from what was here before that kicked in for value types).
Do we want to match it? Or are we confident that we have extensive test coverage for all non-value types with layout corner cases?
Also, once we add this back, can at least some of the remaining changes be simplified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to my understanding of the CoreCLR code, the layout check is based on the following condition:
runtime/src/coreclr/vm/methodtablebuilder.cpp
Line 11924 in 207b03a
BOOL fHasLayout = |
It seems to me that this means that the conditional statement you originally pointed out should basically stay, the problematic bit is the method HasLayoutMetadata itself - we're certainly trying to match the CoreCLR runtime behavior, in this particular case we probably just missed the fact that the CoreRT version of HasLayoutMetadata is quite different than its CoreCLR runtime counterpart method
runtime/src/coreclr/vm/methodtablebuilder.cpp
Line 11746 in 207b03a
BOOL HasLayoutMetadata(Assembly* pAssembly, IMDInternalImport* pInternalImport, mdTypeDef cl, MethodTable* pParentMT, BYTE* pPackingSize, BYTE* pNLTType, BOOL* pfExplicitOffsets) |
I'll look into putting HasLayoutMetadata back and modifying it based on the CoreCLR runtime version. Other than that, can you please be more specific regarding the simplifications you're suggesting? While I do agree that some of the runtime logic is complicated and somewhat arbitrary, I'm worried that unsettling it at this point might cause more harm than good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the equivalent of the HasLayoutMetadata check should be just type.IsSequentialLayout || type.IsExplicitLayout
.
Yes, the runtime logic is complicated. We have done multiple rounds of trying to replicate it, and we are still finding mismatches. It is safe bet that this is not the last round of fixes. I think we need to make it simpler to get confident that it matches. I will do some ad-hoc testing to try to find more situations where it does not match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Jan for your additional feedback. I have updated the runtime check based on your suggestion. Once the change passes basic testing, I'll also trigger another round of Pri1 testing that previously uncovered most of these inconsistencies. It would be certainly awesome if you were willing to use your vast expertise in this area to devise additional tests exercising various obscure corner cases we may have previously missed.
@@ -857,7 +915,9 @@ private static SizeAndAlignment ComputeInstanceSize(MetadataType type, LayoutInt | |||
{ | |||
if (type.IsValueType) | |||
{ | |||
instanceSize = LayoutInt.AlignUp(instanceSize, alignment, target); | |||
instanceSize = LayoutInt.AlignUp(instanceSize, | |||
alignUpInstanceByteSize ? alignment : LayoutInt.Min(alignment, target.LayoutPointerSize), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that the alignUpInstanceByteSize = false
case will kick in for a struct like this on ARM32:
[StructLayout(LayoutKind.Explicit)]
internal struct S3
{
[FieldOffset(0)]
public ulong tmp1;
[FieldOffset(8)]
public Object tmp2;
}
The algorithm will compute the instanceSize of this struct as 12. Is that correct?
It looks like a bug in the type loader. The instance size of this struct on ARM should be 16, so that the long field is aligned at 8 bytes, so that potential atomic 64-bit operations work fine on it. It would be better to fix the type loader instead of replicating the bug here.
} | ||
|
||
return false; | ||
return MetadataFieldLayoutAlgorithm.IsBlittableOrManagedSequentialType(type) || (type is MetadataType mt && mt.IsExplicitLayout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would simplify this to mt.IsExplicitLayout || mt.IsSequentialLayout
. Maybe even inline it at the callsite.
@@ -803,7 +803,7 @@ protected override ComputedInstanceFieldLayout ComputeInstanceFieldLayout(Metada | |||
return ComputeExplicitFieldLayout(type, numInstanceFields); | |||
} | |||
else | |||
if (type.IsEnum || MarshalUtils.IsBlittableType(type) || IsManagedSequentialType(type)) | |||
if (type.IsEnum || IsBlittableOrManagedSequentialType(type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get rid of dependency of the type layout algorithm on blittability here, I think we would simplify this into:
if (type.IsSequential && type.ContainsPointers)
(with matching change in the runtime). It would be R2R breaking change, but it would set us up for future.
The dependency of the layout algorithm on blittability makes changes like #103 R2R breaking changes. We did not treat #103 as a R2R breaking change since the connection between interop, type layout and R2R was not understood.
@@ -304,17 +304,17 @@ protected virtual void FinalizeRuntimeSpecificStaticFieldLayout(TypeSystemContex | |||
{ | |||
} | |||
|
|||
protected static ComputedInstanceFieldLayout ComputeExplicitFieldLayout(MetadataType type, int numInstanceFields) | |||
protected ComputedInstanceFieldLayout ComputeExplicitFieldLayout(MetadataType type, int numInstanceFields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why you changed this from static to non-static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At one point it was needed because there used to be a helper method calculating the offset bias. I have double checked it's not longer necessary after all the refactorings, will remove, thanks for pointing that out!
} | ||
|
||
return false; | ||
return type is MetadataType metadataType && (metadataType.IsSequentialLayout || metadataType.IsExplicitLayout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also delete/replace IsBlittableType
from ComputeInstanceFieldLayout
so that the layout does not depend on the algorithm to compute blittability?
bb2921f
to
b8d0ed4
Compare
The regression test <code>src\tests\JIT\Regressions\JitBlue\Runtime_46239</code> exercises various interesting corner cases of type layout that weren't handled properly in Crossgen2 on x86 and ARM[32]. This change fixes the remaining deficiencies and it also adds provisions for better runtime logging upon type layout mismatches. Thanks Tomas
With this change, the only remaining pipelines using Crossgen1 are "r2r.yml", "r2r-extra.yml" and "release-tests.yml". I haven't yet identified the pipeline running the "release-tests.yml" script; for the "r2r*.yml", these now remain the only pipelines exercising Crossgen1. I don't think it makes sense to switch them over to CG2 as we already have their CG2 counterparts; my expectation is that, once CG1 is finally decommissioned, they will be just deleted. Thanks Tomas
I have basically reverted the marshalling code shuffles as they are no longer necessary based on the direction suggested by JanK. I haven't made the counterpart runtime changes yet, I'm running an initial smoke test to see all the places that blow up. Thanks Tomas
Based on Jan's suggestion I have changed the runtime behavior to align explicit layout structs by default. The funny part here was that the implementation is two-tiered, split into the case with vs. without layout metadata, using two completely different code paths. I have extracted a small part of the EEClassLayoutInfo calculation dealing with primitive field size and alignment into code usable by both codepaths to reduce code duplication. I have also bumped up the R2R compatibility version information as this is (intentionally) a globally breaking change for R2R images produced by older versions. As part of simplifying the various conditions I have deleted a section that claims we shouldn't be updating the number of instance bytes for blittable / managed sequential types. I just hope this is no longer needed so that I don't have to redo the marshalling helper shuffles again. Thanks Tomas
Per JanK's feedback I simplified CoreCLR runtime calculation of explicit layout size but I missed that Crossgen2 actually differs from it in a very subtle manner. This change fixes it by only accepting the explicit size if it's larger than or equal to the unaligned calculated explicit layout size. Thanks Tomas
I have simplified runtime and Crossgen2 behavior in the presence of sequential layout by removing blittability from the picture as the IsBlittable check is very complex and hard to maintain in sync between the native runtime and the managed compiler. I have introduced a somewhat weaker and much simpler check MayContainGCPointers that returns FALSE for all IsBlittable types (and for a few others). Thanks Tomas
In my investigation of the x86 failures I hit an interesting corner case of the class GCMemoryInfoData that is checked for having managed layout matching its native counterpart. Interestingly enough, that only worked because the class internally reported as non-blittable and not managed sequential so it ended up using the auto layout; when my change switched it over to use the sequential layout, it turned out that the sequential layout algorithm differs from its native counterpart. While the change is technically breaking this corner scenario, I suspect that the primary purpose of sequential layout is interop with native code and I find the fact that it's actuall mismatched concerning beyond the scope of my pending change. Thanks Tomas
I originally thought the modification may assist debugging layout issues as I did during the investigation. It however turns out that basically the same class name check can be put in EEClassLayoutInfo::CollectLayoutFieldMetadataThrowing so I am removing this as superfluous temporary instrumentation. Thanks Tomas
@jkotas - I believe I have managed to make some progress on the change based on your suggestions but in accordance with @davidwrighton I think it's even riskier to take than the original attempt using our existing blittability / managed sequential checks. As you can easily imagine, simplifying the condition for using sequential vs. auto layout silently switches over some types from one to the other and the bug tail is enormous, I've been digging through it over the last two weeks and I'm nowhere near green state. Two particular simple examples:
My current plan is to mark this change as NO-MERGE for now and revive its previous version after incorporating Michal's feedback in form of a new PR; I can return to this cleanup after we fork off the final .NET 6 shipping bits - the 1-2 month period of inability to make more substantial runtime changes should be ideal for repeated testing and polishing aspects of this change. In many cases it will be useful / needed to consult official native compiler documentation to make sure we're doing the right / expected thing. |
This change needs substantial more work
Closing for now; I'll revive the PR after we fork off for .NET 6 and I'll start the new CI testing rounds to iron out the various problems. |
The regression test
src\tests\JIT\Regressions\JitBlue\Runtime_46239
exercises various interesting corner cases of type layout that
weren't handled properly in Crossgen2 on x86 and ARM[32]. This
change fixes the remaining deficiencies and it also adds
provisions for better runtime logging upon type layout mismatches.
Thanks
Tomas
/cc @dotnet/crossgen-contrib
P.S. For now I don't expect we need to push this to Preview 4 bits
as the change covers a rather corner case on the less prevalent
architectures; please let me know if you believe otherwise.