-
Notifications
You must be signed in to change notification settings - Fork 2.7k
ValueNum add check for ZeroOffsetFldSeq on LclVar reads #20838
Conversation
Desktop Asm Diffs are minimal:
|
@AndyAyersMS PTAL |
I will add a test case |
Test case added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have some questions ...
src/jit/valuenum.cpp
Outdated
assert((nextNode->gtOper == GT_ADDR && nextNode->gtOp.gtOp1 == lcl) || | ||
varTypeIsStruct(lcl->TypeGet())); | ||
lcl->gtVNPair.SetBoth(vnStore->VNForExpr(compCurBB, lcl->TypeGet())); | ||
// Temporary, to make progress. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this was pre-existing, but this comment should be more detailed, and describe why this is temporary, and what kind of an assert should be re-added (i.e. should we be asserting that it's not NoVN
at this point?)
Also, AFAICT this code section is the only substantive change. When making lots of refactoring/renaming changes, it would be really useful to call attention to where the real changes are and/or to have them in separate commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I will do that in the future. You are correct that the fix is due to the change to
Call ExtendPtrVN with any ZeroOffsetFldSeq on LclVar reads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At thsi point I believe that this comment should just be removed:
// TODO-CQ: This should become an assert again...
As this isn't a code quality issue and the two cases where this can happen are explained below.
|
||
using System; | ||
|
||
// Test case for issues with a missing check for ZeroOffsetFldSeq values on LclVar reads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little more detail here and/or on the specific cases would be informative for future reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the fix several cases fail because value numbering gives us the old value instead of the modified value.
The original bug was submitted by the Fuzzlyn C# generator program.
(see issue #18259)
I wrote up this test case to cover alot of the different cases.
Without my fix this test case prints:
D:\Public\Bugs\GitHub_18259>CoreRun.exe GitHub_18259.exe
Failed - Test_e0_S2_F3_F1() - 640
Failed - Test_e0_S2_F4_F1() - 960
Failed - Test_e1_S2_F3_F1() - 640
Failed - Test_e1_S2_F4_F1() - 960
Failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have a smaller test case that I can add as well.
if (addrExtended != ValueNumStore::NoVN) | ||
{ | ||
lcl->gtVNPair.SetBoth(addrExtended); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If addrExtended
was NoVN
, is it correct that we neither have generateUniqueVN
set to true, nor have we set lcl->gtVNPair
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I orginally did try that out (setting generateUniqueVN) but it turs out that doing that caused a few codegen regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So - what value number will lcl
get in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code above dealing with wholeLclVarVNP
will either assign
lcl->gtVNPair = wholeLclVarVNP;
or
lcl->gtVNPair = partialLclVarVNP;
or
set generateUniqueVN = true;
For the code regression case we had (typSize == varSize)
so we assigned wholeLclVarVNP
and this allowed the address of a struct byref to be CSE-ed with the address of a (struct byref + zero-offset field)
src/jit/valuenum.cpp
Outdated
@@ -3662,24 +3661,47 @@ ValueNum ValueNumStore::VNApplySelectorsAssign( | |||
} | |||
|
|||
// Otherwise, fldHnd is a real field handle. | |||
CORINFO_FIELD_HANDLE fldHnd = fieldSeq->m_fieldHnd; | |||
CORINFO_FIELD_HANDLE fldHnd = fieldSeq->m_fieldHnd; | |||
CORINFO_CLASS_HANDLE structHnd = NO_CLASS_HANDLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find where is the actual fix and where is refactoring.
I do not see any uses of structHnd
, why do we need this variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this unused variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would appreciate it if the main commit for the fix contained a bit more detail about what the bug was and why it lead to bad codegen (and whether it was a regression and if so when/how..., and if not why we seemingly overlooked this case before or we just never saw it for some reason). Or if not suitable for the commit, at least put this in the initial note to reviewers.
I realize you can go back to the original bug and dig some of this out or reverse parse it from the new tests but it takes a while. Also the initial part of an issue usually contains a lot of guesswork.
The fix looks plausible but I wonder if there are other cases like this lurking about.
The problem here was that we have an indirection of a LclVar that was a pointer to an array element whose type is a struct. The following discussion refers to the test case GitHub_18259/GitHub_18259.cs We recorded that the struct field F1.F0 is assigned 1234u. With the bug all subsequent reads of vr7.F0 return this value. We miss the update to zero because we didn't add the Zero Field sequence value to the LclVar vr7 Added Test case GitHub_18259.cs Added Test case GitHub_20838.cs
81f8f91
to
a35f062
Compare
Additional small test case added |
@AndyAyersMS PTAL I do plan on investigating this area for adfditional work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think this is ok.
Those zero offset cases always seem to be troublemakers. I wonder if we should try and rework how they're implemented to make them look like nonzero offset accesses, at least early on in the jit.
ValueNum add check for ZeroOffsetFldSeq on LclVar reads Commit migrated from dotnet/coreclr@ae4dc9c
Call ExtendPtrVN with any ZeroOffsetFldSeq on LclVar reads
Fixes issue #18259