Physical value numbering #68712

SingleAccretion · 2022-04-29T18:01:17Z

The problem

As we know, value numbering disambiguates loads from and stores to fields using field handles as the selectors, essentially meaning that the aliasing model it has presumes no two field accesses of different handles will refer to the same location.

This assumption simply does not hold for struct fields in modern .NET, with people using methods like Unsafe.As more and more. It also does not match the compiler's own internal model well, where precise types are not a unit of type identity (rather, layout compatibility is). Essentially, the field sequence model is too "high level", both for a subset of user code and (in some ways, perhaps more importantly) the IR.

The latter mismatch has caused a good number of bugs, past and present, especially those induced by "zero-offset field sequences", which represent the extreme case of IR-VN mismatch:

#67360
#64805
#64701
#62687
#57282
#32085
dotnet/coreclr#23932
dotnet/coreclr#27108
dotnet/coreclr#20838

The solution

It is the case that the invalid (fragmented) field sequences can only be detected in value numbering itself, no other phase has the information necessary to recognize that this:

var a = ref array[0];

// Several blocks later

Use(Unsafe.As<A, B>(ref a).Field);

is a possibly partial sequence.

Thus, to address the correctness problem we will need to call back into the VM for each field sequence formed, to see if the parent field's type is the owner of the current field (what LocalAddressVisitor does now).

This would solve the correctness problems.

Unfortunately, it has two very significant drawbacks:

It does not address the CQ issues associated with using symbols for inherently aliasable locations (e. g. zero-field sequences "polluting" VNs of addresses, even the simple cases of unions being completely "given up on" by numbering, and StructWithOneField.Field not being equivalent to *(FieldType*)&StructWithOneField).
It will hide compiler bugs related to bad field sequence maintenance. Consider block morphing as an example: today, it will fail to attach proper sequences for a case like this: a[0] = *(ArrayElemType*)&promotedStruct;, where the LHS's type does not match the RHS's.

An alternative approach, implemented in this change, is to "lower" VN to the IR's level and start using physical selectors (offset and size) to select values instead of field handles. It has a good number of advantages:

It is naturally very precise - handles unions, etc.
It should allow us to delete the zero-offset field sequence map.
It will also allow us to delete quite a bit of field sequence maintenance code.
It is "correct by construction" when it comes to aliasing, matching the expectations user code can place on the compiler.
It will free up one pointer-sized field on LCL_FLD nodes, enabling seamless TYP_STRUCT LCL_FLD.
It allows for more equivalences to be discovered for struct values -- today, their VNs are implicitly tied to their precise types (handles), while the IR generally supports equivalence between "compatible" structs, which is a good thing CQ-wise.

Essentially, we get CQ, correctness, and simplify the compiler code outside of VN.

The disadvantages?

Risk & review cost.
Up to 0.2% TP degradation according to PIN for this initial implementation - once the field sequence code is removed, some of that will be gotten back.
A more complex VN model.

I believe the advantages of this change far outweigh the drawbacks.

The implementation

We introduce a new kind of map - "the physical map" - to represent "simple" locations: object fields, static fields, array elements, as well as subsections of them. The field sequence will now be used only to get "the first field map" (and completely unused for array elements), the rest of the struct fields will be represented as "physical selectors": offsets relative to the location the map being updated represents plus the load/store size.

Physical loads will be represented as MapSelect(location, PhysicalSelector). Physical stores will be represented via a new function MapPhysicalStore, mirroring MapStore with the difference being that the selector is always "physical".

At selection time, we will traverse back through MapPhysicalStore, same as with the precise maps (the new term for the old map concept to differentiate it from the physical maps), only use more sophisticated logic to detect whether the selector for the value aliases any stores present. As with the precise maps, we will support map nesting, that is, it will be possible to MapPhysicalStore(parentMap, ..., MapPhysicalStore(...)).

We will also introduce a special kind of map, BitCast that represent the "identity" selection (i. e. that of the whole value), to satisfy our type system that exists outside the physical selection process, where it will be a no-op.

For some more details on the choices made in this design, see the updated comment at the top of valuenum.h.

The diffs

There are some diffs with change. There are two sources (+ a tiny number of diffs for the block morphing workaround).

The first is because we now number cases like below precisely:

N001 (  3,  4) [000463] ------------                        |  \--*  LCL_FLD   byref  V01 arg1         u:3[+0] Fseq[_reference] $287

N001 (  3,  4) [000330] ------------                        \--*  LCL_FLD   byref  V01 arg1         u:3[+0] Fseq[_reference, _value] $287

And so tend to CSE or copy propagate more. On 64 bit platforms that mostly results in better CQ, but on x86 in particular, "not so much". A variation of that is when we "consolidate" what used to be some number of CSEs into one, with a longer lifetime.

I believe it would be prohibitively expensive (in terms of throughput and code clarity) to quirk the behavior to match the old sequence-based one, so the only thing I can propose is to accept the diffs.

The second, a regression, is due to the fact the physical selection scheme does not handle things like *(int*)&long.MaxValue, i. e. selecting from a constant. I am not too worried about this because:

It only shows up in test code.
Support for it could be added easily enough, but I would like to do that separately (if we decide it's worth it at all) as part of adding constant folding to BitCast.

ghost · 2022-04-29T18:01:24Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Let's see how unhappy is the CI.

I still need to update the documentation.

Author:	SingleAccretion
Assignees:	-
Labels:	`area-CodeGen-coreclr`, `community-contribution`
Milestone:	-

SingleAccretion · 2022-05-01T17:58:25Z

@dotnet/jit-contrib I would like to request outerloop, stress and libraries stress.

jakobbotsch · 2022-05-01T18:16:56Z

/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

azure-pipelines · 2022-05-01T18:17:23Z

Azure Pipelines successfully started running 3 pipeline(s).

We must know precise load sizes for numbering purposes.

It was breaking the "precise maps only have memory types" invariant.

Not needed in the "identity stores are bitcasts" scheme.

10 diffs across CLR and libraries tests, all correctness fixes.

SingleAccretion · 2022-05-02T13:23:16Z

Outerloop and stress were clean except for known failures / Windows ARM/64 timeouts.

Libraries stress exposed an issue with the pre-existing bug, and there was also one failure I could not reproduce locally and that was not pre-existing, so will request another round of it.

jakobbotsch · 2022-05-02T13:28:16Z

/azp run runtime-coreclr libraries-jitstress, Fuzzlyn

azure-pipelines · 2022-05-02T13:28:31Z

Azure Pipelines successfully started running 2 pipeline(s).

SingleAccretion · 2022-05-02T21:02:49Z

Both libraries stress and Fuzzlyn failures look pre-existing.

Diffs.

@dotnet/jit-contrib

src/coreclr/jit/gentree.cpp

src/coreclr/jit/valuenum.cpp

sandreenko · 2022-05-05T02:02:59Z

wow, I was dreaming about deleting the whole old buggy VN for years!
Great work @SingleAccretion!

jakobbotsch

I agree with @sandreenko above, this is fantastic work. I think the trade offs are more than reasonable, especially given future follow-ups. I wouldn't even say this is a more complex VN model -- while we now have both precise and physical maps, I think this representation is so much more obviously correct and easy to understand given the aliasing model.

I have a few questions:

Are the new VNs costly in terms of memory? It does look like there are some obvious optimizations if they are (e.g. encoding physical selectors without creating a new VN for common small cases)
Does this work mean that field sequences no longer need to be sequences/recursive?
Is doesFieldBelongToClass still needed?

Also, thanks a lot for updating the documentation.

I want to run through a few examples locally for my own understanding, but other than that and a few documentation nits everything LGTM.

src/coreclr/jit/valuenum.h

SingleAccretion · 2022-05-05T11:28:21Z

Are the new VNs costly in terms of memory? It does look like there are some obvious optimizations if they are (e.g. encoding physical selectors without creating a new VN for common small cases)

Not that expensive I suppose:

# CG-ing CoreLib with a native compiler

./diff-mem x64
Relative difference is: 0.083%

./diff-mem x86
Relative difference is: -0.029% # Curiously enough, an improvement.

There is a lot of trickiness with the representation of physical selectors, I spent some time exploring various alternatives. There are two factors that make the current "naive" scheme more optimal than one would expect:

It is fairly common that VNForMapSelectWork stops at the first store map (i. e. load just after the store), and currently that is fairly efficient requiring one selector encode and one VN comparison. E. g. encoding offset and size directly in MapPhysialStore turned out to be a regression (or at least not a clear win).
There is a hot loop in loop hoisting (optVNIsLoopInvariant) where special-casing VNFuncs with non-VN arguments makes for measurable regressions.

I thought quite a bit about splitting the selection map on the physical / precise axis as well, introducing MapPhysicalSelect, but it would mean outlining the PHI selection logic, possibly templatizing it, and that did not seem like too great an idea for an already fairly extensive change. I may revisit that one day though. It is ultimately required for employing smarter physical selector encodings.

Does this work mean that field sequences no longer need to be sequences/recursive?

Correct, the "sequence" part of the field sequence will go. We will have "field offset" CNS_INT nodes, carrying class or static field handles (actually, pointers to FieldInfo), with the following reduction rules:

FIELD    + NO_FIELD => FIELD
NO_FIELD + FIELD    => FIELD
NO_FIELD + NO_FIELD => NO_FIELD
FIELD    + FIELD    => illegal

There are some not-in-VN dependencies on field sequences now (e. g. gtGetStructHandle), they will be deleted.

Is doesFieldBelongToClass still needed?

Nope, it can (and should) be deleted. Not sure how soon will I get to that because it requires updating the Jit-EE interface.

jakobbotsch · 2022-05-06T09:32:26Z

Sounds great, I look forward to the follow-ups.

Ran through some more examples locally, nice to see us VN'ing field accesses through reinterpretations accurately now.

Thanks again for a great change!

[edit] just asked @SingleAccretion for a commit message to include with the change

sandreenko · 2022-05-12T13:24:16Z

@SingleAccretion just out of curiosity does the new VN support such cases:

long l = 0;
int i = Unsafe.As<long, int>(ref l);
if (i == 0) { // Always true.

}

?

SingleAccretion · 2022-05-12T13:48:10Z

@sandreenko "it could but it doesn't".

Currently this will be selected as $VN.LngZero[0:3], reducing this to $VN.IntZero would involve adding something like below to VNForMapSelectWork:

     if (GetVNFunc(map, &funcApp))
     {
     }
+    else if (IsVNConstant(map))
+    {
+        assert(MapIsPhysical(map));
+        unsigned size;
+        unsigned offset = DecodePhysicalSelector(index, &size);
+    
+        assert(size == genTypeSize(type));
+        return VNEvalBitCast(type, map, offset); // This would be the function to implement.
+    }

Currently though, I believe there is not a lot of value in adding this reduction.

sandreenko · 2022-05-12T16:11:40Z

@sandreenko "it could but it doesn't".

Currently this will be selected as $VN.LngZero[0:3], reducing this to $VN.IntZero would involve adding something like below to VNForMapSelectWork:
     if (GetVNFunc(map, &funcApp))
     {
     }
+    else if (IsVNConstant(map))
+    {
+        assert(MapIsPhysical(map));
+        unsigned size;
+        unsigned offset = DecodePhysicalSelector(index, &size);
+    
+        assert(size == genTypeSize(type));
+        return VNEvalBitCast(type, map, offset); // This would be the function to implement.
+    }
Currently though, I believe there is not a lot of value in adding this reduction.

I remember seeing many cases where a struct was initialized using zero-init where the old VN could not then figure out a field value. However, I possibly saw that often because there were many errors in such cases and not because such cases were often.

SingleAccretion · 2022-05-12T16:19:24Z

I remember seeing many cases where a struct was initialized using zero-init where the old VN could not then figure out a field value. However, I possibly saw that often because there were many errors in such cases and not because such cases were often.

Note that structs are handled by the current implementation:

runtime/src/coreclr/jit/valuenum.cpp

Lines 2589 to 2597 in 51d51c1

    
           case VNF_ZeroObj: 
        
               assert(MapIsPhysical(map)); 
        
               // TODO-CQ: support selection of TYP_STRUCT here. 
        
               if (type != TYP_STRUCT) 
        
               { 
        
                   return VNZeroForType(type); 
        
               } 
        
               break;

So if you were to modify your example to something like this:

Guid a = default;

if (*(int*)&a) == 0) { ... }

That branch would get folded (after #68986 is merged, I suppose).

In general the physical selection mechanism is "arbitrarily precise", it will only "give up" on out-of-bounds loads or stores. So it can support TYP_BLK variables, using int locals as byte[4]-like arrays, etc.

ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 29, 2022

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 29, 2022

SingleAccretion force-pushed the Physical-VN-Upstream branch 3 times, most recently from c9273b6 to 531f95f Compare May 1, 2022 15:16

SingleAccretion added 16 commits May 2, 2022 16:06

Work around FIELD turning into IND(struct)

90cb15f

We must know precise load sizes for numbering purposes.

Delete "ROH"

2f7141a

It was breaking the "precise maps only have memory types" invariant.

Physical VN -- locals

4d45ecc

Physical VN -- fields

9a59a6f

Physical VN -- arrays

e8758c6

Deleting VNApplySelectors[Assign]

670a5cd

Delete VNForEmptyMap

b1950c9

Not needed in the "identity stores are bitcasts" scheme.

Add function headers

10e1807

fgValueNumberLocalStore: fix arg ordering

f97c1ad

Add BitCast<type <- ZeroObj> folding

caa1998

Fix block assignment numbering

221fe64

Add more TODO-PhysicalVN notes

fc03ae6

Update the documentation on maps and selection

535b406

Use a 'switch' in VNForMapSelectWork

64b3cf6

Fix formatting

0c6af71

Fix the CAST<float <-> integer> bug

d1d5aea

10 diffs across CLR and libraries tests, all correctness fixes.

SingleAccretion force-pushed the Physical-VN-Upstream branch from 531f95f to d1d5aea Compare May 2, 2022 13:20

SingleAccretion marked this pull request as ready for review May 2, 2022 21:02

jakobbotsch self-assigned this May 3, 2022

jakobbotsch self-requested a review May 3, 2022 19:25

jakobbotsch reviewed May 3, 2022

View reviewed changes

src/coreclr/jit/gentree.cpp Show resolved Hide resolved

jakobbotsch reviewed May 4, 2022

View reviewed changes

src/coreclr/jit/gentree.cpp Show resolved Hide resolved

jakobbotsch reviewed May 4, 2022

View reviewed changes

src/coreclr/jit/gentree.cpp Show resolved Hide resolved

jakobbotsch reviewed May 4, 2022

View reviewed changes

src/coreclr/jit/valuenum.cpp Show resolved Hide resolved

jakobbotsch reviewed May 4, 2022

View reviewed changes

src/coreclr/jit/valuenum.cpp Show resolved Hide resolved

jakobbotsch approved these changes May 5, 2022

View reviewed changes

src/coreclr/jit/valuenum.h Outdated Show resolved Hide resolved

src/coreclr/jit/valuenum.h Outdated Show resolved Hide resolved

Correct documentation mistakes

051224d

jakobbotsch merged commit ed2472b into dotnet:main May 6, 2022

SingleAccretion deleted the Physical-VN-Upstream branch May 6, 2022 19:59

This was referenced May 7, 2022

Deleting field sequences from LCL_FLD and VNF_PtrToArrElem #68986

Merged

Remove GT_ADDR nodes #11057

Closed

SingleAccretion mentioned this pull request May 12, 2022

Do not fold mismatched OBJ(ADDR(LCL_VAR)) #69271

Merged

AndyAyersMS mentioned this pull request May 13, 2022

JIT: suboptimal generated code for struct overlapped field manipulation #69254

Open

SingleAccretion mentioned this pull request May 27, 2022

Delete doesFieldBelongToClass #69900

Closed

JulieLeeMSFT mentioned this pull request Jun 3, 2022

What's new in .NET 7 Preview 5 [WIP] dotnet/core#7441

Closed

ghost locked as resolved and limited conversation to collaborators Jun 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Physical value numbering #68712

Physical value numbering #68712

SingleAccretion commented Apr 29, 2022 •

edited

Loading

ghost commented Apr 29, 2022

SingleAccretion commented May 1, 2022

jakobbotsch commented May 1, 2022

azure-pipelines bot commented May 1, 2022

SingleAccretion commented May 2, 2022 •

edited

Loading

jakobbotsch commented May 2, 2022

azure-pipelines bot commented May 2, 2022

SingleAccretion commented May 2, 2022

sandreenko commented May 5, 2022

jakobbotsch left a comment

SingleAccretion commented May 5, 2022 •

edited

Loading

jakobbotsch commented May 6, 2022 •

edited

Loading

sandreenko commented May 12, 2022

SingleAccretion commented May 12, 2022

sandreenko commented May 12, 2022

SingleAccretion commented May 12, 2022

Physical value numbering #68712

Physical value numbering #68712

Conversation

SingleAccretion commented Apr 29, 2022 • edited Loading

The problem

The solution

The implementation

The diffs

ghost commented Apr 29, 2022

SingleAccretion commented May 1, 2022

jakobbotsch commented May 1, 2022

azure-pipelines bot commented May 1, 2022

SingleAccretion commented May 2, 2022 • edited Loading

jakobbotsch commented May 2, 2022

azure-pipelines bot commented May 2, 2022

SingleAccretion commented May 2, 2022

sandreenko commented May 5, 2022

jakobbotsch left a comment

Choose a reason for hiding this comment

SingleAccretion commented May 5, 2022 • edited Loading

jakobbotsch commented May 6, 2022 • edited Loading

sandreenko commented May 12, 2022

SingleAccretion commented May 12, 2022

sandreenko commented May 12, 2022

SingleAccretion commented May 12, 2022

SingleAccretion commented Apr 29, 2022 •

edited

Loading

SingleAccretion commented May 2, 2022 •

edited

Loading

SingleAccretion commented May 5, 2022 •

edited

Loading

jakobbotsch commented May 6, 2022 •

edited

Loading