Enregister multireg lclVars #36862
runtime/src/coreclr/src/jit/morph.cpp
Lines 10586 to 10592 in e858984
// Can't use field by field assignment if the src is a call.
if (src->OperGet() == GT_CALL)
{
    JITDUMP(" src is a call");
    // C++ style CopyBlock with holes
    requiresCopyBlock = true;
}
I was expecting changes in this part, which is currently blocking independent struct promotion for ASG(LCL_VAR struct, call struct). Could you please explain how your change avoids that block?
{
    // This should only be called for multireg lclVars.
    assert(compiler->lvaEnregMultiRegVars);
    assert(tree->IsMultiRegLclVar() || (tree->gtOper == GT_COPY));
I am confused about copy; won't it be better to have tree->IsMultiReg that recognizes both IsMultiRegLclVar and copy->GetRegCount() > 1?
if (!dest->IsMultiRegLclVar() || (blockWidth != destLclVar->lvExactSize) ||
    (destLclVar->lvCustomLayout && destLclVar->lvContainsHoles))
{
    // Mark it as DoNotEnregister.
Will it mark struct A { bool a; int b; } as DoNotEnregister?
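For context on the question above, here is a minimal sketch of why `struct A { bool a; int b; }` has a padding hole in its layout on typical ABIs. The `FieldSpan` type and the `containsHoles` and `layoutOfA` helpers are hypothetical names made up for this illustration; they are not JIT code, and the sketch does not show how the JIT actually computes `lvContainsHoles`. Note also that the quoted condition only fires when `lvCustomLayout` is set as well, so a hole alone would not be enough.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one field: byte offset and size within the struct.
struct FieldSpan
{
    std::size_t offset;
    std::size_t size;
};

// The struct from the review question.
struct A
{
    bool a;
    int  b;
};

// Returns true if the fields (given in increasing offset order) leave any byte
// of the struct uncovered. Trailing padding is counted as a hole in this sketch.
bool containsHoles(const std::vector<FieldSpan>& fields, std::size_t structSize)
{
    std::size_t covered = 0; // bytes [0, covered) are accounted for so far
    for (const FieldSpan& f : fields)
    {
        if (f.offset > covered)
        {
            return true; // gap before this field
        }
        if (f.offset + f.size > covered)
        {
            covered = f.offset + f.size;
        }
    }
    return covered < structSize;
}

// Field layout of A as the C++ compiler sees it.
std::vector<FieldSpan> layoutOfA()
{
    return {{offsetof(A, a), sizeof(bool)}, {offsetof(A, b), sizeof(int)}};
}
```

On an ABI where `bool` is one byte and `int` is four-byte aligned, `containsHoles(layoutOfA(), sizeof(A))` reports a hole between `a` and `b`.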
The liveness model for multi-reg lclVar stores is challenging, since they both use and define multiple registers. Multireg calls (i.e. calls that define multiple registers) and multireg returns (that consume multiple registers) don't suffer from the same issues, as all the defs (in the former case) or the uses (in the latter case) can be modeled as occurring simultaneously. This will also be the case (eventually) for intrinsics that define multiple registers. For multi-reg lclVar stores, we don't want to model them as occurring simultaneously, as then we must guarantee that all uses and defs have no conflicts, and without adding non-trivial complexity to the register allocator, that would actually mean that we couldn't support an assignment that reuses the source registers without spill. Take the following example with a 2-register return:
What we want this to generate is a simple call and return with no register spills or copies. While we want to ensure that the second field doesn't occupy the first return register at the point of the So, the model for a Getting this right is a bit tricky and requires factoring out some of the liveness, spill and GC updates.
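The difference between the two liveness models described above can be sketched as a pair of conflict checks. This is one reading of the description, not the register allocator's implementation: the `Reg` enum and both function names are hypothetical, and the per-register rule assumes field `i` is handled as "use `src[i]`, then define `dst[i]`" in index order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical register names, for illustration only.
enum Reg { RAX, RDX, RCX };

// "Simultaneous" model: every source use and every destination def is live at
// the same point, so a destination may not reuse any source register.
bool simultaneousModelAllows(const std::vector<Reg>& src, const std::vector<Reg>& dst)
{
    for (Reg d : dst)
    {
        for (Reg s : src)
        {
            if (d == s)
            {
                return false;
            }
        }
    }
    return true;
}

// "Per-register" model: defining dst[i] only conflicts with source registers
// that have not been consumed yet, i.e. src[j] for j > i.
bool perRegisterModelAllows(const std::vector<Reg>& src, const std::vector<Reg>& dst)
{
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < dst.size(); i++)
    {
        for (std::size_t j = i + 1; j < src.size(); j++)
        {
            if (dst[i] == src[j])
            {
                return false; // would overwrite a source that is still needed
            }
        }
    }
    return true;
}
```

With a source in `{RAX, RDX}`, a destination that stays in `{RAX, RDX}` is rejected by the simultaneous check but accepted by the per-register check, which is the no-copy case the comment is after. A destination of `{RDX, RCX}` is rejected by both, because writing the first field would clobber the second source register.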
Regarding
With these changes, that will remain a full struct assignment, with the destination lclVar being marked with
After register allocation:
Code Generated:
@sandreenko - I think this is ready for another round of review. I'm not sure why all the test builds (not the test runs) failed for the jitstressregs leg, and similarly the perf test failures didn't seem related. I'm attempting to re-run them.
@dnceng @dotnet/jit-contrib - Can someone help me figure out how to determine what's going wrong with the perf runs? The log shows:
It's the same error for both the "Linux x64 release coreclr net5.0" and "Linux x64 release mono net5.0". It seems unlikely that this is an issue introduced with my PR. In the past I've found it possible to miss actual failures in these perf runs, but it reports 578 benchmarks run, and there are 578 instances of "Process xxx exited with code 0", so it doesn't appear to be an execution failure.
@dotnet/runtime-infrastructure @DrewScoggins Can you answer @CarolEidt's question about problems with the
@dotnet/dnceng @dotnet/jit-contrib - I'm also having failures in the managed test build for the jitstressregs pipeline. The only error I can find in the log is here:
Sorry I missed this, thought it was your previous dnceng tag :) Taking a peek.
I am currently trying your changes with
The perf run issue is because of a bug in dotnet.exe; we just tracked it down yesterday and checked in a workaround, dotnet/performance#1346. You should not see that behavior on performance runs any longer.
src/coreclr/src/jit/lower.cpp
        CheckMultiRegLclVar(op1->AsLclVar(), &retTypeDesc);
    }
}
#else // !FEATURE_MULTIREG_RET
Why was this block put under !FEATURE_MULTIREG_RET? It breaks the compDoOldStructRetyping == false logic.
I've changed it; I'm not sure why I made that change.
//
// Arguments:
//    tree - the GT_COPY node
//    multiRegIndex - The index of the register to be copied
Nit: formatting in this header is not consistent: asingle register, The index, -when the source, to the register allocated to the register.
regNumber CodeGen::genRegCopy(GenTree* treeNode, unsigned multiRegIndex)
{
    assert(treeNode->OperGet() == GT_COPY);
    GenTree* op1 = treeNode->AsOp()->gtOp1;
GenTree* op1 = treeNode->AsOp()->gtOp1;
GenTree* op1 = copyNode->gtGetOp1();
Changed to treeNode->gtGetOp1()
assert(op1->IsMultiRegNode());

GenTreeCopyOrReload* copyNode = treeNode->AsCopyOrReload();
// GenTreeCopyOrReload only reports the highest index that has a valid register.
Nit: this comment repeats a few lines below and regCount is used only in an assert, maybe delete this block?
src/coreclr/src/jit/codegenarm64.cpp
// var = call, where call returns a multi-reg return value
// case is handled separately.
if (data->gtSkipReloadOrCopy()->IsMultiRegCall())
// Multi-reg nodes are handled separately.
IsMultiReg is only for GenTreeLclVarCommon, IsMultiRegNode and IsMultiRegLclVar are for all trees, is it correct?
Maybe rename so:
GenTree has IsMultiRegNode, IsMultiRegLclVar(virtual)
GenTreeLclVarCommon has IsMultiRegLclVar.
or rename tree to lclVar in this function.
Also, I think this comment could be confusing, maybe "Stores from a multi-reg source are handled separately"?
What does tree->IsMultiReg() return when data->gtSkipReloadOrCopy()->IsMultiRegNode() == true?
IsMultiReg is only for GenTreeLclVarCommon, IsMultiRegNode and IsMultiRegLclVar are for all trees, is it correct?
Actually, IsMultiReg is only for GenTreeLclVar (it must be GT_LCL_VAR, not other variants).
Maybe rename so:
GenTree has IsMultiRegNode, IsMultiRegLclVar(virtual)
GenTreeLclVarCommon has IsMultiRegLclVar.
I don't think we really have much of a "standard" for this, but since we don't really support virtual methods on GenTree, I believe we generally keep the names distinct.
I'll rename tree to lclNode (I think that lclVar is confusing because one might expect it to be a LclVarDsc*). I'll make the same change to the version in codegenxarch.cpp.
// mov dst[i], reg[0]
// This effectively moves from `reg[0]` to `dst[i]`, leaving other dst bits unchanged till further
// iterations
// For the case where reg == dst, if we iterate so that we write dst[0] last, we eliminate the need for
Could we save a mov when reg == dst?
Possibly, but this is pre-existing code (factored out of genMultiRegStoreToLocal) so I'd prefer not to make that change here.
// use reg #1 from src, including any reload or copy
// define reg #1
// If we defined it as using all the source registers, there would be more
// conflicts and higher register pressure. In addition, it complicates the
Could you please explain why we will have higher register pressure?
I've added a comment; perhaps it's overkill there or should be moved somewhere else.
Thanks, I forgot that we don't yet have an algorithm to find an optimal move sequence for such chains.
//
regNumber CodeGen::genConsumeReg(GenTree* tree, unsigned multiRegIndex)
{
    if (tree->OperGet() == GT_COPY)
Off-topic: I prefer OperIs(GT_COPY) because it is usually shorter and doesn't need additional brackets in && conditions.
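To illustrate the style point, here is a small self-contained sketch comparing the two spellings. `Node` and the trimmed-down `genTreeOps` enum are hypothetical stand-ins written for this example; they are not the JIT's real `GenTree` definitions, although the `OperGet`/`OperIs` names follow the ones used in the review.

```cpp
// Hypothetical stand-ins; not the JIT's real definitions.
enum genTreeOps { GT_COPY, GT_RELOAD, GT_LCL_VAR, GT_CALL };

struct Node
{
    genTreeOps oper;

    genTreeOps OperGet() const
    {
        return oper;
    }

    // Single-operator form.
    bool OperIs(genTreeOps o) const
    {
        return oper == o;
    }

    // Variadic form: true if the node's operator matches any argument.
    template <typename... T>
    bool OperIs(genTreeOps o, T... rest) const
    {
        return (oper == o) || OperIs(rest...);
    }
};

// Comparison style: each test needs its own parentheses inside || or &&.
bool isCopyOrReloadVerbose(const Node& n)
{
    return (n.OperGet() == GT_COPY) || (n.OperGet() == GT_RELOAD);
}

// OperIs style: one call, no extra parentheses.
bool isCopyOrReload(const Node& n)
{
    return n.OperIs(GT_COPY, GT_RELOAD);
}
```

Both helpers return the same result for every operator; the second simply reads more compactly when several operators are tested at once.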
void Compiler::fgComputeLifeUntrackedLocal(VARSET_TP& life,
//
// Returns:
//    `true` if the node is a dead store (i.e. all fields are dead); `false` otherwise.
Does it currently return true somewhere?
Sorry - I meant to respond to this. Yes, it returns true under the if (isDef) condition:
// None of the fields were live, so this is a dead store.
if (!opts.MinOpts())
{
    // keepAliveVars always stay alive
    VARSET_TP keepAliveFields(VarSetOps::Intersection(this, fieldSet, keepAliveVars));
    noway_assert(VarSetOps::IsEmpty(this, keepAliveFields));
    // Do not consider this store dead if the parent local variable is an address exposed local.
    return !varDsc.lvAddrExposed;
}
LGTM in general, one issue in Lower with !compDoOldStructRetyping support and a few questions/nits.
    (comp->lvaGetPromotionType(varDsc) != Compiler::PROMOTION_TYPE_INDEPENDENT) ||
    (varDsc->lvFieldCnt > MAX_MULTIREG_COUNT))
{
    lclNode->ClearMultiReg();
Could we check (varDsc->lvFieldCnt > MAX_MULTIREG_COUNT) during importation and avoid setting MultiReg for such nodes?
I'd like to see us make these decisions in Lowering, since eventually we'd like to be able to promote register-passed structs with more than MAX_MULTIREG_COUNT fields (i.e. multiple fields packed into a single register).
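As a rough illustration of the check being discussed, the two clauses visible in the quoted snippet can be written as a standalone predicate. This is a sketch only: `LocalInfo`, `keepMultiReg` and `kMaxMultiRegCount` are hypothetical names, the value 2 is an assumed placeholder for `MAX_MULTIREG_COUNT` (which is target-dependent), and any clauses of the real condition that are not visible in the quoted lines are not modeled.

```cpp
// Hypothetical, simplified stand-ins for the JIT's types and constants.
enum PromotionType
{
    PROMOTION_TYPE_NONE,
    PROMOTION_TYPE_DEPENDENT,
    PROMOTION_TYPE_INDEPENDENT
};

// Assumed placeholder for MAX_MULTIREG_COUNT; the real value depends on the target.
constexpr unsigned kMaxMultiRegCount = 2;

struct LocalInfo
{
    PromotionType promotion;
    unsigned      fieldCount;
};

// Returns true if the local may keep its multireg marking. A false result
// corresponds to the lclNode->ClearMultiReg() path in the quoted snippet.
bool keepMultiReg(const LocalInfo& local)
{
    if ((local.promotion != PROMOTION_TYPE_INDEPENDENT) || (local.fieldCount > kMaxMultiRegCount))
    {
        return false;
    }
    return true;
}
```

Keeping the field-count test in one late predicate like this is what would let the limit be relaxed later in a single place, for example if several fields could be packed into one register.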
src/coreclr/src/jit/lower.cpp
//
// Arguments:
//    lclNode - the GT_LCL_VAR node
//    retTypeDesc - a return type descriptor for the consuming node
retTypeDesc can come from either a user of this LCL_VAR, as in RET(LCL_VAR), or from the source, as in STORE_LCL_VAR(call), is it correct?
Yes, it can come from a GT_CALL source or a GT_RETURN user.
I've updated the comments.
if (fieldVarDsc->lvTracked && fgLocalVarLivenessDone && // Includes local variable liveness
    ((tree->gtFlags & GTF_VAR_DEATH) != 0))
if (fieldVarDsc->lvTracked && fgLocalVarLivenessDone &&
    tree->AsLclVar()->IsLastUse(i - varDsc->lvFieldLclStart))
It should probably be AsLclVarCommon or it will fail for GT_LCL_FLD.
Edit: or check that it is a LCL_VAR, as I see Common does not have IsLastUse.
I added a check for tree->IsMultiRegLclVar(), as that's a more precise condition that it has the last-use bits.
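The idea of per-field last-use bits, indexed by `i - lvFieldLclStart` as in the quoted loop, can be sketched as follows. `MultiRegLclVarSketch` and `countDyingFields` are hypothetical names for this illustration; the bit layout is an assumption and does not reflect how the JIT actually stores these flags.

```cpp
#include <cassert>

// Hypothetical node carrying one "last use" bit per promoted field.
struct MultiRegLclVarSketch
{
    unsigned fieldLclStart; // local number of the first field (like lvFieldLclStart)
    unsigned fieldCount;    // number of promoted fields
    unsigned lastUseBits;   // bit k set means field k dies at this node

    void SetLastUse(unsigned fieldIndex)
    {
        assert(fieldIndex < fieldCount);
        lastUseBits |= (1u << fieldIndex);
    }

    bool IsLastUse(unsigned fieldIndex) const
    {
        assert(fieldIndex < fieldCount);
        return (lastUseBits & (1u << fieldIndex)) != 0;
    }
};

// Walks the fields by local number, as the quoted loop does, and converts each
// local number to a zero-based field index before querying the bit.
unsigned countDyingFields(const MultiRegLclVarSketch& node)
{
    unsigned dying = 0;
    for (unsigned i = node.fieldLclStart; i < node.fieldLclStart + node.fieldCount; i++)
    {
        if (node.IsLastUse(i - node.fieldLclStart))
        {
            dying++;
        }
    }
    return dying;
}
```

Because the bits only exist on nodes of this kind, a caller would first need a guard equivalent to the `IsMultiRegLclVar()` check mentioned in the reply before querying them.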
src/coreclr/src/jit/importer.cpp
@@ -1440,6 +1440,19 @@ GenTree* Compiler::impAssignStructPtr(GenTree* destAddr,
}
else if (compDoOldStructRetyping())
{
    if (dest->OperIs(GT_LCL_VAR) &&
Git has missed this merge conflict. I changed the condition from:
else
{
    dest->gtType = asgType;
}
to
else if (compDoOldStructRetyping())
{
    dest->gtType = asgType;
}
and you added code under the original else // no condition that should not be under the changed condition.
Could you please move if (compDoOldStructRetyping()) to dest->gtType = asgType;?
Allow struct lclVars that are returned in multiple registers to be enregistered, as long as the fields are a match for the registers. Fix dotnet#34105
Undo change to `fgMorphBlkNode()`
Extract common code for `genMultiRegStoreToLocal` Fix last use for multireg when extending lifetimes Fix call dump
The jitstressregs leg has no new failures.
Looks good, do you have diffs for your changes?
Here are the diffs:
No diffs for x64/windows or x86. The regressions are cases where we promote where we didn't previously and which weren't mitigated by my earlier struct improvements. I expect we'll recover many/most of those when we enable enregistering of incoming arguments.
Allow struct lclVars that are returned in multiple registers to be
enregistered, as long as the fields are a match for the registers.
Fix #34105