Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow StoreLclVar src to be IND/FLD. #59315

Merged
merged 4 commits into from
May 23, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion src/coreclr/jit/codegenxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5016,7 +5016,17 @@ void CodeGen::genCodeForIndir(GenTreeIndir* tree)
else
{
genConsumeAddress(addr);
emit->emitInsLoadInd(ins_Load(targetType), emitTypeSize(tree), tree->GetRegNum(), tree);
instruction loadIns = ins_Load(targetType);
if (tree->DontExtend())
{
assert(varTypeIsSmall(tree));
// The user of this IND does not need
// the upper bits to be set, so we don't need to use longer
// INS_movzx/INS_movsx and can use INS_mov instead.
// It usually happens when the real type is a small struct.
loadIns = INS_mov;
}
emit->emitInsLoadInd(loadIns, emitTypeSize(tree), tree->GetRegNum(), tree);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic allows us to generate STORE_LCL_VAR struct<1> V01(IND struct(addr) as

mov al, [addr]
mov  [V11], al

instead of

movsx rax, [addr]
mov  [V11], al

The solution with additional flag does not look very nice but it was better than alternatives that I have tried:

  1. support contained IND under STORE_LCL_VAR: it required too many changes in LSRA/codegen/emitter and it was hard to make it work for all cases and it created many new checks for this scenario;
  2. type the IND as int so we do mov rax, [addr] instead of mov al, [addr]. This could be a solution without any additional diffs but I am not sure if this transformation is correct.
    We currently use it for nullchecks already since https://github.com/dotnet/runtime/pull/37991/files so NULLCHECK byte addr is currently transformed into NULLCHECK int addr but if the addr is the last byte on a last readable page won't it cause issues? Maybe we should reconsider this transformation (in
    ind->gtType = TYP_INT;
    )
    could somebody please confirm if it can't cause issues?

image

}

genProduceReg(tree);
Expand Down
26 changes: 25 additions & 1 deletion src/coreclr/jit/gentree.h
Original file line number Diff line number Diff line change
Expand Up @@ -527,9 +527,16 @@ enum GenTreeFlags : unsigned int
// alignment of 1 byte)
GTF_IND_INVARIANT = 0x01000000, // GT_IND -- the target is invariant (a prejit indirection)
GTF_IND_NONNULL = 0x00400000, // GT_IND -- the indirection never returns null (zero)
#if defined(TARGET_XARCH)
GTF_IND_DONT_EXTEND = 0x00200000, // GT_IND -- the indirection does not need to extend for small types
#endif // TARGET_XARCH

GTF_IND_FLAGS = GTF_IND_VOLATILE | GTF_IND_NONFAULTING | GTF_IND_TLS_REF | GTF_IND_UNALIGNED | GTF_IND_INVARIANT |
GTF_IND_NONNULL | GTF_IND_TGT_NOT_HEAP | GTF_IND_TGT_HEAP,
GTF_IND_NONNULL | GTF_IND_TGT_NOT_HEAP | GTF_IND_TGT_HEAP
#if defined(TARGET_XARCH)
| GTF_IND_DONT_EXTEND
#endif // TARGET_XARCH
,

GTF_ADDRMODE_NO_CSE = 0x80000000, // GT_ADD/GT_MUL/GT_LSH -- Do not CSE this node only, forms complex
// addressing mode
Expand Down Expand Up @@ -6640,6 +6647,23 @@ struct GenTreeIndir : public GenTreeOp
return (gtFlags & GTF_IND_UNALIGNED) != 0;
}

#if defined(TARGET_XARCH)
void SetDontExtend()
{
gtFlags |= GTF_IND_DONT_EXTEND;
}

void ClearDontExtend()
{
gtFlags &= ~GTF_IND_DONT_EXTEND;
}

bool DontExtend() const
{
return (gtFlags & GTF_IND_DONT_EXTEND) != 0;
}
#endif // TARGET_XARCH

#if DEBUGGABLE_GENTREE
// Used only for GenTree::GetVtableForOper()
GenTreeIndir() : GenTreeOp()
Expand Down
8 changes: 7 additions & 1 deletion src/coreclr/jit/liveness.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2006,7 +2006,13 @@ void Compiler::fgComputeLifeLIR(VARSET_TP& life, BasicBlock* block, VARSET_VALAR

if (isDeadStore && fgTryRemoveDeadStoreLIR(node, lclVarNode, block))
{
lclVarNode->Data()->SetUnusedValue();
GenTree* data = lclVarNode->Data();
data->SetUnusedValue();

if (data->isIndir())
{
Lowering::TransformUnusedIndirection(data->AsIndir(), this, block);
}
}
break;
}
Expand Down
58 changes: 47 additions & 11 deletions src/coreclr/jit/lower.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3525,14 +3525,49 @@ void Lowering::LowerStoreLocCommon(GenTreeLclVarCommon* lclStore)
convertToStoreObj = false;
}
}
else if (!src->OperIs(GT_LCL_VAR))
else if (src->OperIs(GT_LCL_VAR))
{
convertToStoreObj = false;
}
else if (src->OperIs(GT_IND, GT_OBJ, GT_BLK, GT_LCL_FLD))
{
#if !defined(TARGET_ARM64)

if (src->TypeIs(TYP_STRUCT))
{
src->ChangeType(lclRegType);
if (src->OperIs(GT_IND, GT_OBJ, GT_BLK))
{
if (src->OperIs(GT_OBJ, GT_BLK))
{
src->SetOper(GT_IND);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless there is an assert somewhere that makes sure that currOper != newOper, just remove the if (src->OperIs(GT_OBJ, GT_BLK)) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is RecordOperBashing calls inside and I remember that I used to have breakpoints there, I would prefer not to call it if it is not neccessary.

}
// This logic is skipped for struct indir in
// `Lowering::LowerIndir` because we don't know the size.
// Do it now.
GenTreeIndir* indir = src->AsIndir();
LowerIndir(indir);
#if defined(TARGET_XARCH)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is defined(TARGET_XARCH) needed and rest of the code inside !defined(TARGET_ARM64) includes arm as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have GTF_IND_DONT_EXTEND only for x64 because there mov rax, [addr] is smaller than INS_movzx/INS_movsx al, [addr]. On other platforms it is not defined and not needed.

if (varTypeIsSmall(lclRegType))
{
indir->SetDontExtend();
}
#endif // TARGET_XARCH
}
}
convertToStoreObj = false;
#else // TARGET_ARM64
// This optimization on arm64 allows more SIMD16 vars to be enregistered but it could cause
// regressions when there are many calls and before/after each one we have to store/save the upper
// half of these registers. So enable this for arm64 only when LSRA is taught not to allocate registers when
// it would have to spilled too many times.
convertToStoreObj = true;
#endif // TARGET_ARM64
}
else
{
assert(src->OperIs(GT_LCL_VAR));
convertToStoreObj = false;
assert(src->OperIsInitVal());
convertToStoreObj = true;
}

if (convertToStoreObj)
Expand Down Expand Up @@ -7214,6 +7249,7 @@ void Lowering::TransformUnusedIndirection(GenTreeIndir* ind, Compiler* comp, Bas
bool useNullCheck = false;
#else // TARGET_XARCH
bool useNullCheck = !ind->Addr()->isContained();
ind->ClearDontExtend();
#endif // !TARGET_XARCH

if (useNullCheck && !ind->OperIs(GT_NULLCHECK))
Expand Down Expand Up @@ -7311,14 +7347,6 @@ bool Lowering::TryTransformStoreObjAsStoreInd(GenTreeBlk* blkNode)
return false;
}

if (varTypeIsSmall(regType) && !src->IsConstInitVal() && !src->IsLocal())
{
// source operand INDIR will use a widening instruction
// and generate worse code, like `movzx` instead of `mov`
// on x64.
return false;
}

JITDUMP("Replacing STORE_OBJ with STOREIND for [%06u]\n", blkNode->gtTreeID);
blkNode->ChangeOper(GT_STOREIND);
blkNode->ChangeType(regType);
Expand All @@ -7341,6 +7369,14 @@ bool Lowering::TryTransformStoreObjAsStoreInd(GenTreeBlk* blkNode)
{
assert(src->TypeIs(regType) || src->IsCnsIntOrI() || src->IsCall());
}

#if defined(TARGET_XARCH)
if (varTypeIsSmall(regType) && src->OperIs(GT_IND))
{
src->AsIndir()->SetDontExtend();
}
#endif // TARGET_XARCH

LowerStoreIndirCommon(blkNode->AsStoreInd());
return true;
}
Expand Down