Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SIMD data overallocation #71043

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions src/coreclr/jit/emit.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7275,7 +7275,12 @@ UNATIVE_OFFSET emitter::emitDataGenBeg(unsigned size, unsigned alignment, var_ty
}

assert((secOffs % alignment) == 0);
emitConsDsc.alignment = max(emitConsDsc.alignment, alignment);
if (emitConsDsc.alignment < alignment)
{
JITDUMP("Increasing data section alignment from %u to %u for type %s\n", emitConsDsc.alignment, alignment,
varTypeName(dataType));
emitConsDsc.alignment = alignment;
}

/* Advance the current offset */
emitConsDsc.dsdOffs += size;
Expand Down Expand Up @@ -7673,7 +7678,7 @@ void emitter::emitOutputDataSec(dataSecDsc* sec, BYTE* dst)

if (emitComp->opts.disAsm)
{
emitDispDataSec(sec);
emitDispDataSec(sec, dst);
}

unsigned secNum = 0;
Expand Down Expand Up @@ -7794,25 +7799,32 @@ void emitter::emitOutputDataSec(dataSecDsc* sec, BYTE* dst)
//
// Arguments:
// section - the data section description
// dst - address of the data section
//
// Notes:
// The output format attempts to mirror typical assembler syntax.
// Data section entries lack type information so float/double entries
// are displayed as if they are integers/longs.
//
void emitter::emitDispDataSec(dataSecDsc* section)
void emitter::emitDispDataSec(dataSecDsc* section, BYTE* dst)
{
printf("\n");

unsigned offset = 0;

for (dataSection* data = section->dsdList; data != nullptr; data = data->dsNext)
{
if (emitComp->opts.disAddr)
{
printf("; @" FMT_ADDR "\n", DBG_ADDR(dst));
}

Comment on lines +7817 to +7821
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an example of what this displayed previously vs what it displays now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set COMPlus_JitDasmWithAddress=1 you'll see:

; @ 000002cc`0d8d4fc4 
RWD00  	dq	0020002000200020h, 0020002000200020h
; @ 000002cc`0d8d4fd4 
RWD16  	dq	006200720065006Bh, 0073006F00720065h
; @ 000002cc`0d8d4fe4 
RWD32  	dq	006F00670065006Eh, 0074006100690074h
; @ 000002cc`0d8d4ff4 
RWD48  	dq	0074006F00670065h, 0065007400610069h

where the ; @ ... lines are new.

Copy link
Member

@tannergooding tannergooding Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd assume those be excluded in DiffableDasm (I've never used JitDasmWithAddress, didn't know it existed 👍)?

Also its worth noting that the address shown there is a multiple of 4, but I would have expected it to be a multiple of 16 (should probably be 000002cc_0d8d4fd0, not 000002cc_0d8d4fc4). Was this done with MIN_DATA_ALIGN or is there a different bug here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set JitDasmWithAddress, you've given up on diffable (and JitDasmWithAddress is normally not set).

As for the multiple of 4 issue: that's what I've fixed with #71044.

const char* labelFormat = "%-7s";
char label[64];
sprintf_s(label, ArrLen(label), "RWD%02u", offset);
printf(labelFormat, label);
offset += data->dsSize;
dst += data->dsSize;

if ((data->dsType == dataSection::blockRelative32) || (data->dsType == dataSection::blockAbsoluteAddr))
{
Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/jit/emit.h
Original file line number Diff line number Diff line change
Expand Up @@ -2555,7 +2555,7 @@ class emitter

void emitOutputDataSec(dataSecDsc* sec, BYTE* dst);
#ifdef DEBUG
void emitDispDataSec(dataSecDsc* section);
void emitDispDataSec(dataSecDsc* section, BYTE* dst);
#endif

/************************************************************************/
Expand Down
11 changes: 8 additions & 3 deletions src/coreclr/jit/lower.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7384,6 +7384,7 @@ void Lowering::LowerSIMD(GenTreeSIMD* simdNode)

if (arg->IsCnsFltOrDbl())
{
noway_assert(constArgCount < ArrLen(constArgValues));
constArgValues[constArgCount] = static_cast<float>(arg->AsDblCon()->gtDconVal);
constArgCount++;
}
Expand All @@ -7396,10 +7397,14 @@ void Lowering::LowerSIMD(GenTreeSIMD* simdNode)
BlockRange().Remove(arg);
}

assert(sizeof(constArgValues) == 16);
// For SIMD12, even though there might be 12 bytes of constants, we need to store 16 bytes of data
// since we've bashed the node the TYP_SIMD16 and do a 16-byte indirection.
assert(varTypeIsSIMD(simdNode));
const unsigned cnsSize = genTypeSize(simdNode);
assert(cnsSize <= sizeof(constArgValues));

unsigned cnsSize = sizeof(constArgValues);
unsigned cnsAlign = (comp->compCodeOpt() != Compiler::SMALL_CODE) ? cnsSize : 1;
const unsigned cnsAlign =
(comp->compCodeOpt() != Compiler::SMALL_CODE) ? cnsSize : emitter::dataSection::MIN_DATA_ALIGN;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I peeked at emitSimd8Const, emitSimd16Const, and emitSimd32Const and they all look to be doing the right thing here, but it might be good for someone else to double check.

This change looks good/correct to me, but it would probably be better if we lowered this to GT_CNS_VEC and went through the new normalized code path. Longer term, it would be good if we could get rid of the last 2-3 usages of SIMDIntrinsicInitN and replace them with SimdAsHWIntrinsic nodes instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it might be good for someone else to double check.

Looks fine to me.


CORINFO_FIELD_HANDLE hnd =
comp->GetEmitter()->emitBlkConst(constArgValues, cnsSize, cnsAlign, simdNode->GetSimdBaseType());
Expand Down