Align arm64 data section as requested #71044

Merged
merged 3 commits into from
Jun 21, 2022
6 changes: 2 additions & 4 deletions src/coreclr/jit/emit.cpp
@@ -6330,11 +6330,9 @@ unsigned emitter::emitEndCodeGen(Compiler* comp,
}

UNATIVE_OFFSET roDataAlignmentDelta = 0;
Member

Unrelated to your change here, but there is a comment above that reads:

// For arm64/LoongArch64, we want to allocate JIT data always adjacent to code similar to what native compiler does.
// This way allows us to use a single ldr to access such data like float constant/jmp table.
// For LoongArch64 using pcaddi + ld to access such data.

I'm wondering why this is. In particular, x86/x64 explicitly say "don't do this" because it messes with the instruction decoder/cache and can lead to very poor speculative execution, etc.

I would expect Arm64 to have similar limitations and for us to likewise want this data separate from the code. This also includes for other reasons like preventing users from trying to execute "data", etc.

Member Author

It's reasonable to reconsider. However, on arm64, we have limited addressing mode range for data load instructions. If we put the data in a "data section", we would either have to (1) generate pessimistic code to allow the largest possible range, (2) ensure that data section is "close enough" to the code, or (3) optimistically assume the data is "close enough" to the code, and allow a back-off/retry if not.
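
To make the range constraint concrete, here is a minimal sketch of how the reachable addressing form depends on the distance between the instruction and the data. The names `DataAccess` and `chooseDataAccess` are hypothetical and are not part of the JIT. The ranges are the architectural ones: `ldr` (literal) takes a signed 19-bit word offset (about ±1 MiB), and `adrp` takes a signed 21-bit page offset (about ±4 GiB).

```cpp
#include <cstdint>

// Hypothetical classification of how a piece of read-only data could be reached.
enum class DataAccess
{
    LdrLiteral,   // single pc-relative ldr, about +/-1 MiB
    AdrpPlusLoad, // adrp to the 4 KiB page, then ldr/add with a 12-bit offset, about +/-4 GiB
    FullAddress   // materialize the full 64-bit address (the pessimistic form)
};

// Sketch only: picks the cheapest form that can encode the distance.
DataAccess chooseDataAccess(uint64_t instrAddr, uint64_t dataAddr)
{
    const int64_t delta = static_cast<int64_t>(dataAddr - instrAddr);

    // ldr (literal): imm19 scaled by 4, so the byte offset lies in [-2^20, 2^20 - 4].
    const int64_t literalRange = int64_t{1} << 20;
    if ((delta % 4 == 0) && (delta >= -literalRange) && (delta < literalRange))
    {
        return DataAccess::LdrLiteral;
    }

    // adrp: signed 21-bit offset measured in 4 KiB pages.
    const int64_t pageDelta = static_cast<int64_t>(dataAddr >> 12) - static_cast<int64_t>(instrAddr >> 12);
    const int64_t pageRange = int64_t{1} << 20;
    if ((pageDelta >= -pageRange) && (pageDelta < pageRange))
    {
        return DataAccess::AdrpPlusLoad;
    }

    return DataAccess::FullAddress;
}
```

In terms of the options above, (1) corresponds to always emitting the `FullAddress` form, (2) to guaranteeing that the allocator keeps the data within `LdrLiteral` or `AdrpPlusLoad` range, and (3) to assuming a short form and retrying code generation when the check fails.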

Member

Perhaps @TamarChristinaArm or our other friends at ARM could provide input here on what's the recommended/optimal approach and if Arm64 has similar considerations around having data/instructions close together.

Contributor

> I would expect Arm64 to have similar limitations and for us to likewise want this data separate from the code. This also includes for other reasons like preventing users from trying to execute "data", etc.

Indeed we do have similar issues on Arm64 and the NX bits are of particular interest these days. What we try to do in these cases is to create an anchor to the data section, and then subsequent loads just use offsets from the anchor.

Typically we also treat the anchors as cheap to rematerialize, to avoid spilling them around call sites, etc.

If you're doing NX bits you'd have to allocate new pages for the constants anyway, so you could consider getting a page near the code. If you're within the range of an adrp+add, you can use the adrp as the anchor.
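
As a rough illustration of the anchor idea, the sketch below splits a target address into the two pieces that an `adrp` + `add` (or `adrp` + `ldr`) pair would encode: a signed page delta for the `adrp`, and the low 12 bits for the follow-up instruction. `AnchorSplit`, `splitForAdrp`, and `resolve` are hypothetical names for this example; this is address arithmetic only, not an instruction encoder.

```cpp
#include <cstdint>

// Hypothetical pair of values an adrp + add sequence would carry.
struct AnchorSplit
{
    int64_t  pageDelta; // 4 KiB pages between the adrp's page and the target's page
    uint32_t lo12;      // offset of the target within its page
};

// Split `target` relative to the address `pc` of the adrp instruction.
AnchorSplit splitForAdrp(uint64_t pc, uint64_t target)
{
    AnchorSplit split;
    split.pageDelta = static_cast<int64_t>(target >> 12) - static_cast<int64_t>(pc >> 12);
    split.lo12      = static_cast<uint32_t>(target & 0xFFF);
    return split;
}

// Recompute the address the pair would produce: adrp yields the anchor
// (page base of pc plus the page delta), and the add supplies the low 12 bits.
uint64_t resolve(uint64_t pc, const AnchorSplit& split)
{
    const uint64_t anchor = (pc & ~uint64_t{0xFFF}) + static_cast<uint64_t>(split.pageDelta * 4096);
    return anchor + split.lo12;
}
```

Once the anchor is in a register, other constants on the same page differ only in `lo12`, which is why the anchor is cheap to rematerialize rather than spill.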

Member (@tannergooding, Jun 22, 2022)

Thanks for the insight here! I'll log an issue capturing this and ensuring we consider the potential impact longer term.

Member

Logged #71155

- if (emitConsDsc.dsdOffs && (emitConsDsc.alignment == TARGET_POINTER_SIZE))
+ if (emitConsDsc.dsdOffs > 0)
  {
- UNATIVE_OFFSET roDataAlignment = TARGET_POINTER_SIZE; // 8 Byte align by default.
- roDataAlignmentDelta = (UNATIVE_OFFSET)ALIGN_UP(emitTotalHotCodeSize, roDataAlignment) - emitTotalHotCodeSize;
- assert((roDataAlignmentDelta == 0) || (roDataAlignmentDelta == 4));
+ roDataAlignmentDelta = AlignmentPad(emitTotalHotCodeSize, emitConsDsc.alignment);
BruceForstall marked this conversation as resolved.
}

args.hotCodeSize = emitTotalHotCodeSize + roDataAlignmentDelta + emitConsDsc.dsdOffs;
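
The arithmetic behind this hunk can be sketched in isolation. The removed lines only padded to 8 bytes, which for 4-byte arm64 instructions gives a pad of 0 or 4 (hence the old assert). The new line pads to whatever alignment the data section requested. The helper `alignmentPad` below is a hypothetical stand-in written for this example, assuming a power-of-two alignment; it is not the JIT's actual `AlignmentPad` implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: bytes needed to round `offset` up to `alignment`.
// Assumes `alignment` is a nonzero power of two.
uint32_t alignmentPad(uint32_t offset, uint32_t alignment)
{
    assert((alignment != 0) && ((alignment & (alignment - 1)) == 0));
    return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}

// Mirrors the shape of the hot-code-size computation: code, then padding, then data.
// Padding is added only when there is data to place after the code.
uint32_t hotCodeSizeWithData(uint32_t codeSize, uint32_t dataSize, uint32_t dataAlignment)
{
    const uint32_t pad = (dataSize > 0) ? alignmentPad(codeSize, dataAlignment) : 0;
    return codeSize + pad + dataSize;
}
```

For example, 20 bytes of code followed by data requesting 16-byte alignment needs 12 bytes of padding, a case the old 8-byte-only path could not express.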