[JIT] Enable EGPRs in JIT by adding REX2 encoding to the backend. #106557

Merged
merged 41 commits into from
Dec 17, 2024
1820567
Ruihan: POC with REX2
Ruihan-Yin Mar 25, 2024
d1afc68
resolve comments
Ruihan-Yin May 17, 2024
2335aa3
refactor register encoding for REX2
Ruihan-Yin May 20, 2024
6578c58
merge REX2 path to legacy path
Ruihan-Yin May 21, 2024
01eeb80
Enable REX2 in more instructions.
Ruihan-Yin May 30, 2024
690aee3
Avoid repeatedly estimate the size of REX2 prefix
Ruihan-Yin Jun 3, 2024
31d7fb4
Enable REX2 encoding on RI and SV path
Ruihan-Yin Jun 5, 2024
a995878
Add rex2 support to rotate and shift.
Ruihan-Yin Jun 6, 2024
74aacf6
CR session.
Ruihan-Yin Jun 7, 2024
c330927
Testing infra updates: assert REX2 is enabled.
Ruihan-Yin Jun 11, 2024
fbf20d1
revert rcl_N and rcr_N, tp and latency data for these instructions is…
Ruihan-Yin Jun 11, 2024
ea02e70
partially enable REX2 on emitOutputAM, case covered: R_AR and AR_R.
Ruihan-Yin Jun 12, 2024
c74b801
Adding unit tests.
Ruihan-Yin Jun 13, 2024
34980b4
push, pop, inc, dec, neg, not, xadd, shld, shrd, cmpxchg, setcc, bswap.
Ruihan-Yin Jun 26, 2024
2ffdbeb
bug fix for bswap
Ruihan-Yin Jun 27, 2024
3a729bb
bt
Ruihan-Yin Jun 28, 2024
d943b03
xchg, idiv
Ruihan-Yin Jul 1, 2024
c8fee9c
Make sure add REX2 prefix if register encoding for EGPRs are being ca…
Ruihan-Yin Jul 2, 2024
6ec0e97
Ensure code size is correctly computed in R_R_I path.
Ruihan-Yin Jul 8, 2024
1d01003
clean up
Ruihan-Yin Jul 9, 2024
1acc219
Change all AddSimdPrefix to AddX86Prefix
Ruihan-Yin Jul 15, 2024
87ad443
div, mulEAX
Ruihan-Yin Jul 16, 2024
bb9905a
filter out test from REX2 encoding when using ACC form.
Ruihan-Yin Jul 19, 2024
86083b2
Make sure REX prefix will not be added when emitting with REX2.
Ruihan-Yin Jul 24, 2024
dfe8760
resolve comments.
Ruihan-Yin Aug 5, 2024
64761cd
make sure the APX debug knob is only available under debug build.
Ruihan-Yin Oct 24, 2024
f1aba62
clean up some out-dated code.
Ruihan-Yin Nov 12, 2024
f5cc5a8
enable movsxd
Ruihan-Yin Nov 12, 2024
7ca8433
Enable "Call"
Ruihan-Yin Nov 13, 2024
bc4d225
Enable "JMP"
Ruihan-Yin Nov 15, 2024
deb3814
resolve merge errors
Ruihan-Yin Nov 18, 2024
0d63230
formatting
Ruihan-Yin Nov 18, 2024
13b8076
remote coredistools.dll for internal tests only
Ruihan-Yin Nov 18, 2024
42c6cfc
bug fix
Ruihan-Yin Nov 19, 2024
2e2eb01
resolve comments
Ruihan-Yin Nov 20, 2024
3d298b7
add more emitter tests.
Ruihan-Yin Nov 22, 2024
25a54d3
resolve comments.
Ruihan-Yin Dec 2, 2024
791b505
clean up some comments and tweak the REX2 stress logic
Ruihan-Yin Dec 4, 2024
094e76b
clean up
Ruihan-Yin Dec 4, 2024
6502ae1
formatting.
Ruihan-Yin Dec 4, 2024
5d3cca2
resolve comments.
Ruihan-Yin Dec 16, 2024
Enable REX2 encoding on RI and SV path
 - SV path is mostly for debugging purposes

Added encoding unit tests for instructions with immediates
Ruihan-Yin committed Nov 19, 2024
commit 31d7fb4201ff100ed7d350c0cab168c8a0e8f267
17 changes: 17 additions & 0 deletions src/coreclr/jit/codegenxarch.cpp
@@ -9068,6 +9068,7 @@ void CodeGen::genAmd64EmitterUnitTestsApx()
     theEmitter->emitComp->JitStressRex2Encoding(true);

     theEmitter->emitIns_R_R(INS_add, EA_4BYTE, REG_EAX, REG_ECX);
+    theEmitter->emitIns_R_R(INS_add, EA_2BYTE, REG_EAX, REG_ECX);
     theEmitter->emitIns_R_R(INS_or, EA_4BYTE, REG_EAX, REG_ECX);
     theEmitter->emitIns_R_R(INS_adc, EA_4BYTE, REG_EAX, REG_ECX);
     theEmitter->emitIns_R_R(INS_sbb, EA_4BYTE, REG_EAX, REG_ECX);
@@ -9088,6 +9089,22 @@ void CodeGen::genAmd64EmitterUnitTestsApx()
     theEmitter->emitIns_R_R(INS_popcnt, EA_4BYTE, REG_EAX, REG_ECX);
     theEmitter->emitIns_R_R(INS_lzcnt, EA_4BYTE, REG_EAX, REG_ECX);
     theEmitter->emitIns_R_R(INS_tzcnt, EA_4BYTE, REG_EAX, REG_ECX);
+
+    theEmitter->emitIns_R_I(INS_add, EA_4BYTE, REG_ECX, 0x05);
+    theEmitter->emitIns_R_I(INS_add, EA_2BYTE, REG_ECX, 0x05);
+    theEmitter->emitIns_R_I(INS_or, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_adc, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_sbb, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_and, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_sub, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_xor, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_cmp, EA_4BYTE, REG_EAX, 0x05);
+    theEmitter->emitIns_R_I(INS_test, EA_4BYTE, REG_EAX, 0x05);
+
+    theEmitter->emitIns_R_I(INS_mov, EA_4BYTE, REG_EAX, 0xE0);
+
+    // JIT tend to compress imm64 to imm32 if higher half is all-zero, make sure this test checks the path for imm64.
+    theEmitter->emitIns_R_I(INS_mov, EA_8BYTE, REG_RAX, 0xFFFF000000000000);
 }

 #endif // defined(DEBUG) && defined(TARGET_AMD64)
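
The comment on the last `mov` test is about immediate-size selection. As a rough illustration only (this is not the JIT's actual logic), the sketch below classifies a 64-bit constant by the smallest x64 `mov` immediate form that can represent it. `MovImmForm` and `classifyMovImm` are hypothetical names introduced here for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; they do not exist in the JIT sources.
enum class MovImmForm
{
    ZeroExtendedImm32, // mov r32, imm32: writing a 32-bit register zeroes the upper half
    SignExtendedImm32, // mov r/m64, imm32: the imm32 is sign-extended to 64 bits
    Imm64              // mov r64, imm64: full 8-byte immediate
};

MovImmForm classifyMovImm(uint64_t value)
{
    // Upper 32 bits all zero: a 4-byte immediate with implicit zero extension is enough.
    if ((value >> 32) == 0)
    {
        return MovImmForm::ZeroExtendedImm32;
    }

    // Upper 33 bits all one: the value is a negative number that fits in a signed imm32.
    if ((value >> 31) == 0x1FFFFFFFFull)
    {
        return MovImmForm::SignExtendedImm32;
    }

    // Anything else needs the 8-byte immediate form.
    return MovImmForm::Imm64;
}
```

Under this classification, `0xE0` falls into the zero-extended imm32 form, while `0xFFFF000000000000` fits neither 4-byte form, which is consistent with the test choosing that constant to reach the imm64 path.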
51 changes: 46 additions & 5 deletions src/coreclr/jit/emitxarch.cpp
@@ -2884,13 +2884,31 @@ unsigned emitter::emitGetAdjustedSize(instrDesc* id, code_t code) const
         // The 4-Byte SSE instructions require one additional byte to hold the ModRM byte
         adjustedSize++;
     }
-    else
+    else if (IsRex2EncodableInstruction(ins))
     {
-        if(TakesRex2Prefix(id))
+        unsigned prefixAdjustedSize = 0;
+        if (TakesRex2Prefix(id))
         {
-            adjustedSize += 2;
+            prefixAdjustedSize = 2;
+            // If the opcode will be prefixed by REX2, then all the map-1-legacy instructions can remove the escape prefix
+            if(IsLegacyMap1(code))
+            {
+                prefixAdjustedSize -= 1;
+            }
         }

+        adjustedSize = prefixAdjustedSize;
+
+        emitAttr attr = id->idOpSize();
+
+        if ((attr == EA_2BYTE) && (ins != INS_movzx) && (ins != INS_movsx))
+        {
+            // Most 16-bit operand instructions will need a 0x66 prefix.
+            adjustedSize++;
+        }
+    }
+    else
+    {
         if (ins == INS_crc32)
         {
             // Adjust code size for CRC32 that has 4-byte opcode but does not use SSE38 or EES3A encoding.
@@ -2930,7 +2948,7 @@ unsigned emitter::emitGetPrefixSize(instrDesc* id, code_t code, bool includeRexP
         return emitGetVexPrefixSize(id);
     }

-    if (IsRex2EncodableInstruction(id->idIns()) && hasRex2Prefix(code))
+    if (hasRex2Prefix(code))
     {
         return 2;
     }
@@ -14221,6 +14239,12 @@ BYTE* emitter::emitOutputSV(BYTE* dst, instrDesc* id, code_t code, CnsVal* addc)
     // Therefore, add VEX or EVEX prefix if one is not already present.
     code = AddSimdPrefixIfNeededAndNotPresent(id, code, size);

+    if(TakesRex2Prefix(id))
+    {
+        // There are some callers who already add prefix and call this routine.
+        code = hasRex2Prefix(code) ? code : AddRex2Prefix(ins, code);
+    }
+
     // Compute the REX prefix
     if (TakesRexWPrefix(id))
     {
@@ -15989,6 +16013,11 @@ BYTE* emitter::emitOutputRI(BYTE* dst, instrDesc* id)
         // This is INS_mov and will not take VEX prefix
         assert(!TakesVexPrefix(ins));

+        if(TakesRex2Prefix(id))
+        {
+            code = AddRex2Prefix(ins, code);
+        }
+
         if (TakesRexWPrefix(id))
         {
             code = AddRexWPrefix(id, code);
@@ -16097,7 +16126,14 @@ BYTE* emitter::emitOutputRI(BYTE* dst, instrDesc* id)
         else
         {
             code = insCodeMI(ins);
-            code = AddSimdPrefixIfNeeded(id, code, size);
+            if(TakesRex2Prefix(id))
+            {
+                code = AddRex2Prefix(ins, code);
+            }
+            else
+            {
+                code = AddSimdPrefixIfNeeded(id, code, size);
+            }
             code = insEncodeMIreg(id, reg, size, code);
         }
     }
@@ -18111,6 +18147,11 @@ size_t emitter::emitOutputInstr(insGroup* ig, instrDesc* id, BYTE** dp)
             code = insEncodeReg3456(id, id->idReg1(), size, code);
         }

+        if (TakesRex2Prefix(id))
+        {
+            code = AddRex2Prefix(ins, code);
+        }
+
         regcode = (insEncodeReg345(id, id->idReg1(), size, &code) << 8);
         dst = emitOutputSV(dst, id, code | regcode);
     }
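
To make the size accounting in the `emitGetAdjustedSize` hunk concrete, here is a minimal standalone sketch. It assumes the REX2 layout described in the Intel APX specification (a `0xD5` byte followed by one payload byte holding `M0 R4 X4 B4 W R3 X3 B3`, most significant bit first) and mirrors the hunk's arithmetic: 2 bytes for REX2, minus 1 when the `M0` bit replaces the `0x0F` escape of a legacy map-1 opcode, plus 1 for a `0x66` operand-size prefix. `Rex2Fields`, `makeRex2Payload`, `writeRex2Prefix`, and `adjustedPrefixSize` are hypothetical names, not the emitter's API, which works on `instrDesc` and `code_t` instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of the JIT emitter.
struct Rex2Fields
{
    bool m0;         // opcode is in legacy map 1, so no separate 0x0F escape byte is emitted
    bool r4, x4, b4; // extra high bits of ModRM.reg, SIB.index, and ModRM.rm / SIB.base
    bool w;          // 64-bit operand size
    bool r3, x3, b3; // same role as the classic REX.R, REX.X, REX.B bits
};

const uint8_t kRex2Escape = 0xD5;

// Pack the eight fields into the REX2 payload byte (M0 is bit 7, B3 is bit 0).
uint8_t makeRex2Payload(const Rex2Fields& f)
{
    return static_cast<uint8_t>((f.m0 << 7) | (f.r4 << 6) | (f.x4 << 5) | (f.b4 << 4) |
                                (f.w << 3) | (f.r3 << 2) | (f.x3 << 1) | (f.b3 << 0));
}

// Write the two prefix bytes and return how many bytes were written.
unsigned writeRex2Prefix(uint8_t* dst, const Rex2Fields& f)
{
    dst[0] = kRex2Escape;
    dst[1] = makeRex2Payload(f);
    return 2;
}

// Size delta relative to the bare opcode, following the arithmetic in the hunk.
// The caller is assumed to have already excluded movzx/movsx from is16BitOperand.
unsigned adjustedPrefixSize(bool takesRex2, bool isLegacyMap1, bool is16BitOperand)
{
    unsigned size = 0;
    if (takesRex2)
    {
        size = 2; // 0xD5 plus payload
        if (isLegacyMap1)
        {
            size -= 1; // the 0x0F escape byte is folded into the M0 bit
        }
    }
    if (is16BitOperand)
    {
        size += 1; // 0x66 operand-size prefix
    }
    return size;
}
```

Under these assumptions, a 16-bit instruction that takes REX2 and is not in map 1 grows by 3 bytes, while a map-1 instruction with REX2 and a 32-bit operand grows by only 1.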