Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Interlocked for small types. #92974

Merged
merged 67 commits into from
Jan 25, 2024
Merged
Show file tree
Hide file tree
Changes from 41 commits
Commits
Show all changes
67 commits
Select commit Hold shift + click to select a range
4344057
Implement Interlocked for small types.
MichalPetryka Oct 3, 2023
d282722
Fix xarch assert
MichalPetryka Oct 3, 2023
4079320
Update PalRedhawkInline.h
MichalPetryka Oct 3, 2023
dae9d66
Update PalRedhawkInline.h
MichalPetryka Oct 4, 2023
4b74dba
Update threads.c
MichalPetryka Oct 4, 2023
c6935de
Simplify NAOT handling
MichalPetryka Oct 4, 2023
095567c
Fix build
MichalPetryka Oct 4, 2023
35a812f
Update atomic.h
MichalPetryka Oct 4, 2023
f55fb45
Update atomic.h
MichalPetryka Oct 4, 2023
b1c6dde
Update atomic.h
MichalPetryka Oct 4, 2023
8e80946
Fix windows intrinsics
MichalPetryka Oct 5, 2023
a25a4a3
Update comutilnative.cpp
MichalPetryka Oct 5, 2023
da399ca
Fix Windows builds
MichalPetryka Oct 5, 2023
79c07ff
Update icall-def.h
MichalPetryka Oct 6, 2023
7da080b
Fix icall-def.h
akoeplinger Oct 6, 2023
37fdee1
Improve tests
MichalPetryka Oct 6, 2023
358c11b
Update InterlockedTests.cs
MichalPetryka Oct 6, 2023
b0f85fa
Remove small type normalization
MichalPetryka Oct 6, 2023
f406c21
Try to fix the intrinsics
MichalPetryka Oct 6, 2023
59dbd5e
Fix ARM64 build
MichalPetryka Oct 6, 2023
42f1c80
Try to fix XArch 66 prefix and ARM64
MichalPetryka Oct 7, 2023
70174d7
Fix typo
MichalPetryka Oct 7, 2023
f108f6d
Fix assert
MichalPetryka Oct 7, 2023
6d09d88
Code cleanup
MichalPetryka Oct 8, 2023
d3fbf3e
Update importercalls.cpp
MichalPetryka Oct 8, 2023
92b7497
Update importercalls.cpp
MichalPetryka Oct 8, 2023
0394523
Extend small types properly
MichalPetryka Oct 8, 2023
1ee1c3e
Add RISC-V asserts
MichalPetryka Oct 9, 2023
064b7b3
Merge remote-tracking branch 'origin/main' into small-interlocked
lambdageek Oct 10, 2023
25a5194
c11 atomics interlocked ops for small types
lambdageek Oct 10, 2023
22407e4
Simplify Windows atomics
MichalPetryka Oct 10, 2023
50893e2
Merge upstream
MichalPetryka Oct 11, 2023
0cc7d92
Restore NativeAOT fallbacks for other platforms
MichalPetryka Oct 11, 2023
868efdf
Merge upstream
MichalPetryka Oct 17, 2023
d8efc0d
Add more tests
MichalPetryka Oct 21, 2023
6c101a4
Format
MichalPetryka Oct 21, 2023
25e3ab4
Update Interlocked.cs
MichalPetryka Oct 21, 2023
3a3e60e
Merge upstream
MichalPetryka Oct 27, 2023
4ca06ff
Fix tests and comment
MichalPetryka Oct 27, 2023
1879669
Merge branch 'main' into small-interlocked
MichalPetryka Nov 3, 2023
5536554
Merge branch 'dotnet:main' into small-interlocked
MichalPetryka Dec 1, 2023
ed90e8b
Add emitOutputCV handling
MichalPetryka Dec 1, 2023
1c2a4cc
Merge
MichalPetryka Dec 10, 2023
fca6d47
Fix extension
MichalPetryka Dec 10, 2023
ca73b06
Use a slightly better fix
MichalPetryka Dec 10, 2023
d067d7f
More complete fix
MichalPetryka Dec 10, 2023
bbbadc0
Fix the fix
MichalPetryka Dec 10, 2023
1b8767b
Fix more places
MichalPetryka Dec 10, 2023
d357407
CR feedback
jkotas Jan 10, 2024
724f66c
Merge branch 'main' into small-interlocked
MichalPetryka Jan 10, 2024
499ea14
Merge remote-tracking branch 'upstream/main' into small-interlocked
MichalPetryka Jan 20, 2024
55f7cc1
Fix corelib
MichalPetryka Jan 20, 2024
00cdcb7
Fix tests
MichalPetryka Jan 20, 2024
382d8ea
Add using
MichalPetryka Jan 20, 2024
1099743
Match NativeAOT #ifs
MichalPetryka Jan 20, 2024
82f6fbe
Update Interlocked.cs
MichalPetryka Jan 20, 2024
1a7c4a2
Fix test projects
MichalPetryka Jan 21, 2024
7f97233
Fix tests, try linking with libatomic
MichalPetryka Jan 21, 2024
ce412f9
Improve build test
MichalPetryka Jan 21, 2024
ec2c380
Fix tests
MichalPetryka Jan 23, 2024
7b8869a
Merge branch 'main' into small-interlocked
MichalPetryka Jan 23, 2024
2acc845
Merge remote-tracking branch 'upstream/main' into small-interlocked
MichalPetryka Jan 24, 2024
ea35c1e
Fix tests again
MichalPetryka Jan 24, 2024
35f840d
Remove libatomic
MichalPetryka Jan 24, 2024
de8f617
Merge remote-tracking branch 'upstream/main' into small-interlocked
MichalPetryka Jan 24, 2024
7af81b3
Cleanup, fix helper extension
MichalPetryka Jan 24, 2024
72e5bf8
Fix test corelib
MichalPetryka Jan 24, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,24 @@ public static long Decrement(ref long location) =>
#endregion

#region Exchange
/// <summary>Sets a 8-bit unsigned integer to a specified value and returns the original value, as an atomic operation.</summary>
/// <param name="location1">The variable to set to the specified value.</param>
/// <param name="value">The value to which the <paramref name="location1"/> parameter is set.</param>
/// <returns>The original value of <paramref name="location1"/>.</returns>
/// <exception cref="NullReferenceException">The address of location1 is a null pointer.</exception>
[Intrinsic]
[MethodImpl(MethodImplOptions.InternalCall)]
public static extern byte Exchange(ref byte location1, byte value);
AaronRobinsonMSFT marked this conversation as resolved.
Show resolved Hide resolved

/// <summary>Sets a 16-bit signed integer to a specified value and returns the original value, as an atomic operation.</summary>
/// <param name="location1">The variable to set to the specified value.</param>
/// <param name="value">The value to which the <paramref name="location1"/> parameter is set.</param>
/// <returns>The original value of <paramref name="location1"/>.</returns>
/// <exception cref="NullReferenceException">The address of location1 is a null pointer.</exception>
[Intrinsic]
[MethodImpl(MethodImplOptions.InternalCall)]
public static extern short Exchange(ref short location1, short value);

/// <summary>Sets a 32-bit signed integer to a specified value and returns the original value, as an atomic operation.</summary>
/// <param name="location1">The variable to set to the specified value.</param>
/// <param name="value">The value to which the <paramref name="location1"/> parameter is set.</param>
Expand Down Expand Up @@ -87,6 +105,26 @@ public static T Exchange<T>([NotNullIfNotNull(nameof(value))] ref T location1, T
#endregion

#region CompareExchange
/// <summary>Compares two 8-bit unsigned integers for equality and, if they are equal, replaces the first value.</summary>
/// <param name="location1">The destination, whose value is compared with <paramref name="comparand"/> and possibly replaced.</param>
/// <param name="value">The value that replaces the destination value if the comparison results in equality.</param>
/// <param name="comparand">The value that is compared to the value at <paramref name="location1"/>.</param>
/// <returns>The original value in <paramref name="location1"/>.</returns>
/// <exception cref="NullReferenceException">The address of <paramref name="location1"/> is a null pointer.</exception>
[Intrinsic]
[MethodImpl(MethodImplOptions.InternalCall)]
public static extern byte CompareExchange(ref byte location1, byte value, byte comparand);

/// <summary>Compares two 16-bit signed integers for equality and, if they are equal, replaces the first value.</summary>
/// <param name="location1">The destination, whose value is compared with <paramref name="comparand"/> and possibly replaced.</param>
/// <param name="value">The value that replaces the destination value if the comparison results in equality.</param>
/// <param name="comparand">The value that is compared to the value at <paramref name="location1"/>.</param>
/// <returns>The original value in <paramref name="location1"/>.</returns>
/// <exception cref="NullReferenceException">The address of <paramref name="location1"/> is a null pointer.</exception>
[Intrinsic]
[MethodImpl(MethodImplOptions.InternalCall)]
public static extern short CompareExchange(ref short location1, short value, short comparand);

/// <summary>Compares two 32-bit signed integers for equality and, if they are equal, replaces the first value.</summary>
/// <param name="location1">The destination, whose value is compared with <paramref name="comparand"/> and possibly replaced.</param>
/// <param name="value">The value that replaces the destination value if the comparison results in equality.</param>
Expand Down
22 changes: 18 additions & 4 deletions src/coreclr/inc/winwrap.h
Original file line number Diff line number Diff line change
Expand Up @@ -254,18 +254,32 @@ WszCreateProcess(
LPPROCESS_INFORMATION lpProcessInformation
);

#if defined(HOST_X86) && defined(_MSC_VER)
#if defined(_MSC_VER)

//
// Windows SDK does not use intrinsics on x86. Redefine the interlocked operations to use intrinsics.
// Redefine the interlocked operations to intrinsics
// to avoid missing InterlockedCompareExchange8 and Windows 8 requirement.
jkotas marked this conversation as resolved.
Show resolved Hide resolved
//

#include "intrin.h"

#define InterlockedIncrement _InterlockedIncrement
#define InterlockedDecrement _InterlockedDecrement
#define InterlockedExchange8 _InterlockedExchange8
#define InterlockedCompareExchange8 _InterlockedCompareExchange8
jkotas marked this conversation as resolved.
Show resolved Hide resolved
#define InterlockedExchange16 _InterlockedExchange16
#define InterlockedCompareExchange16 _InterlockedCompareExchange16
#define InterlockedExchange _InterlockedExchange
#define InterlockedCompareExchange _InterlockedCompareExchange

#endif // _MSC_VER

#if defined(HOST_X86) && defined(_MSC_VER)

//
// Windows SDK does not use intrinsics on x86. Redefine the interlocked operations to use intrinsics.
jkotas marked this conversation as resolved.
Show resolved Hide resolved
//

#define InterlockedIncrement _InterlockedIncrement
#define InterlockedDecrement _InterlockedDecrement
#define InterlockedExchangeAdd _InterlockedExchangeAdd
#define InterlockedCompareExchange64 _InterlockedCompareExchange64
#define InterlockedAnd _InterlockedAnd
Expand Down
81 changes: 69 additions & 12 deletions src/coreclr/jit/codegenarm64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3800,6 +3800,8 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
genConsumeAddress(addr);
genConsumeRegs(data);

assert(treeNode->OperIs(GT_XCHG) || !varTypeIsSmall(treeNode->TypeGet()));

emitAttr dataSize = emitActualTypeSize(data);

if (compiler->compOpportunisticallyDependsOn(InstructionSet_Atomics))
Expand All @@ -3822,8 +3824,19 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
break;
}
case GT_XCHG:
GetEmitter()->emitIns_R_R_R(INS_swpal, dataSize, dataReg, targetReg, addrReg);
{
instruction ins = INS_swpal;
if (varTypeIsByte(treeNode->TypeGet()))
{
ins = INS_swpalb;
}
else if (varTypeIsShort(treeNode->TypeGet()))
{
ins = INS_swpalh;
}
GetEmitter()->emitIns_R_R_R(ins, dataSize, dataReg, targetReg, addrReg);
break;
}
case GT_XADD:
GetEmitter()->emitIns_R_R_R(INS_ldaddal, dataSize, dataReg, (targetReg == REG_NA) ? REG_ZR : targetReg,
addrReg);
Expand Down Expand Up @@ -3882,8 +3895,21 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
BasicBlock* labelRetry = genCreateTempLabel();
genDefineTempLabel(labelRetry);

instruction insLd = INS_ldaxr;
instruction insSt = INS_stlxr;
if (varTypeIsByte(treeNode->TypeGet()))
{
insLd = INS_ldaxrb;
insSt = INS_stlxrb;
}
else if (varTypeIsShort(treeNode->TypeGet()))
{
insLd = INS_ldaxrh;
insSt = INS_stlxrh;
}

// The following instruction includes a acquire half barrier
GetEmitter()->emitIns_R_R(INS_ldaxr, dataSize, loadReg, addrReg);
GetEmitter()->emitIns_R_R(insLd, dataSize, loadReg, addrReg);

switch (treeNode->OperGet())
{
Expand All @@ -3909,7 +3935,7 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
}

// The following instruction includes a release half barrier
GetEmitter()->emitIns_R_R_R(INS_stlxr, dataSize, exResultReg, storeDataReg, addrReg);
GetEmitter()->emitIns_R_R_R(insSt, dataSize, exResultReg, storeDataReg, addrReg);

GetEmitter()->emitIns_J_R(INS_cbnz, EA_4BYTE, labelRetry, exResultReg);

Expand All @@ -3918,8 +3944,14 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
gcInfo.gcMarkRegSetNpt(addr->gtGetRegMask());
}

if (treeNode->GetRegNum() != REG_NA)
if (targetReg != REG_NA)
{
if (varTypeIsSmall(treeNode->TypeGet()) && varTypeIsSigned(treeNode->TypeGet()))
{
instruction mov = varTypeIsShort(treeNode->TypeGet()) ? INS_sxth : INS_sxtb;
GetEmitter()->emitIns_Mov(mov, EA_4BYTE, targetReg, targetReg, /* canSkip */ false);
}

genProduceReg(treeNode);
}
}
Expand Down Expand Up @@ -3947,18 +3979,27 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
genConsumeRegs(data);
genConsumeRegs(comparand);

emitAttr dataSize = emitActualTypeSize(data);

if (compiler->compOpportunisticallyDependsOn(InstructionSet_Atomics))
{
emitAttr dataSize = emitActualTypeSize(data);

// casal use the comparand as the target reg
GetEmitter()->emitIns_Mov(INS_mov, dataSize, targetReg, comparandReg, /* canSkip */ true);

// Catch case we destroyed data or address before use
noway_assert((addrReg != targetReg) || (targetReg == comparandReg));
noway_assert((dataReg != targetReg) || (targetReg == comparandReg));

GetEmitter()->emitIns_R_R_R(INS_casal, dataSize, targetReg, dataReg, addrReg);
instruction ins = INS_casal;
if (varTypeIsByte(treeNode->TypeGet()))
{
ins = INS_casalb;
}
else if (varTypeIsShort(treeNode->TypeGet()))
{
ins = INS_casalh;
}
GetEmitter()->emitIns_R_R_R(ins, dataSize, targetReg, dataReg, addrReg);
}
else
{
Expand Down Expand Up @@ -3992,9 +4033,6 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)

gcInfo.gcMarkRegPtrVal(addrReg, addr->TypeGet());

// TODO-ARM64-CQ Use ARMv8.1 atomics if available
// https://github.com/dotnet/runtime/issues/8225

// Emit code like this:
// retry:
// ldxr targetReg, [addrReg]
Expand All @@ -4009,8 +4047,21 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
BasicBlock* labelCompareFail = genCreateTempLabel();
genDefineTempLabel(labelRetry);

instruction insLd = INS_ldaxr;
instruction insSt = INS_stlxr;
if (varTypeIsByte(treeNode->TypeGet()))
{
insLd = INS_ldaxrb;
insSt = INS_stlxrb;
}
else if (varTypeIsShort(treeNode->TypeGet()))
{
insLd = INS_ldaxrh;
insSt = INS_stlxrh;
}

// The following instruction includes a acquire half barrier
GetEmitter()->emitIns_R_R(INS_ldaxr, emitTypeSize(treeNode), targetReg, addrReg);
GetEmitter()->emitIns_R_R(insLd, dataSize, targetReg, addrReg);

if (comparand->isContainedIntOrIImmed())
{
Expand All @@ -4032,7 +4083,7 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
}

// The following instruction includes a release half barrier
GetEmitter()->emitIns_R_R_R(INS_stlxr, emitTypeSize(treeNode), exResultReg, dataReg, addrReg);
GetEmitter()->emitIns_R_R_R(insSt, dataSize, exResultReg, dataReg, addrReg);

GetEmitter()->emitIns_J_R(INS_cbnz, EA_4BYTE, labelRetry, exResultReg);

Expand All @@ -4043,6 +4094,12 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
gcInfo.gcMarkRegSetNpt(addr->gtGetRegMask());
}

if (varTypeIsSmall(treeNode->TypeGet()) && varTypeIsSigned(treeNode->TypeGet()))
{
instruction mov = varTypeIsShort(treeNode->TypeGet()) ? INS_sxth : INS_sxtb;
GetEmitter()->emitIns_Mov(mov, EA_4BYTE, targetReg, targetReg, /* canSkip */ false);
}

genProduceReg(treeNode);
}

Expand Down
3 changes: 3 additions & 0 deletions src/coreclr/jit/codegenriscv64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2611,6 +2611,8 @@ void CodeGen::genJumpTable(GenTree* treeNode)
//
void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
{
assert(!varTypeIsSmall(treeNode->TypeGet()));

GenTree* data = treeNode->AsOp()->gtOp2;
GenTree* addr = treeNode->AsOp()->gtOp1;
regNumber dataReg = data->GetRegNum();
Expand Down Expand Up @@ -2664,6 +2666,7 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)
void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
{
assert(treeNode->OperIs(GT_CMPXCHG));
assert(!varTypeIsSmall(treeNode->TypeGet()));

GenTree* locOp = treeNode->Addr();
GenTree* valOp = treeNode->Data();
Expand Down
26 changes: 22 additions & 4 deletions src/coreclr/jit/codegenxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4353,14 +4353,15 @@ void CodeGen::genCodeForLockAdd(GenTreeOp* node)
void CodeGen::genLockedInstructions(GenTreeOp* node)
{
assert(node->OperIs(GT_XADD, GT_XCHG));
assert(node->OperIs(GT_XCHG) || !varTypeIsSmall(node->TypeGet()));

GenTree* addr = node->gtGetOp1();
GenTree* data = node->gtGetOp2();
emitAttr size = emitTypeSize(node->TypeGet());

assert(addr->isUsedFromReg());
assert(data->isUsedFromReg());
assert((size == EA_4BYTE) || (size == EA_PTRSIZE) || (size == EA_GCREF));
assert((size <= EA_PTRSIZE) || (size == EA_GCREF));

genConsumeOperands(node);

Expand All @@ -4378,7 +4379,15 @@ void CodeGen::genLockedInstructions(GenTreeOp* node)
instGen(INS_lock);
}

GetEmitter()->emitIns_AR_R(ins, size, node->GetRegNum(), addr->GetRegNum(), 0);
regNumber targetReg = node->GetRegNum();
GetEmitter()->emitIns_AR_R(ins, size, targetReg, addr->GetRegNum(), 0);

if (varTypeIsSmall(node->TypeGet()))
{
instruction mov = varTypeIsSigned(node->TypeGet()) ? INS_movsx : INS_movzx;
GetEmitter()->emitIns_Mov(mov, size, targetReg, targetReg, /* canSkip */ false);
}

genProduceReg(node);
}

Expand All @@ -4394,6 +4403,7 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* tree)

var_types targetType = tree->TypeGet();
regNumber targetReg = tree->GetRegNum();
emitAttr size = emitTypeSize(tree->TypeGet());

GenTree* location = tree->Addr(); // arg1
GenTree* value = tree->Data(); // arg2
Expand All @@ -4414,10 +4424,18 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* tree)
// location is Rm
instGen(INS_lock);

GetEmitter()->emitIns_AR_R(INS_cmpxchg, emitTypeSize(targetType), value->GetRegNum(), location->GetRegNum(), 0);
GetEmitter()->emitIns_AR_R(INS_cmpxchg, size, value->GetRegNum(), location->GetRegNum(), 0);

// Result is in RAX
inst_Mov(targetType, targetReg, REG_RAX, /* canSkip */ true);
if (varTypeIsSmall(tree->TypeGet()))
{
instruction mov = varTypeIsSigned(tree->TypeGet()) ? INS_movsx : INS_movzx;
GetEmitter()->emitIns_Mov(mov, size, targetReg, REG_RAX, /* canSkip */ false);
}
else
{
inst_Mov(targetType, targetReg, REG_RAX, /* canSkip */ true);
}

genProduceReg(tree);
}
Expand Down
15 changes: 13 additions & 2 deletions src/coreclr/jit/emitxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4045,10 +4045,11 @@ UNATIVE_OFFSET emitter::emitInsSizeAM(instrDesc* id, code_t code)

assert((attrSize == EA_4BYTE) || (attrSize == EA_PTRSIZE) // Only for x64
|| (attrSize == EA_16BYTE) || (attrSize == EA_32BYTE) || (attrSize == EA_64BYTE) // only for x64
|| (ins == INS_movzx) || (ins == INS_movsx)
|| (ins == INS_movzx) || (ins == INS_movsx) || (ins == INS_cmpxchg)
// The prefetch instructions are always 3 bytes and have part of their modr/m byte hardcoded
|| isPrefetch(ins));
size = 3;

size = (attrSize == EA_2BYTE) && (ins == INS_cmpxchg) ? 4 : 3;
}
else
{
Expand Down Expand Up @@ -12564,6 +12565,11 @@ BYTE* emitter::emitOutputAM(BYTE* dst, instrDesc* id, code_t code, CnsVal* addc)
}
else if (code & 0x00FF0000)
{
if ((size == EA_2BYTE) && (ins == INS_cmpxchg))
{
dst += emitOutputByte(dst, 0x66);
}

// BT supports 16 bit operands and this code doesn't handle the necessary 66 prefix.
assert(ins != INS_bt);

Expand Down Expand Up @@ -13384,6 +13390,11 @@ BYTE* emitter::emitOutputSV(BYTE* dst, instrDesc* id, code_t code, CnsVal* addc)
}
else if (code & 0x00FF0000)
{
if ((size == EA_2BYTE) && (ins == INS_cmpxchg))
{
dst += emitOutputByte(dst, 0x66);
}

// BT supports 16 bit operands and this code doesn't add the necessary 66 prefix.
assert(ins != INS_bt);

Expand Down
Loading
Loading