[RISC-V] Transfer arguments between calling conventions in shuffling thunks #107282

Merged
merged 66 commits into from
Oct 9, 2024

tomeksowi
Contributor

This PR properly implements the transfer of arguments between the integer and hardware floating-point calling conventions (i.e. lowering and delowering) in shuffling thunks. The previous implementation fell short in several ways:

  • used fld (8-byte) to shuffle float from stack to register, which is wrong because the stack slot is not NaN-boxed
  • did not support FP structs passed in one stack slot
  • did not respect size and offsets of FP struct fields in general
  • did not support shuffling stack slots to the right (when the delowered FP struct was larger than the lowered one but there was another lowering)
  • did not support the case when the shuffling thunk must allocate stack space (the above case but without the second lowering)
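
To illustrate the first bullet: on RISC-V with the D extension, `flw` writes a NaN-boxed value (upper 32 bits all ones) into the 64-bit FP register, while `fld` copies all 8 bytes of the slot verbatim. The sketch below models that difference on plain integers. It is not code from this PR; `FpReg`, `load_flw`, `load_fld`, `is_nan_boxed` and `make_stack_slot` are hypothetical names, and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of the raw bits of a 64-bit RISC-V FP register.
using FpReg = std::uint64_t;

// flw semantics: load 4 bytes and NaN-box them (upper 32 bits all ones).
inline FpReg load_flw(const unsigned char* slot)
{
    std::uint32_t bits;
    std::memcpy(&bits, slot, sizeof bits);
    return 0xFFFFFFFF00000000ull | bits;
}

// fld semantics: load all 8 bytes verbatim, whatever the upper half holds.
inline FpReg load_fld(const unsigned char* slot)
{
    std::uint64_t bits;
    std::memcpy(&bits, slot, sizeof bits);
    return bits;
}

// A register holds a valid single-precision value only if it is NaN-boxed.
inline bool is_nan_boxed(FpReg reg)
{
    return (reg >> 32) == 0xFFFFFFFFu;
}

// Build an 8-byte stack slot with `value` in the low 4 bytes and arbitrary
// `upper` bits in the rest (assumes a little-endian host, as on RISC-V).
inline void make_stack_slot(float value, std::uint32_t upper, unsigned char (&slot)[8])
{
    std::memcpy(slot, &value, 4);
    std::memcpy(slot + 4, &upper, 4);
}
```

With a slot whose upper half is zero, `load_flw` yields a NaN-boxed register and `load_fld` does not, which is the failure mode the bullet describes.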

The new implementation replaces the omnibus loop in StubLinkerCPU::EmitShuffleThunk with tighter case-by-case loops. That way, shuffling entirely within the integer calling convention (the vast majority of cases) stays simple, and the more involved cases with calling convention transfers are handled completely while still not needing all-out graph sorting.
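
As a rough illustration of why the integer-only case is simple: when only the delegate object in a0 is dropped, every remaining integer argument moves one position to the left, and a single forward pass never overwrites a value before it is read. The sketch below simulates this on arrays. It is an assumption-laden model, not the emitted thunk; `IntArgs` and `shuffle_int_only` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of integer argument locations at thunk entry.
struct IntArgs
{
    std::array<std::int64_t, 8> a{};  // a0..a7
    std::vector<std::int64_t> stack;  // stack0, stack1, ...
};

// Drop a0 and shift every other integer argument one position left.
// Iterating forward is safe because each destination has already been read.
inline void shuffle_int_only(IntArgs& args)
{
    for (std::size_t i = 0; i + 1 < args.a.size(); ++i)
        args.a[i] = args.a[i + 1];       // a1 -> a0, ..., a7 -> a6

    if (args.stack.empty())
        return;                          // nothing on the stack; a7 is left stale

    args.a.back() = args.stack.front();  // stack0 -> a7
    args.stack.erase(args.stack.begin()); // stack[i + 1] -> stack[i]
}
```

The cases with calling convention transfers are harder because a struct can change how many registers and stack slots it occupies, so later arguments no longer shift by a fixed amount.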

Note: this PR is about correctness; optimization TODOs will not be pursued here.

Stems from #101796, part of #84834, cc @dotnet/samsung

…nstead of an omnibus loop handling ShuffleEntries
…pStructs. EmptyStructs test passes except for ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV
…he key points first, which simplifies code and solves some corner cases e.g. where we can't assume struct stack size by checking the size + offset of the last field
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 3, 2024
@clamp03 clamp03 added the arch-riscv Related to the RISC-V architecture label Sep 3, 2024
@risc-vv

risc-vv commented Oct 9, 2024

7fe3c4e is being scheduled for building and testing

GIT: 7fe3c4e280859c42f84e3d469a1390425994c8b3
REPO: dotnet/runtime
BRANCH: main

Release-CLR-build FAILED

[buildinfo.json](https://gist.githubusercontent.com/risc-vv/6411da95b7ae685527b4b3b006589c09/raw/5b5e8e923238ab703922c2d51656bee8b3df98ed/7fe3c4e2_buildinfo.json)

Compilation failed during coreclr-tests build

@risc-vv

risc-vv commented Oct 9, 2024

659a9e4 is being scheduled for building and testing

GIT: 659a9e4a4327e17a78389dd1146da5a2b09688c5
REPO: dotnet/runtime
BRANCH: main

Release-CLR-build FAILED

buildinfo.json

Compilation failed during coreclr-tests build

@risc-vv

risc-vv commented Oct 9, 2024

ccf1e4e is being scheduled for building and testing

GIT: ccf1e4e41dd453d443542f8db6dbe453baf20859
REPO: dotnet/runtime
BRANCH: main

Release-build FAILED

buildinfo.json

Compilation failed during core build

@am11
Member

am11 commented Oct 9, 2024

src/tasks/installer.tasks/installer.tasks.csproj(0,0): error NU1903: (NETCORE_ENGINEERING_TELEMETRY=Restore) Warning As Error: Package 'System.Text.Json' 8.0.4 has a known high severity vulnerability, GHSA-8g4q-xg66-9fp4

Not sure why it has started to fail just now, but the fix is to update this to 8.0.5

<SystemTextJsonToolsetVersion>8.0.4</SystemTextJsonToolsetVersion>

@risc-vv

risc-vv commented Oct 9, 2024

e45db16 is being scheduled for building and testing

GIT: e45db16aef6380df6d04b570049a8d3cee2f8611
REPO: dotnet/runtime
BRANCH: main

Release-CLR-build FAILED

buildinfo.json

Compilation failed during coreclr-tests build

@tomeksowi
Contributor Author

@sirntar Our internal CI will probably fail again on crossgen2_publish.csproj after #107772. We need to add -p:UsePublishedCrossgen2=false to src/tests/build.sh or wait for #108693 to merge.

Member

@jkotas jkotas left a comment

Thank you!

@jkotas jkotas merged commit 4f60693 into dotnet:main Oct 9, 2024
94 of 96 checks passed
@tomeksowi
Contributor Author

@LuckyXu-HF @shushanhf LoongArch also needs this.

@am11
Member

am11 commented Oct 10, 2024

Thanks! Regarding the preference for IL thunks in certain contexts vs. hand-written assembly in others: is this decision typically driven by specific performance characteristics or other context-dependent factors? Or is there a broader goal to gradually move away from hand-rolled assembly in coreclr/vm?

@jkotas
Member

jkotas commented Oct 10, 2024

Regarding the preference for IL thunks in certain contexts vs. hand-written assembly in others; is this decision typically driven by specific performance characteristic

Yes, most of the remaining hand-generated assembly thunks exist in CoreCLR for performance reasons, both startup performance and throughput.

Some of the performance reasons may be historic. For example, GenerateInitPInvokeFrameHelper is a perf micro-optimization for x86/arm32 only. It is not clear whether it is still beneficial and whether the perf benefit (if there is any) is worth the extra complexity.

@shushanhf
Contributor

@LuckyXu-HF @shushanhf LoongArch also needs this.

OK, Thanks

@LuckyXu-HF
Contributor

@LuckyXu-HF @shushanhf LoongArch also needs this.

OK, Thanks

Thank you! We will invest in this PR later.
BTW could you please share some tests which can cover this PR?

@tomeksowi
Contributor Author

tomeksowi commented Oct 10, 2024

Thank you! We will invest in this PR later.

I think it'll be pretty simple. Most of it will be just enabling the RISC-V #ifdefs for LoongArch and trimming down StubLinkerCPU::EmitShuffleThunk like on RISC-V. I think you already have the necessary Emit* methods implemented.

BTW could you please share some tests which can cover this PR?

I included them in this PR

#region ShufflingThunks_RiscVTests
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_EmptyFloatEmpty5Byte_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5, float fa6,
EmptyFloatEmpty5Byte stack0_stack1_to_fa7_a7,
int stack2_to_stack0, float fa7_to_stack1)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(6f, fa6);
Assert.Equal(EmptyFloatEmpty5Byte.Get(), stack0_stack1_to_fa7_a7);
Assert.Equal(7, stack2_to_stack0);
Assert.Equal(7f, fa7_to_stack1);
}
[Fact]
public static void Test_ShufflingThunk_EmptyFloatEmpty5Byte_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_EmptyFloatEmpty5Byte_RiscV;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, 6f,
EmptyFloatEmpty5Byte.Get(), 7, 7f);
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_EmptyFloatEmpty5Sbyte_Empty8Float_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5, float fa6,
EmptyFloatEmpty5Sbyte stack0_stack1_to_fa7_a7,
int stack2_to_stack0,
Empty8Float fa7_to_stack1_stack2)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(6f, fa6);
Assert.Equal(EmptyFloatEmpty5Sbyte.Get(), stack0_stack1_to_fa7_a7);
Assert.Equal(7, stack2_to_stack0);
Assert.Equal(Empty8Float.Get(), fa7_to_stack1_stack2);
}
[Fact]
public static void Test_ShufflingThunk_EmptyFloatEmpty5Sbyte_Empty8Float_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_EmptyFloatEmpty5Sbyte_Empty8Float_RiscV;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, 6f,
EmptyFloatEmpty5Sbyte.Get(), 7, Empty8Float.Get());
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_EmptyUshortAndDouble_FloatEmpty8Float_Empty8Float_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
EmptyUshortAndDouble stack0_stack1_to_a7_fa6, // 1st lowering
FloatEmpty8Float fa6_fa7_to_stack0_stack1, // delowering
Empty8Float stack1_stack2_to_fa7, // 2nd lowering
int stack3_to_stack2)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(EmptyUshortAndDouble.Get(), stack0_stack1_to_a7_fa6);
Assert.Equal(FloatEmpty8Float.Get(), fa6_fa7_to_stack0_stack1);
Assert.Equal(Empty8Float.Get(), stack1_stack2_to_fa7);
Assert.Equal(7, stack3_to_stack2);
}
[Fact]
public static void Test_ShufflingThunk_EmptyUshortAndDouble_FloatEmpty8Float_Empty8Float_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_EmptyUshortAndDouble_FloatEmpty8Float_Empty8Float_RiscV;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f,
EmptyUshortAndDouble.Get(), FloatEmpty8Float.Get(), Empty8Float.Get(), 7);
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_Float_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
FloatEmptyShort stack0_to_fa6_a7, // frees 1 stack slot
int stack1_to_stack0,
DoubleFloatNestedEmpty fa6_fa7_to_stack1_stack2, // takes 2 stack slots
int stack2_to_stack3, // shuffle stack slots to the right
int stack3_to_stack4,
Empty8Float stack4_stack5_to_fa7, // frees 2 stack slots
int stack6_to_stack5, // shuffle stack slots to the left
int stack7_to_stack6)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(FloatEmptyShort.Get(), stack0_to_fa6_a7);
Assert.Equal(7, stack1_to_stack0);
Assert.Equal(DoubleFloatNestedEmpty.Get(), fa6_fa7_to_stack1_stack2);
Assert.Equal(8, stack2_to_stack3);
Assert.Equal(9, stack3_to_stack4);
Assert.Equal(Empty8Float.Get(), stack4_stack5_to_fa7);
Assert.Equal(10, stack6_to_stack5);
Assert.Equal(11, stack7_to_stack6);
}
[Fact]
public static void Test_ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_Float_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_Float_RiscV;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f,
FloatEmptyShort.Get(), 7, DoubleFloatNestedEmpty.Get(), 8, 9, Empty8Float.Get(), 10, 11);
}
public struct FloatFloat
{
public float Float0;
public float Float1;
public static FloatFloat Get()
=> new FloatFloat { Float0 = 2.71828f, Float1 = 1.61803f };
public override bool Equals(object other)
=> other is FloatFloat o && Float0 == o.Float0 && Float1 == o.Float1;
public override string ToString()
=> $"{{Float0:{Float0}, Float1:{Float1}}}";
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_PackedEmptyFloatLong_FloatFloat_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
PackedEmptyFloatLong stack0_stack1_to_fa7_a7,
int stack2_to_stack0,
FloatFloat fa6_fa7_to_stack1)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(PackedEmptyFloatLong.Get(), stack0_stack1_to_fa7_a7);
Assert.Equal(7, stack2_to_stack0);
Assert.Equal(FloatFloat.Get(), fa6_fa7_to_stack1);
}
[Fact]
public static void Test_ShufflingThunk_PackedEmptyFloatLong_FloatFloat_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_PackedEmptyFloatLong_FloatFloat_RiscV;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f,
PackedEmptyFloatLong.Get(), 7, FloatFloat.Get());
}
[StructLayout(LayoutKind.Sequential, Pack=1)]
public struct PackedEmptyUintEmptyFloat
{
public Empty Empty0;
public uint Uint0;
public Empty Empty1;
public float Float0;
public static PackedEmptyUintEmptyFloat Get()
=> new PackedEmptyUintEmptyFloat { Uint0 = 0xB1ed0c1e, Float0 = 2.71828f };
public override bool Equals(object other)
=> other is PackedEmptyUintEmptyFloat o && Uint0 == o.Uint0 && Float0 == o.Float0;
public override string ToString()
=> $"{{Uint0:{Uint0}, Float0:{Float0}}}";
}
[StructLayout(LayoutKind.Sequential, Pack=1)]
public struct PackedEmptyDouble
{
public Empty Empty0;
public double Double0;
public static PackedEmptyDouble Get()
=> new PackedEmptyDouble { Double0 = 1.61803 };
public override bool Equals(object other)
=> other is PackedEmptyDouble o && Double0 == o.Double0;
public override string ToString()
=> $"{{Double0:{Double0}}}";
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_PackedEmptyUintEmptyFloat_PackedEmptyDouble(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1,
PackedEmptyUintEmptyFloat stack0_stack1_to_a7_fa2,
float fa2_to_fa3, float fa3_to_fa4, float fa4_to_fa5,
int stack2_to_stack0,
float fa5_to_fa6, float fa6_to_fa7,
PackedEmptyDouble fa7_to_stack1_stack2)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(PackedEmptyUintEmptyFloat.Get(), stack0_stack1_to_a7_fa2);
Assert.Equal(2f, fa2_to_fa3);
Assert.Equal(3f, fa3_to_fa4);
Assert.Equal(4f, fa4_to_fa5);
Assert.Equal(7, stack2_to_stack0);
Assert.Equal(5f, fa5_to_fa6);
Assert.Equal(6f, fa6_to_fa7);
Assert.Equal(PackedEmptyDouble.Get(), fa7_to_stack1_stack2);
}
[Fact]
public static void Test_ShufflingThunk_PackedEmptyUintEmptyFloat_PackedEmptyDouble()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_PackedEmptyUintEmptyFloat_PackedEmptyDouble;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f,
PackedEmptyUintEmptyFloat.Get(), 2f, 3f, 4f, 7, 5f, 6f, PackedEmptyDouble.Get());
}
public struct FloatFloatEmpty
{
public FloatFloat FloatFloat0;
public Empty Empty0;
public static FloatFloatEmpty Get()
=> new FloatFloatEmpty { FloatFloat0 = FloatFloat.Get() };
public override bool Equals(object other)
=> other is FloatFloatEmpty o && FloatFloat0.Equals(o.FloatFloat0);
public override string ToString()
=> $"{{FloatFloat0:{FloatFloat0}}}";
}
public struct FloatEmpty8
{
public float Float0;
public Eight<Empty> EightEmpty0;
public static FloatEmpty8 Get()
=> new FloatEmpty8 { Float0 = 2.71828f };
public override bool Equals(object other)
=> other is FloatEmpty8 o && Float0 == o.Float0;
public override string ToString()
=> $"{{Float0:{Float0}}}";
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_FloatEmptyShort_FloatFloatEmpty_FloatEmpty8(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
FloatEmptyShort stack0_to_fa6_a7, // frees 1 stack slot
int stack1_to_stack0,
FloatFloatEmpty fa6_fa7_to_stack1_stack2, // takes 2 stack slots
int stack2_to_stack3, // shuffle stack slots to the right
int stack3_to_stack4,
FloatEmpty8 stack4_stack5_to_fa7, // frees 2 stack slots
int stack6_to_stack5, // shuffle stack slots to the left
int stack7_to_stack6)
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(FloatEmptyShort.Get(), stack0_to_fa6_a7);
Assert.Equal(7, stack1_to_stack0);
Assert.Equal(FloatFloatEmpty.Get(), fa6_fa7_to_stack1_stack2);
Assert.Equal(8, stack2_to_stack3);
Assert.Equal(9, stack3_to_stack4);
Assert.Equal(FloatEmpty8.Get(), stack4_stack5_to_fa7);
Assert.Equal(10, stack6_to_stack5);
Assert.Equal(11, stack7_to_stack6);
}
[Fact]
public static void Test_ShufflingThunk_FloatEmptyShort_FloatFloatEmpty_FloatEmpty8()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_FloatEmptyShort_FloatFloatEmpty_FloatEmpty8;
getDelegate()(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f,
FloatEmptyShort.Get(), 7, FloatFloatEmpty.Get(), 8, 9, FloatEmpty8.Get(), 10, 11);
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static DoubleFloatNestedEmpty ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
FloatEmptyShort stack0_to_fa6_a7, // frees 1 stack slot
int stack1_to_stack0,
DoubleFloatNestedEmpty fa6_fa7_to_stack1_stack2, // takes 2 stack slots
int stack2_to_stack3, // shuffle stack slots to the right
int stack3_to_stack4) // shuffling thunk must grow the stack
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(FloatEmptyShort.Get(), stack0_to_fa6_a7);
Assert.Equal(7, stack1_to_stack0);
Assert.Equal(DoubleFloatNestedEmpty.Get(), fa6_fa7_to_stack1_stack2);
Assert.Equal(8, stack2_to_stack3);
Assert.Equal(9, stack3_to_stack4);
return fa6_fa7_to_stack1_stack2;
}
[Fact]
public static void Test_ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV;
var delegat = getDelegate();
Span<int> stackBeforeCall = stackalloc[] {11, 22, 33, 44};
DoubleFloatNestedEmpty ret = delegat(0, 1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f,
FloatEmptyShort.Get(), 7, DoubleFloatNestedEmpty.Get(), 8, 9);
Assert.Equal([11, 22, 33, 44], stackBeforeCall);
Assert.Equal(DoubleFloatNestedEmpty.Get(), ret);
}
class EverythingIsFineException : Exception {}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void ShufflingThunk_FloatEmptyShort_Empty8Float_RiscV(
int a1_to_a0, int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0,
FloatEmptyShort stack0_to_fa1_a7, // frees 1 stack slot
double fa1_to_fa2,
double fa2_to_fa3,
byte stack1_to_stack0,
short stack2_to_stack1,
double fa3_to_fa4,
float fa4_to_fa5,
int stack3_to_stack2,
float fa5_to_fa6,
double fa6_to_fa7,
long stack4_to_stack3,
Empty8Float fa7_to_stack4_stack5) // takes 2 stack slots, shuffling thunk must grow the stack
{
Assert.Equal(0, a1_to_a0);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(FloatEmptyShort.Get(), stack0_to_fa1_a7);
Assert.Equal(1d, fa1_to_fa2);
Assert.Equal(2d, fa2_to_fa3);
Assert.Equal(7, stack1_to_stack0);
Assert.Equal(8, stack2_to_stack1);
Assert.Equal(3d, fa3_to_fa4);
Assert.Equal(4f, fa4_to_fa5);
Assert.Equal(9, stack3_to_stack2);
Assert.Equal(5f, fa5_to_fa6);
Assert.Equal(6d, fa6_to_fa7);
Assert.Equal(10, stack4_to_stack3);
Assert.Equal(Empty8Float.Get(), fa7_to_stack4_stack5);
throw new EverythingIsFineException(); // see if we can walk out of the stack frame laid by the shuffle thunk
}
[Fact]
public static void Test_ShufflingThunk_FloatEmptyShort_Empty8Float_RiscV()
{
var getDelegate = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> ShufflingThunk_FloatEmptyShort_Empty8Float_RiscV;
var delegat = getDelegate();
Assert.Throws<EverythingIsFineException>(() =>
delegat(0, 1, 2, 3, 4, 5, 6, 0f,
FloatEmptyShort.Get(), 1d, 2d, 7, 8, 3d, 4f, 9, 5f, 6d, 10, Empty8Float.Get())
);
}
public struct UintFloat
{
public uint Uint0;
public float Float0;
public static UintFloat Get()
=> new UintFloat { Uint0 = 0xB1ed0c1e, Float0 = 2.71828f };
public override bool Equals(object other)
=> other is UintFloat o && Uint0 == o.Uint0 && Float0 == o.Float0;
public override string ToString()
=> $"{{Uint0:{Uint0}, Float0:{Float0}}}";
}
public struct LongDoubleInt
{
public long Long0;
public double Double0;
public int Int0;
public static LongDoubleInt Get()
=> new LongDoubleInt { Long0 = 0xDadAddedC0ffee, Double0 = 3.14159, Int0 = 0xBabc1a };
public override bool Equals(object other)
=> other is LongDoubleInt o && Long0 == o.Long0 && Double0 == o.Double0 && Int0 == o.Int0;
public override string ToString()
=> $"{{Long:{Long0}, Double0:{Double0}, Int0:{Int0}}}";
}
class ShufflingThunk_MemberGrowsStack_RiscV
{
public static ShufflingThunk_MemberGrowsStack_RiscV TestInstance =
new ShufflingThunk_MemberGrowsStack_RiscV();
public delegate FloatEmpty8Float TestDelegate(
ShufflingThunk_MemberGrowsStack_RiscV _this,
int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1); // takes 2 stack slots, shuffling thunk must grow the stack
[MethodImpl(MethodImplOptions.NoInlining)]
public FloatEmpty8Float TestMethod(
int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1) // takes 2 stack slots, shuffling thunk must grow the stack
{
return StaticTestMethod(this,
a2_to_a1, a3_to_a2, a4_to_a3, a5_to_a4, a6_to_a5, a7_to_a6,
fa0, fa1, fa2, fa3, fa4, fa5,
stack0_to_a7_fa6,
fa6_fa7_to_stack0_stack1);
}
public static FloatEmpty8Float StaticTestMethod(
ShufflingThunk_MemberGrowsStack_RiscV _this,
int a2_to_a1, int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1) // takes 2 stack slots, shuffling thunk must grow the stack
{
Assert.Equal(TestInstance, _this);
Assert.Equal(1, a2_to_a1);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(UintFloat.Get(), stack0_to_a7_fa6);
Assert.Equal(DoubleFloatNestedEmpty.Get(), fa6_fa7_to_stack0_stack1);
return FloatEmpty8Float.Get();
}
}
[Fact]
public static void Test_ShufflingThunk_MemberGrowsStack_RiscV()
{
var delegat = (ShufflingThunk_MemberGrowsStack_RiscV.TestDelegate)Delegate.CreateDelegate(
typeof(ShufflingThunk_MemberGrowsStack_RiscV.TestDelegate), null,
typeof(ShufflingThunk_MemberGrowsStack_RiscV).GetMethod(
nameof(ShufflingThunk_MemberGrowsStack_RiscV.TestMethod))
);
FloatEmpty8Float ret = delegat(ShufflingThunk_MemberGrowsStack_RiscV.TestInstance,
1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, UintFloat.Get(), DoubleFloatNestedEmpty.Get());
Assert.Equal(FloatEmpty8Float.Get(), ret);
var getStaticMethod = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> (ShufflingThunk_MemberGrowsStack_RiscV.TestDelegate)
ShufflingThunk_MemberGrowsStack_RiscV.StaticTestMethod;
delegat = getStaticMethod();
ret = delegat(ShufflingThunk_MemberGrowsStack_RiscV.TestInstance,
1, 2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, UintFloat.Get(), DoubleFloatNestedEmpty.Get());
Assert.Equal(FloatEmpty8Float.Get(), ret);
}
class ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV
{
public static ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV TestInstance =
new ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV();
public delegate LongDoubleInt TestDelegate(
ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV _this,
// ReturnBuffer* a2_to_a0
int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1); // takes 2 stack slots, shuffling thunk must grow the stack
[MethodImpl(MethodImplOptions.NoInlining)]
public LongDoubleInt TestMethod(
// ReturnBuffer* a2_to_a0
int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1) // takes 2 stack slots, shuffling thunk must grow the stack
{
return StaticTestMethod(this,
a3_to_a2, a4_to_a3, a5_to_a4, a6_to_a5, a7_to_a6,
fa0, fa1, fa2, fa3, fa4, fa5,
stack0_to_a7_fa6,
fa6_fa7_to_stack0_stack1);
}
[MethodImpl(MethodImplOptions.NoInlining)]
public static LongDoubleInt StaticTestMethod(
ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV _this,
// ReturnBuffer* a2_to_a0
int a3_to_a2, int a4_to_a3, int a5_to_a4, int a6_to_a5, int a7_to_a6,
float fa0, float fa1, float fa2, float fa3, float fa4, float fa5,
UintFloat stack0_to_a7_fa6, // frees 1 stack slot
DoubleFloatNestedEmpty fa6_fa7_to_stack0_stack1) // takes 2 stack slots, shuffling thunk must grow the stack
{
Assert.Equal(TestInstance, _this);
Assert.Equal(2, a3_to_a2);
Assert.Equal(3, a4_to_a3);
Assert.Equal(4, a5_to_a4);
Assert.Equal(5, a6_to_a5);
Assert.Equal(6, a7_to_a6);
Assert.Equal(0f, fa0);
Assert.Equal(1f, fa1);
Assert.Equal(2f, fa2);
Assert.Equal(3f, fa3);
Assert.Equal(4f, fa4);
Assert.Equal(5f, fa5);
Assert.Equal(UintFloat.Get(), stack0_to_a7_fa6);
Assert.Equal(DoubleFloatNestedEmpty.Get(), fa6_fa7_to_stack0_stack1);
return LongDoubleInt.Get(); // via return buffer
}
}
[Fact]
public static void Test_ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV()
{
var delegat = (ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestDelegate)Delegate.CreateDelegate(
typeof(ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestDelegate), null,
typeof(ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV).GetMethod(
nameof(ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestMethod))
);
LongDoubleInt ret = delegat(ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestInstance,
2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, UintFloat.Get(), DoubleFloatNestedEmpty.Get());
Assert.Equal(LongDoubleInt.Get(), ret);
var getStaticMethod = [MethodImpl(MethodImplOptions.NoOptimization)] ()
=> (ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestDelegate)
ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.StaticTestMethod;
delegat = getStaticMethod();
ret = delegat(ShufflingThunk_MemberGrowsStack_ReturnBuffer_RiscV.TestInstance,
2, 3, 4, 5, 6, 0f, 1f, 2f, 3f, 4f, 5f, UintFloat.Get(), DoubleFloatNestedEmpty.Get());
Assert.Equal(LongDoubleInt.Get(), ret);
}
#endregion
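
The slot bookkeeping in the test comments above ("frees 1 stack slot", "takes 2 stack slots", "shuffling thunk must grow the stack") can be sketched as a signed sum over the calling convention transfers in a signature. This is a simplified model under the assumption that only the net slot count decides whether the stack must grow; `SlotDelta` and `extra_stack_slots` are hypothetical names, not functions from the runtime.

```cpp
#include <cassert>
#include <vector>

// Signed change in stack slots caused by one argument transfer:
// negative when a lowering moves a struct from the stack into registers,
// positive when a delowering moves a struct from registers onto the stack.
using SlotDelta = int;

// Returns how many extra stack slots the target needs beyond what the
// incoming argument area provides, or 0 if the existing area is enough.
inline int extra_stack_slots(const std::vector<SlotDelta>& transfers)
{
    int net = 0;
    for (SlotDelta delta : transfers)
        net += delta;
    return net > 0 ? net : 0;
}
```

For example, `{-1, +2}` models ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV (one slot freed, two taken, so the stack grows by one), while `{-1, +2, -2}` models the variant with a second lowering, where the slots only shuffle right and then left within the existing area.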

rzikm pushed a commit to rzikm/dotnet-runtime that referenced this pull request Oct 11, 2024
…thunks (dotnet#107282)

* Log ShuffleEntries from GenerateShuffleArray

* Add failing tests for passing FP structs through shuffling thunks

* Report proper ShuffleEntries for lowered FP structs

* Implement shuffling thunk generation in tighter, more focused loops instead of an omnibus loop handling ShuffleEntries

* Generate ShuffleEntries for delowered arguments

* Better ShuffleEntry mask names, one more bit for field offset

* Fix FpStruct for dst arg loc

* Fold ShuffleEntry generation code for lowering and delowering FpStructs

* ShuffleEntry mask doc update

* Implement forward shuffling of floating registers and delowering of FpStructs. EmptyStructs test passes except for ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV

* Fix shuffling of integer registers for member functions

* The delowered argument can also be put in the first stack slot

* Stack shuffling after delowered argument doesn't start with 0. Fixes Regressions/coreclr/GitHub_16833/Test16833

* Code cleanup, fewer indents

* Support second lowering

* Remove unused CondCode

* Handle stack slot shuffling to the right

* Add some stack slots to shuffle in the growing stack test case

* Fix Equals signature on test structs

* Remodel the shuffling with calling convention transfer to recognize the key points first, which simplifies code and solves some corner cases e.g. where we can't assume struct stack size by checking the size + offset of the last field

* Use helper functions in EmitComputedInstantiatingMethodStub

* Implement stack growing in shuffling thunk

* Use signed immediate in EmitSubImm to be consistent with EmitAddImm

* Use ABI register names in logs

* Remove LoadRegPair because it's not used

* Add logging for slli and lui

* Remove stack growing from hand-emitted shuffle thunks

* Minor FloatFloatEmpty test cleanup

* Implement IL shuffling thunks for cases where the stack is growing

* Test stack walking in frames laid by the IL shuffle thunk

* Add assert and comment in CreateILDelegateShuffleThunk

* Fix release build

* Fixes for static delegates from member methods

* Fix log and comment

* Remove EmitJumpAndLinkRegister because it's no longer used

* Use TransferredField.reg in delowering (cosmetic fix to restart CI)

* New stub type for delegate shuffle thunk so it doesn't go in multidelegate code paths

* Make Test_ShufflingThunk_MemberGrowsStack_RiscV harder by returning via buffer

* Explain lowering

* Fold 12-bit sign extension branch in EmitMovConstant

* Harden Test_ShufflingThunk_PackedEmptyUintEmptyFloat_PackedEmptyDouble to cover interleaving FP and int arguments

* Handle shuffles between calling conventions in IL stubs

* Update comments

* Don't use NewStub for IL stubs

* Fold some more duplicated code into SetupShuffleThunk

* Clean up unnecessary diffs

* IL shuffle thunk takes target function address from delegate object. Cache the generated thunk on DelegateEEClass

* Build target signature based on delegate signature instead of just using the signature from target method to retain type context

* Test calling instance and static methods via the same delegate type

* Simplify shuffle thunk caching on DelegateEEClass

* Clean up CreateILDelegateShuffleThunk

* Delete Windows X86 stack size check

* Remove #ifdefs around ILSTUB_DELEGATE_SHUFFLE_THUNK, fix typo in GetStubMethodName

* Fix DecRef'ing path when the IL thunk is already cached on DelegateEEClass

* Fix shuffle thunk destruction in EEClass::Destruct: properly handle IL shuffle thunks and call RemoveStubRange if m_pInstRetBuffCallStub was deleted

* Don't use RemoveStubRange in the destructor, make code for dereferencing shuffle thunk when caching fails the same as destructor

* Remove unused RemoveStubRange

* Cover IL shuffle thunks in ILStubManager::TraceManager

* Remove unused start, end arguments from RangeList::RemoveRanges[Worker]

* Update src/coreclr/vm/comdelegate.cpp

---------

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
@LuckyXu-HF
Contributor

Hi @tomeksowi , could you please help to share the Prolog of MethodHash=88c18db0 with DOTNET_TieredCompilation=0 of EmptyStructs.sh under Debug mode on RISCV64? Thanks very much.

@tomeksowi
Contributor Author

Hi @tomeksowi , could you please help to share the Prolog of MethodHash=88c18db0 with DOTNET_TieredCompilation=0 of EmptyStructs.sh under Debug mode on RISCV64? Thanks very much.

G_M29263_IG01:        ; offs=0x000000, size=0x004C, bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, nogc <-- Prolog IG
IN026a: 000000      addi           sp, sp, -80
IN026b: 000004      sd             fp, 16(sp)
IN026c: 000008      sd             ra, 24(sp)
IN026d: 00000C      sd             s1, 32(sp)
IN026e: 000010      sd             s2, 40(sp)
IN026f: 000014      sd             s3, 48(sp)
IN0270: 000018      sd             s4, 56(sp)
IN0271: 00001C      sd             s5, 64(sp)
IN0272: 000020      addi           fp, sp, 16
IN0273: 000024      addi           t0, sp, 80
IN0274: 000028      sd             t0, 56(fp)
IN0275: 00002C      fsw            f10, -8(fp)
IN0276: 000030      sh             a1, -2(fp)
IN0277: 000034      fsw            f11, -16(fp)
IN0278: 000038      sh             a3, -10(fp)
IN0279: 00003C      mv             s1, a0
                             ; gcrRegs +[s1]
IN027a: 000040      mv             s2, a2
                             ; gcrRegs +[s2]
IN027b: 000044      mv             s4, a4
                             ; gcrRegs +[s4]
IN027c: 000048      mv             s3, a5
                             ; byrRegs +[s3]
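
In the prolog above, the pair `fsw f10, -8(fp)` / `sh a1, -2(fp)` writes a float and a 16-bit value into the same 8-byte home slot, at byte offsets 0 and 6 respectively; the `f11` / `a3` pair does the same one slot lower. A minimal sketch of that store pattern follows. It only models the byte layout; `FloatShortLayout` and `store_to_home` are hypothetical names, and the listing does not say which struct type these homes belong to.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout of a struct passed as one FP register plus one
// integer register under the hardware floating-point calling convention.
struct FloatShortLayout
{
    unsigned floatOffset;  // byte offset of the float field in the slot
    unsigned shortOffset;  // byte offset of the 16-bit field in the slot
};

// Mirror of the fsw + sh pair: write both register values into an
// 8-byte home slot at the field offsets given by the layout.
inline void store_to_home(const FloatShortLayout& layout, float fpReg,
                          std::int16_t intReg, unsigned char (&home)[8])
{
    assert(layout.floatOffset + sizeof fpReg <= sizeof home);
    assert(layout.shortOffset + sizeof intReg <= sizeof home);
    std::memcpy(home + layout.floatOffset, &fpReg, sizeof fpReg);   // fsw
    std::memcpy(home + layout.shortOffset, &intReg, sizeof intReg); // sh
}
```

Calling `store_to_home({0, 6}, ...)` reproduces the offsets seen in the listing and leaves bytes 4 and 5 of the slot untouched.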

Labels
arch-riscv Related to the RISC-V architecture area-VM-coreclr community-contribution Indicates that the PR has been added by a community member
9 participants