-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISC-V] Transfer arguments between calling conventions in shuffling thunks #107282
Conversation
…nstead of an omnibus loop handling ShuffleEntries
…pStructs. EmptyStructs test passes except for ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV
…Regressions/coreclr/GitHub_16833/Test16833
…he key points first, which simplifies code and solves some corner cases e.g. where we can't assume struct stack size by checking the size + offset of the last field
…ing shuffle thunk when caching fails the same as destructor
7fe3c4e is being scheduled for building and testingGIT: Release-CLR-build FAILEDCompilation failed during coreclr-tests build```
</details><details>
<summary>Release-CLR-build FAILED</summary>
[buildinfo.json](https://gist.githubusercontent.com/risc-vv/6411da95b7ae685527b4b3b006589c09/raw/5b5e8e923238ab703922c2d51656bee8b3df98ed/7fe3c4e2_buildinfo.json)
```bash
Compilation failed during coreclr-tests build```
</details> |
659a9e4 is being scheduled for building and testingGIT: |
ccf1e4e is being scheduled for building and testingGIT: |
Not sure why it has started to fail just now, but the fix is to update this to 8.0.5 Line 139 in abde3f9
|
e45db16 is being scheduled for building and testingGIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@LuckyXu-HF @shushanhf LoongArch also needs this. |
Thanks! Regarding the preference for IL thunks in certain contexts vs. hand-written assembly in others; is this decision typically driven by specific performance characteristics or other context-dependent factors? Or is there a broader goal to gradually move away from hand-rolled assembly in the coreclr/vm? |
Yes, most of the remaining hand-generated assembly thunks exist in CoreCLR for performance reasons, both startup performance and throughput. Some of the performance reasons may be historic. For example, |
OK, Thanks |
Thank you! We will invest in this PR later. |
I think it'll be pretty simple, most of it will be just enabling RISC-V #ifdefs for LoongArch and trimming down StubLinkerCPU::EmitShuffleThunk like on RISC-V, I think you already have the necessary Emit* methods implemented.
I included them in this PR runtime/src/tests/JIT/Directed/StructABI/EmptyStructs.cs Lines 1585 to 2232 in e32148a
|
…thunks (dotnet#107282) * Log ShuffleEntries from GenerateShuffleArray * Add failing tests for passing FP structs through shuffling thunks * Report proper ShuffleEntries for lowered FP structs * Implement shuffling thunk generation in tighter, more focused loops instead of an omnibus loop handling ShuffleEntries * Generate ShuffleEntries for delowered arguments * Better ShuffleEntry mask names, one more bit for field offset * Fix FpStruct for dst arg loc * Fold ShuffleEntry generation code for lowering and delowering FpStructs * ShuffleEntry mask doc update * Implement forward shuffling of floating registers and delowering of FpStructs. EmptyStructs test passes except for ShufflingThunk_FloatEmptyShort_DoubleFloatNestedEmpty_RiscV * Fix shuffling of integer registers for member functions * The delowered argument can also be put in the first stack slot * Stask shuffling after delowered argument doesn't start with 0. Fixes Regressions/coreclr/GitHub_16833/Test16833 * Code cleanup, fewer indents * Support second lowering * Remove unused CondCode * Handle stack slot shuffling to the right * Add some stack slots to shuffle in the growing stack test case * Fix Equals signature on test structs * Remodel the shuffling with calling convention transfer to recognize the key points first, which simplifies code and solves some corner cases e.g. where we can't assume struct stack size by checking the size + offset of the last field * Use helper functions in EmitComputedInstantiatingMethodStub * Implement stack growing in shuffling thunk * Use signed immediate in EmitSubImm to be consistent with EmitAddImm * Use ABI register names in logs * Remove LoadRegPair because it's not used * Add logging for slli and lui * Remove stack growing from hand-emitted shuffle thunks * Minor FloatFloatEmpty test cleanup * Implement IL shuffling thunks for cases where the stack is growing * Test stack walking in frames laid by the IL shuffle thunk * Add assert and comment in CreateILDelegateShuffleThunk * Fix release build * Fixes for static delegates from member methods * Fix log and comment * Remove EmitJumpAndLinkRegister because it's no longer used * Use TransferredField.reg in delowering (cosmetic fix to restart CI) * New stub type for delegate shuffle thunk so it doesn't go in multidelegate code paths * Make Test_ShufflingThunk_MemberGrowsStack_RiscV harder by returning via buffer * Explain lowering * Fold 12-bit sign extension branch in EmitMovConstant * Harden Test_ShufflingThunk_PackedEmptyUintEmptyFloat_PackedEmptyDouble to cover interleaving FP and int arguments * Handle shuffles between calling conventions in IL stubs * Update comments * Don't use NewStub for IL stubs * Fold some more duplicated code into SetupShuffleThunk * Clean up unnecessary diffs * IL shuffle thunk takes target function address from delegate object. Cache the generated thunk on DelegateEEClass * Build target signature based on delegate signature instead of just using the signature from target method to retain type context * Test calling instance and static methods via the same delegate type * Simplify shuffle thunk caching on DelegateEEClass * Clean up CreateILDelegateShuffleThunk * Delete Windows X86 stack size check * Remove #ifdefs around ILSTUB_DELEGATE_SHUFFLE_THUNK, fix typo in GetStubMethodName * Fix DecRef'ing path when the IL thunk is already cached on DelegateEEClass * Fix shuffle thunk destruction in EEClass::Destruct: properly handle IL shuffle thunks and call RemoveStubRange if m_pInstRetBuffCallStub was deleted * Don't use RemoveStubRange in the destructor, make code for dereferencing shuffle thunk when caching fails the same as destructor * Remove unused RemoveStubRange * Cover IL shuffle thunks in ILStubManager::TraceManager * Remove unused start, end arguments from RangeList::RemoveRanges[Worker] * Update src/coreclr/vm/comdelegate.cpp --------- Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Hi @tomeksowi , could you please help to share the Prolog of |
|
Properly implements the transfer of arguments between integer and hardware floating-point calling conventions (i.e. lowering and delowering) in shuffling thunks. The hitherto implementation was lacking in several ways:
float
from stack to register, which is wrong because the stack slot is not NaN-boxedThe new implementation does away with the omnibus loop in StubLinkerCPU::EmitShuffleThunk in favor of tighter case-by-case loops. That way shuffling all within integer calling convention (vast majority of cases) is simple and the more involved cases with calling convention transfers are complete while still not needing all-out graph sorting.
Note: this PR is about correctness, optimization TODOs will not be pursued here.
Stems from #101796, part of #84834, cc @dotnet/samsung