Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove uses of DECODE_RETURN_KIND part of GCInfo #110799

Merged
merged 11 commits into from
Dec 19, 2024
Merged

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Dec 18, 2024

A follow-up to the GC info changes in 9.0.

Since the call sites now have complete liveness info, the part about return value kind[s] recorded in GC info is redundant.
Evidently, with CET enabled, hijacking already has no need for that.

This change removes all the uses of DECODE_RETURN_KIND in hijacking scenarios.

NOTE: The encoder still populates this part of GC info for backward-compat reasons. Decoder skips it.
Once we move forward the minimum R2R version, we can remove the return kind bits from the GC info format. This will save a few bits of GC info for each method and will allow more methods to use "slim" format.

X86: stays on the old plan.

Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

// return values.
virtual unsigned GetFrameAttribs() {
LIMITED_METHOD_DAC_CONTRACT;
return FRAME_ATTR_RESUMABLE; // Treat the next frame as the top frame.
Copy link
Member Author

@VSadov VSadov Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key change.
On non-x86 we can treat a thread self-interrupted synchronously by a hijack the same as a thread interrupted asynchronously by a signal. The difference is the size of the captured context and who captured it.

In theory the hijack could capture the entire context as the OS interrupt mechanisms do, but that would be fairly redundant, so hijack captures only what we could possibly need.

There is a small catch here. If in the future we need to report more registers (say some ABI is extended to return up to 3 GC pointers), we would need to adjust OnHijackTripThread and HijackFrame::UpdateRegDisplay accordingly. If hijack captured the entire context as OS interrupts do, we would not have to adjust. Things would "just work".

I thought about this and, considering that a good portion of context cannot contain GC pointers at call site, not capturing the entire context seems the right approach.

@@ -117,7 +112,7 @@ NESTED_ENTRY RhpGcProbeHijack, _TEXT, NoHandler
ret

LOCAL_LABEL(WaitForGC):
or ecx, DEFAULT_FRAME_SAVE_FLAGS + PTFF_SAVE_RAX + PTFF_SAVE_RDX
mov ecx, DEFAULT_FRAME_SAVE_FLAGS + PTFF_SAVE_RAX + PTFF_SAVE_RDX + PTFF_THREAD_HIJACK
Copy link
Member Author

@VSadov VSadov Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PTFF_THREAD_HIJACK is the NativeAOT couterpart to the HijackFrame.
This is basically a way to mark hijacked frames as leaf/active/top frames.
(as opposed to regular transition frames, which are not leaf frames)

@VSadov VSadov requested a review from jkotas December 19, 2024 03:09
#if defined(TARGET_ARM) || defined(TARGET_ARM64) || defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
DECODE_HAS_TAILCALLS = 0x4000,
DECODE_HAS_TAILCALLS = 0x2000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these flags stay around for compat with SOS (see the comment at the top of the file)?

Copy link
Member Author

@VSadov VSadov Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we change the format it will be a breaking change. Diagnostics repo seems to take a copy of the decoder once that happens.
Otherwise, the flag is just for decoder API, it should not affect anything that does not use the new decoder.

I think this is ok to change, but we can use the old value to be sure. This is an int32, so no shortage of bits.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@VSadov
Copy link
Member Author

VSadov commented Dec 19, 2024

Thanks!!

@VSadov VSadov merged commit ddc7867 into dotnet:main Dec 19, 2024
90 checks passed
@VSadov VSadov deleted the HijackFrame branch December 19, 2024 06:37
@VSadov
Copy link
Member Author

VSadov commented Dec 19, 2024

@jakobbotsch - once this gets merged with runtime async code, we will no longer need to do anything special for the continuation return in HijackFrame::GcScanRoots. However HijackFrame::UpdateRegDisplay will need to be adjusted to include context pointer to AsyncRet register in cases where it is not already included in the set of possible GC returns.

@jakobbotsch
Copy link
Member

@VSadov Nice! Am I also correct in assuming that we'll need to adjust the GC info encoder to also encode the fact that there is a new register that can contain GC refs after a call returns?

@VSadov
Copy link
Member Author

VSadov commented Dec 19, 2024

Am I also correct in assuming that we'll need to adjust the GC info encoder to also encode the fact that there is a new register that can contain GC refs after a call returns?

The encoder records everything that is live and contains GC refs, so we should not need to do anything special.
From GC info point of view hijacking is no different from asynchronously stopping a thread at call site (or, more precisely, at return IP), which should already work and already needs to know about all live GC refs at that location.

@jakobbotsch
Copy link
Member

jakobbotsch commented Dec 19, 2024

Am I also correct in assuming that we'll need to adjust the GC info encoder to also encode the fact that there is a new register that can contain GC refs after a call returns?

The encoder records everything that is live and contains GC refs, so we should not need to do anything special. From GC info point of view hijacking is no different from asynchronously stopping a thread at call site (or, more precisely, at return IP), which should already work and already needs to know about all live GC refs at that location.

So we will not need to make changes in e.g. GcInfoEncoder::DoNotTrackInPartiallyInterruptible?

Comment on lines +137 to +138
lu12i.w $t3, ((DEFAULT_FRAME_SAVE_FLAGS + PTFF_SAVE_R4 + PTFF_SAVE_R5 + PTFF_THREAD_HIJACK) >> 12) & 0xfffff
ori $t3, $t3, (DEFAULT_FRAME_SAVE_FLAGS + PTFF_SAVE_R4 + PTFF_SAVE_R5 + PTFF_THREAD_HIJACK) & 0xfff
Copy link
Member

@am11 am11 Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should PTFF_THREAD_HIJACK be PTFF_THREAD_HIJACK_HI? It should be giving a compiler error because PTFF_THREAD_HIJACK is undefined.
Noticed in #110688 when I was resolving the merge conflicts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this should be PTFF_THREAD_HIJACK_HI or PTFF_THREAD_HIJACK needs to be defined. Either way should work. It would be a trivial fix.

I'd be more worried whether this runs correctly. it should, but I had no way to verify.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shushanhf please check if the main branch build is fixed after changing it to PTFF_THREAD_HIJACK_HI. In riscv64, I have just used PTFF_THREAD_HIJACK_HI, but haven't tested the nativeaot implementation due to other issues we have in that port at the moment.

@VSadov
Copy link
Member Author

VSadov commented Dec 20, 2024

So we will not need to make changes in e.g. GcInfoEncoder::DoNotTrackInPartiallyInterruptible?

@jakobbotsch Good point - DoNotTrackInPartiallyInterruptible is another place where we filter out what cannot contain object refs at return site. We do not have to do that filtering. If something does not contain a GC ref it will not end up in GC info. The filtering is an early-out optimization, and I am not sure it saves a lot, but there is filtering.

Yes, we need to be careful to be sure it does not filter out AsyncRet.

This is the same concern as with HijackFrame::UpdateRegDisplay. Need to be sure that we do not optimize/skip something that actually may contain a GC ref.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants