Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Keep delegate object alive during invoke #105099

Closed
wants to merge 34 commits into from

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jul 18, 2024

A quick fix for #105082, we just insert a KeepAlive node for the delegate object, diffWithGC for the snippet in the issue:

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 18, 2024
@EgorBo EgorBo changed the title Keep this alive fptr JIT: Keep delegate object alive during invoke Jul 18, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

// Keep delegate alive
GenTree* baseClone = comp->gtCloneExpr(base);
GenTree* keepBaseAlive = comp->gtNewKeepAliveNode(baseClone);
BlockRange().InsertAfter(call, baseClone, keepBaseAlive);
Copy link
Member

@jkotas jkotas Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fix going to introduce CQ regressions? Ideally, the fix for this issue would be zero codegen diff on x86/x64.

We need to keep delegate alive only until we make the call. The delegate does not need to be alive after call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my knowledge, will all the various stubs that we may go through on the way to the final target automatically keep the right loader allocator without the JIT at minimum putting the delegate instance in a callee-saved register?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. There are really only 3:

  • Shuffle thunks: hand-emitted assembly so the GC cannot kick in while they are executing
  • Multicast stubs: they keep the delegate thisptr alive. (Adding comment that this is important may be nice.)
  • Marshalled function pointers: The delegate points to itself, so calling the method on the delegate will keep it alive.

It is possible that I may be missing a corner case somewhere, but I do not expect it to be hard to fix.

{
GenTree* baseClone = comp->gtCloneExpr(base);
GenTree* keepBaseAlive = comp->gtNewKeepAliveNode(baseClone);
BlockRange().InsertBefore(call, baseClone, keepBaseAlive);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This InsertBefore only works due to implementation details of the backend (lazy liveness updates for GC information about registers), which is quite subtle. But of course the fully ideal solution requires adding a new operand to GenTreeCall and special casing it throughout the backend as an invisible use, which is probably a bit overkill.

I wonder if we in the uncontained case wouldn't be better off just by inserting GT_START_NONGC and creating a nogc region on the single call instruction. Given that we already have a call and hence a GC safe point this does not seem like it would be very harmful; in a less conservative JIT this basic block might have simply always been partially interruptible for that reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm.. interesting idea! applied

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new diff for the snippet in the issue: https://www.diffchecker.com/CiuxVSBL/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will also need to add a STOP_NOGC to turn it off again. Looks like we don't have that today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will also need to add a STOP_NOGC to turn it off again. Looks like we don't have that today.

I guess it's not from a correctness stand point, just to reduce the nongc area if we have a lot more statements in the current block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this one is tricky - we might end whatever was expected to be nongc even before we emitted the start_nogc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's counted, so it doesn't turn off nogc until the count gets to zero. But regardless I wouldn't expect there to ever be another active nogc region in this case

Copy link
Member

@jakobbotsch jakobbotsch Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not from a correctness stand point, just to reduce the nongc area if we have a lot more statements in the current block?

I'm not really sure how the VM behaves if the return address is in a nogc region – but in any case it's unnecessary. I think we only need it to be active from the load of the target address (which is the last use) to the call – should be just one instruction in the end

Comment on lines 5564 to 5577
#if !defined(TARGET_XARCH)
if (comp->GetInterruptible())
{
// If the target's backend doesn't support indirect calls with immediate operands (contained)
// and the method is marked as interruptible, we need to insert a GT_START_NONGC before the call.
// to keep the delegate object alive while we're obtaining the function pointer.
GenTree* startNonGCNode = new (comp, GT_START_NONGC) GenTree(GT_START_NONGC, TYP_VOID);
BlockRange().InsertAfter(thisArgNode, startNonGCNode);

// We should try to end the non-GC region just before the actual call.
*shouldEndNogcBeforeCall = true;
}
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cleaner to insert both START and STOP at the end of LowerCall instead of doing half in here and half in LowerCall.

Also, is inserting the stop node before the call node really correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is inserting the stop node before the call node really correct?

Why not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GC is illegal when we are on the blr instruction -- it is not illegal when we are on the ldr instruction. https://www.diffchecker.com/NwHIxdIA/ looks like it makes the GC illegal on the ldr instruction but not on the blr instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can GC happen on call instructions before we entered the call?
If I need to extend the nogc past the call then it's easier for my PR, I just though that I should not add a call to nogc block (because GC may happens inside the call)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can GC happen on call instructions before we entered the call?

Yes, I think that's the problem here. If GC happens on that call instruction nothing indicates the delegate is live and thus the target we are going to jump to may be collected.

I just though that I should not add a call to nogc block (because GC may happens inside the call)

Hmm, I'm not totally sure what happens, but the safepoint should logically be on the return address, not on the call, so I think we should be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just wondering if we have any debug mechanism that checks that all calls inside nogc blocks can't trigger GC even internally, but I presume we don't otherwise we'd catch the recent bug with BULKBARRIER earlier

Comment on lines 5554 to 5565
#if !defined(TARGET_XARCH)
if (comp->GetInterruptible())
{
// If the target's backend doesn't support indirect calls with immediate operands (contained)
// and the method is marked as interruptible, we need to insert a GT_START_NONGC before the call.
// to keep the delegate object alive while we're obtaining the function pointer.
GenTree* startNonGCNode = new (comp, GT_START_NONGC) GenTree(GT_START_NONGC, TYP_VOID);
GenTree* stopNonGCNode = new (comp, GT_STOP_NONGC) GenTree(GT_STOP_NONGC, TYP_VOID);
BlockRange().InsertAfter(thisArgNode, startNonGCNode);
BlockRange().InsertAfter(call, stopNonGCNode);
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still do this inside LowerCall since it's LowerCall responsibility to insert the control expression returned by this function. This is adding an assumption about where LowerCall will insert that node.

Copy link
Member Author

@EgorBo EgorBo Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still do this inside LowerCall since it's LowerCall responsibility to insert the control expression returned by this function. This is adding an assumption about where LowerCall will insert that node.

but that's the point, no? LowerDelegateInvoke already inserts some trees and at the end of those we conservatively emit a START_NOGC so other functions like LowerCall itself, LowerCFGCall or LowerTailCallViaJitHelper can insert whatever they want after our NOGC and before the call. Because, presumably, they can extend the dangerous area where GC can kick in and collect our delegate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah, that's a good point too. What I dislike here is just that it still relies on LowerCall to insert the control expression after the START_NONGC. To be conservatively correct we have to insert the START_NOGC before the last use of the delegate, since emitDisableGC() means something like "the next instruction I emit is going to make GC information wrong, so disable GC after that instruction". Hence inserting START_NONGC after thisArgNode is only correct because there ends up being another use inserted after that point by LowerCall. The main idea behind moving the logic into LowerCall is that it has the full overview of what nodes it is going to insert and where, so it can make the decision with more information. So for instance it could insert it before the gtControlExpr in the delegate case, knowing that gtControlExpr has the last use of the delegate instance in the common case.

Copy link
Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Oct 2, 2024
@EgorBo EgorBo reopened this Oct 10, 2024
@EgorBo
Copy link
Member Author

EgorBo commented Oct 10, 2024

/azp list

@EgorBo
Copy link
Member Author

EgorBo commented Oct 10, 2024

/azp run runtime-coreclr outerloop, runtime-coreclr jitstress

@EgorBo
Copy link
Member Author

EgorBo commented Oct 10, 2024

/azp run runtime-coreclr jitstress

@EgorBo
Copy link
Member Author

EgorBo commented Oct 10, 2024

/azp run runtime-coreclr outerloop

@EgorBo
Copy link
Member Author

EgorBo commented Nov 4, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@EgorBo
Copy link
Member Author

EgorBo commented Nov 19, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants