Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: fix issue with out of order importer spilling around some calls #95539

Merged
merged 1 commit into from
Dec 4, 2023

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Dec 2, 2023

If we have a call that returns a struct that is immediately assigned to some local, the call is a GDV inline candidate, and the call is invoked with a copy of the same local on the evaluation stack, the spill of that local into a temp will appear in the IR stream between the call and the ret expr, instead of before the call.

As part of our IR resolution the store to the local gets "sunk" into the call as the hidden return buffer, so the importer ordering is manifestly incorrect:

call &retbuf, ...
tmp = retbuf
...ret-expr
...tmp

For normal inline candidates this mis-ordering gets fixed up either by swapping the call back into the ret expr's position, or for successful inlines by swapping the return value store into the ret expr's position. The JIT has behaved this way for a very long time, and the transient mis-ordering has not lead to any noticble problem.

For GDV calls the return value stores are done earlier, just after the call, and so the spill picks up the wrong value. GDV calls normally only happen with PGO data. This persistent mis-ordering has been the behavior since at least 6.0, but now that PGO is enabled by default a much wider set of programs are at risk of running into it.

The fix here is to reorder the IR in the importer at the point the store to the local is appended to the IR stream, as we are processing spills for the local. If the source of the store is a GT_RET_EXPR we keep track of these spills, find the associated GT_CALL statement, and move the spills before the call.

There was a similar fix made for boxes in #60355, where once again the splitting of the inline candidate call and the subsequent modification of the call to write directly to the return buffer local led to similar problems with GDV calls.

Fixes #95394.

If we have a call that returns a struct that is immediately assigned
to some local, the call is a GDV inline candidate, and the call is
invoked with a copy of the same local on the evaluation stack, the
spill of that local into a temp will appear in the IR stream between
the call and the ret expr, instead of before the call.

As part of our IR resolution the store to the local gets "sunk" into
the call as the hidden return buffer, so the importer ordering is
manifestly incorrect:
```
call &retbuf, ...
tmp = retbuf
...ret-expr
...tmp
```

For normal inline candidates this mis-ordering gets fixed up either
by swapping the call back into the ret expr's position, or for successful
inlines by swapping the return value store into the ret expr's position.
The JIT has behaved this way for a very long time, and the transient
mis-ordering has not lead to any noticble problem.

For GDV calls the return value stores are done earlier, just after
the call, and so the spill picks up the wrong value. GDV calls normally
only happen with PGO data. This persistent mis-ordering has been
the behavior since at least 6.0, but now that PGO is enabled by default
a much wider set of programs are at risk of running into it.

The fix here is to reorder the IR in the importer at the point the
store to the local is appended to the IR stream, as we are processing
spills for the local. If the source of the store is a GT_RET_EXPR we
keep track of these spills, find the associated GT_CALL statement, and
move the spills before the call.

There was a similar fix made for boxes in dotnet#60335, where once again the
splitting of the inline candidate call and the subsequent modification of
the call to write directly to the return buffer local led to similar
problems with GDV calls.

Fixes dotnet#95394.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 2, 2023
@ghost ghost assigned AndyAyersMS Dec 2, 2023
@ghost
Copy link

ghost commented Dec 2, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

If we have a call that returns a struct that is immediately assigned to some local, the call is a GDV inline candidate, and the call is invoked with a copy of the same local on the evaluation stack, the spill of that local into a temp will appear in the IR stream between the call and the ret expr, instead of before the call.

As part of our IR resolution the store to the local gets "sunk" into the call as the hidden return buffer, so the importer ordering is manifestly incorrect:

call &retbuf, ...
tmp = retbuf
...ret-expr
...tmp

For normal inline candidates this mis-ordering gets fixed up either by swapping the call back into the ret expr's position, or for successful inlines by swapping the return value store into the ret expr's position. The JIT has behaved this way for a very long time, and the transient mis-ordering has not lead to any noticble problem.

For GDV calls the return value stores are done earlier, just after the call, and so the spill picks up the wrong value. GDV calls normally only happen with PGO data. This persistent mis-ordering has been the behavior since at least 6.0, but now that PGO is enabled by default a much wider set of programs are at risk of running into it.

The fix here is to reorder the IR in the importer at the point the store to the local is appended to the IR stream, as we are processing spills for the local. If the source of the store is a GT_RET_EXPR we keep track of these spills, find the associated GT_CALL statement, and move the spills before the call.

There was a similar fix made for boxes in #60335, where once again the splitting of the inline candidate call and the subsequent modification of the call to write directly to the return buffer local led to similar problems with GDV calls.

Fixes #95394.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

No diffs locally, verified the repro fails w/o the fix and works with, likewise with Egor's repro and the original one from the bug.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress, Fuzzlyn, runtime-coreclr outerloop

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

for (int i = 0; i < 100; i++)
{
_ = Problem(p, n, (-1, -1));
Thread.Sleep(30);
Copy link
Member

@EgorBo EgorBo Dec 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repro needs Tiered Compilation, right? Then I presume you need to run some pipeline where it's not disabled (like pgo ones) (or add TC=1 in csproj)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs to have a GDV expansion to expose the bug.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr pgo, runtime-coreclr pgostress

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@AndyAyersMS AndyAyersMS merged commit dcf2241 into dotnet:main Dec 4, 2023
298 checks passed
@AndyAyersMS
Copy link
Member Author

/backport to release/8.0-staging

Copy link
Contributor

github-actions bot commented Dec 4, 2023

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/7088765976

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Method changes behaviour after optimisation
2 participants