-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Extend escape analysis to account for arrays with non-gcref elements #104906
base: main
Are you sure you want to change the base?
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For arrays (and also perhaps boxes and ref classes) we ought to have some kind of size limit... possibly similar to the one we use for stackallocs.
We need to be careful we don't allocate a lot of stack for an object that might not be heavily used, as we'll pay per-call prolog zeroing costs.
It's on my branch. Will open a draft PR soon to validate CI and see how much benefits we can gain from the field-wise analysis. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Tests (modulus unrelated) are passing.
with field-wise analysis we are missing context for more methods, so it's possible that spmi reports worse stats. |
I collected the method context against BCL by myself to prevent missing method context.
|
It's actually a small change as it relies on the existing analysis and added the treatment to arrays as well, which is a "free" improvement. I personally would like to see it goes to .NET 9. |
This is optimization, not correctness issue. We don't have time to work on this for .NET 9, so we will review it in .NET 10. |
@AndyAyersMS please review this community PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions along with the particular notes:
- do you see diffs with these changes when
DOTNET_JitObjectStackAllocationArray=0
? - are stack allocated arrays amenable to physical promotion? If not, why not?
- what does codegen look like for a locally allocated array consumed by a loop? Is it at least as good as the heap case?
- we need a tighter size limit, 8192 is too high. We likely need one both for individual allocations (especially if conditional) and for the overall amount of stack size increase. For individual allocations we should either match the stackalloc -> local size limit or gather data suggesting why a different choice is the right one. For the overall limit I'm less sure, but something around 8192 for all stack allocated locals seems about right.
- given that, we might need a prioritization scheme, if we have so many allocation opportunities that some get left out. But I expect this will be rare.
} | ||
else | ||
{ | ||
*use = m_compiler->gtNewOperNode(GT_COMMA, node->TypeGet(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to handle the out of bounds cases here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC not handing it will make the negative case address exposed, while inserting a BOUND_CHECK node will lead to an assert in later phase.
src/coreclr/jit/morph.cpp
Outdated
@@ -8358,10 +8358,20 @@ GenTree* Compiler::fgMorphSmpOp(GenTree* tree, MorphAddrContext* mac, bool* optA | |||
return tree; | |||
} | |||
|
|||
#ifdef DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I noted over in #107542 we don't want to have control flow logic conditional on DEBUG
like this... we need to find a better fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reverting changes to this and instead wait for #107542 to be fixed first.
I will test it and post the result later.
I think yes? int[] a = new[] { 1,2,3,4 };
return a[2]; will produce the following codegen: mov rax, 3
ret
We need #102965 to get better codegen for this case. Otherwise it gets address exposed.
Agree. I think we can do this in a following PR? |
Just checked superpmi and there's no stack allocated array when |
Test failures are #107542. I'm going to leave it as is. |
Positive case:
Codegen:
Negative case:
Codegen:
Benchmark on Mandelbrot:
Diff: https://www.diffchecker.com/bNP4qHdF/