Methods with both stackalloc and loops bypass tiering #85548
Comments
@AndyAyersMS currently we sort of promote stackalloc to locals for size <= 32 bytes - do we still bypass tier0 for small stackallocs? And does it make sense to extend that limit for OSR?
We decide we need to switch to full opt before importing, so before we know for sure how big the stackalloc(s) might be. It looks like we try to do this conversion in tier0 too (in fact possibly even in minops, though both may end up getting blocked in practice as I don't know if upping the limit makes sense or not, but if we do we should do it across the board and not just for these cases.
Recall OSR is an insurance policy to make sure we can escape from Tier0 code no matter what the method does; in particular, it handles methods that have very long-running loops.
The linked test ( A different screen is to find out how many methods have this combination and perhaps from there find different impacted tests; let me try that.
For example:
runtime/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.TryGetProperty.cs
Lines 135 to 149 in 231f85f
Our current method of tiering for methods with loops relies on OSR, but OSR currently cannot handle stackalloc. Instead, such methods are initially jitted with full optimization. Thus they do not benefit from tiering or Dynamic PGO.
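To make the pattern concrete, here is a minimal sketch of a method that combines the two features. The class and method names are hypothetical and are not taken from the linked JsonDocument code; the only point is that a `stackalloc` and a loop appear in the same method body, which, per the description above, is the shape that gets jitted with full optimization up front.

```csharp
using System;

static class StackallocLoopExample
{
    // Hypothetical method: copies up to 256 bytes into a stack buffer
    // and sums them. Illustrative only.
    public static int SumViaStackBuffer(ReadOnlySpan<byte> source)
    {
        // A stackalloc in the method body ...
        Span<byte> buffer = stackalloc byte[256];
        int count = Math.Min(source.Length, buffer.Length);
        source.Slice(0, count).CopyTo(buffer);

        int sum = 0;
        // ... together with a loop in the same method body.
        for (int i = 0; i < count; i++)
        {
            sum += buffer[i];
        }
        return sum;
    }
}
```

Calling `StackallocLoopExample.SumViaStackBuffer(new byte[] { 1, 2, 3 })` sums the three bytes; the result is the same whichever tier the method is compiled at, so the effect discussed here is on code quality, not on behavior.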
Note this is not a new issue; things have been this way since .NET 7, where we enabled OSR, but it may become more pressing now that we are considering enabling PGO by default.
I don't know how many cases of this there are yet, or what performance, if any, we might gain by trying to address this. But I ran across one example here: #84264 (comment).
Adding stackalloc support to OSR is currently considered to be difficult. A method with stackalloc has two independently addressable areas of its stack frame (commonly handled by using both stack pointer and frame pointer to address locals). But an OSR method must potentially support three areas, and we currently do not have support in the jit or the diagnostics stack for a third stack frame base register.
If this combination turns out to be relatively uncommon, we might consider refactoring the code to avoid this pattern.
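One possible shape for such a refactoring, as a rough sketch: keep the `stackalloc` in an outer method with no loop, and move the loop into a helper that takes a span. All names below are hypothetical. The assumption, based only on the behavior described in this issue, is that neither method then has both features in its own body, so both should be eligible for normal tiering; this has not been measured here.

```csharp
using System;

static class SplitStackallocExample
{
    // Outer method: contains the stackalloc but no loop of its own.
    public static int SumViaStackBuffer(ReadOnlySpan<byte> source)
    {
        Span<byte> buffer = stackalloc byte[256];
        int count = Math.Min(source.Length, buffer.Length);
        source.Slice(0, count).CopyTo(buffer);
        return SumCore(buffer.Slice(0, count));
    }

    // Helper: contains the loop but no stackalloc.
    private static int SumCore(ReadOnlySpan<byte> data)
    {
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
        {
            sum += data[i];
        }
        return sum;
    }
}
```

The trade-off is an extra call on the path, plus a dependence on how the JIT treats the helper once the outer method is eventually optimized, so a change like this would need to be checked case by case.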
cc @dotnet/jit-contrib @stephentoub