-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/6.0] Correctly handle HFA structs when crossgen'd on Arm64 #80872
Conversation
Description: This change puts HFA calculation in Crossgen2 in sync with native CoreCLR runtime for value types with explicit layout. Previously Crossgen2 had a shortcut in the routine deciding that structs with explicit layouts are never marked as HFA that disagreed with the CoreCLR runtime; consequently, on arm64, Crossgen2 disagreed with the runtime on whether or not a function returning such a type should allocate the stack slot for return value, basically messing up the calling convention and GC refmap, resulting in various random AVs and corruptions. This was first observed by an internal customer in WPF apps where MilRectD is the type in question, later JanK filed the issue 79327 for the same problem. Customer impact: Random runtime crashes on arm64. Regression: Nope, I believe the incomplete implementation was the original one, this change just "improves it" by putting it in better sync with the native runtime. I have also added a code comment mentioning that these two need to be kept in sync. Risk: Low - the error in the previous implementation is obvious, R2RDump and my new runtime diagnostics clearly show the GC refmap mismatch caused by this problem and its fixing after applying the Crossgen2 fix. Link to issue: dotnet#79327 Link to PR against main: dotnet#80218 Publishing impact: In the particular case of the WPF app the problem was in the PresentationCore.dll assembly. The assembly (or rather the entire WPF) need to be recompiled with Crossgen2 with the fix applied for this to take effect. For now I assume that is an automated part of the servicing process. Thanks Tomas
I have worked offline with the internal customer, I provided him with a private build of PresentationCore.dll using Crossgen2 compiled off this branch and he replied after a couple of days that the previously observed crashes seem to have been fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved. we will take for consideration in 6.0.x
Closing and reopening to rebase the PR and re-run the CI. There were too many failures and unfortunately the logs are already deleted. Edit: Need to close and reopen a second time because I found some more ARM64 helix queues not found. I merged the fixes. |
Approved by Tactics for 6.0.15. |
Description:
This change puts HFA calculation in Crossgen2 in sync with native CoreCLR runtime for value types with explicit layout. Previously Crossgen2 had a shortcut in the routine deciding that structs with explicit layouts are never marked as HFA that disagreed with the CoreCLR runtime; consequently, on arm64, Crossgen2 disagreed with the runtime on whether or not a function returning such a type should allocate the stack slot for return value, basically messing up the calling convention and GC refmap, resulting in various random AVs and corruptions. This was first observed by an internal customer in WPF apps where MilRectD is the type in question, later JanK filed the issue 79327 for the same problem.
Customer impact:
Random runtime crashes on arm64.
Regression:
Nope, I believe the incomplete implementation was the original one, this change just "improves it" by putting it in better sync with the native runtime. I have also added a code comment mentioning that these two need to be kept in sync.
Risk:
Low - the error in the previous implementation is obvious, R2RDump and my new runtime diagnostics clearly show the GC refmap mismatch caused by this problem and its fixing after applying the Crossgen2 fix.
Link to issue:
#79327
Link to PR against main:
#80218
Publishing impact:
In the particular case of the WPF app the problem was in the PresentationCore.dll assembly. The assembly (or rather the entire WPF) need to be recompiled with Crossgen2 with the fix applied for this to take effect. For now I assume that is an automated part of the servicing process.
Thanks
Tomas