Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Release/8.0] Remove preventing EH at shutdown #100836

Conversation

janvorli
Copy link
Member

@janvorli janvorli commented Apr 9, 2024

Backport of #100293 to release/8.0

Customer Impact

During the runtime shutdown, we stop handling exceptions at all and we also prevent entering EE at few places. A customer has hit an issue with Watson dump being generated from a .NET app on every machine reboot due to that. That app was a networking app that was listening on a socket. Shutdown of the app resulted in a SocketException being thrown, but the exception handling was prevented and so the exception was reported as an unhandled exception which caused the dump to be taken.

Testing

CI testing, local testing of a case equivalent to what the customer had - a grpc running as a service. Without the fix the issue reported by the customer can be reproduced, with the fix it is gone.

Risk

Low - the change just enables exceptions handling during shutdown. So where previously the app would crash with an unhandled exception, it will now get proper handling of that exception upto the point the process is torn down by the OS.

@janvorli janvorli added Servicing-consider Issue for next servicing release review area-ExceptionHandling-coreclr labels Apr 9, 2024
@janvorli janvorli added this to the 8.0.x milestone Apr 9, 2024
@janvorli janvorli requested a review from jkotas April 9, 2024 19:19
@janvorli janvorli self-assigned this Apr 9, 2024
@jeffschwMSFT jeffschwMSFT removed the Servicing-consider Issue for next servicing release review label Apr 9, 2024
@jeffschwMSFT
Copy link
Member

removing the servicing-consider for now. given the low/medium risk, let's ship a preview of 9 and check for feedback and then consider a backport

@jkotas
Copy link
Member

jkotas commented Apr 10, 2024

  • This issue existed since .NET Core 1.0. Why did we not hear about it earlier? Did some other change make it much more likely to be hit?
  • Is the full backport of the original PR required to address the customer issue? Should we do a smaller more targeted backport for servicing to reduce the risk?

@janvorli
Copy link
Member Author

This issue existed since .NET Core 1.0. Why did we not hear about it earlier? Did some other change make it much more likely to be hit?

I think the issue wouldn't reproduce in the customer's scenario if the issue #83093 wasn't there. That issue was introduced by changes in the CNTRL_SHUTDOWN_EVENT handling. Also, we disable software exceptions only on x86, so the app has to be a x86 windows service that is running during shutdown. Moreover, the issue is not observable if watson dumps are not enabled on the machine. So, I guess that is what's making the issue occurrence and observability low.

Is the full backport of the original PR required to address the customer issue? Should we do a smaller more targeted backport for servicing to reduce the risk?

That is a good point. I can scale it down to just remove the g_fNoExceptions, which is sufficient to fix the issue for the customer. The change would then be much smaller and I would then reclassify the risk as being low. @jeffschwMSFT does that make sense to you?

@jeffschwMSFT
Copy link
Member

That is a good point. I can scale it down to just remove the g_fNoExceptions, which is sufficient to fix the issue for the customer. The change would then be much smaller and I would then reclassify the risk as being low. @jeffschwMSFT does that make sense to you?

My preference is a lower risk fix which addresses the customer issue (eg. they validated that their scenario is now working as expected).

@janvorli janvorli force-pushed the port-remove-eh-and-ee-entering-prohibition branch from 6a19622 to afaa2a2 Compare April 10, 2024 16:03
@janvorli janvorli changed the title [Release/8.0] Remove preventing EH and entering EE at shutdown [Release/8.0] Remove preventing EH at shutdown Apr 10, 2024
@janvorli janvorli added the Servicing-consider Issue for next servicing release review label Apr 10, 2024
@janvorli
Copy link
Member Author

@jeffschwMSFT I have scaled down the change considerably to address just the issue the customer has reported. I believe the risk is low now, so I have re-added the servicing-consider tag.

Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. I will take for consideration for 8.0.x.

please get a code review and ensure the customer has signed off that this addresses their issue.

@leecow leecow added the Servicing-approved Approved for servicing release label Apr 11, 2024
@leecow leecow modified the milestones: 8.0.x, 8.0.5 Apr 11, 2024
@ericstj ericstj merged commit a494d22 into dotnet:release/8.0-staging Apr 15, 2024
110 of 114 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-ExceptionHandling-coreclr Servicing-approved Approved for servicing release Servicing-consider Issue for next servicing release review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants