Fix JobDispatcher crash during force cancellation. #3118

Merged 1 commit on Jan 30, 2024

TingluoHuang
Member

The JobDispatcher will crash in this scenario:

  • Job A lands on the runner.
  • The runner starts a worker process.
  • Job A finishes in the worker process, but for some reason the worker process does not exit quickly enough.
  • Job B lands on the runner.
  • The runner notices that the worker process for Job A is still running, so it tries to cancel it and shut it down gracefully by sending a cancellation message via AnonymousPipe.
  • The worker process exits right before the runner tries to cancel it.
  • Sending the message over the AnonymousPipe fails because the worker has already exited; the exception is thrown and crashes the JobDispatcher.

The fix: add another catch(Exception) so that a failure to send the cancellation message doesn't crash the JobDispatcher.
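For readers who want to see the race in isolation, here is a minimal Python sketch of the same failure mode and the same kind of guard. It is an illustration only, not the runner's code from this PR: it uses a plain OS pipe instead of AnonymousPipe, and the names `send_cancellation`, `simulate_worker_already_exited`, and `simulate_worker_still_running` are hypothetical. Closing the read end stands in for the worker exiting just before the cancellation is sent.

```python
import os


def send_cancellation(write_fd: int, message: bytes = b"cancel") -> bool:
    """Try to send a cancellation message; return True if the write succeeded.

    Hypothetical helper: a failed send is reported to the caller instead of
    being raised, so the dispatcher loop can carry on with the next job.
    """
    try:
        os.write(write_fd, message)
        return True
    except Exception as exc:  # broad on purpose, mirroring catch(Exception)
        print(f"cancellation not delivered, worker likely exited: {exc!r}")
        return False


def simulate_worker_already_exited() -> bool:
    """The reader side is gone before the cancellation is written."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stands in for the worker process having exited
    try:
        return send_cancellation(write_fd)
    finally:
        os.close(write_fd)


def simulate_worker_still_running() -> bool:
    """The reader side is alive and receives the cancellation."""
    read_fd, write_fd = os.pipe()
    try:
        delivered = send_cancellation(write_fd)
        received = os.read(read_fd, 64)
        return delivered and received == b"cancel"
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Without the `try`/`except`, the write in the first scenario would raise and propagate to the caller, which is the analogue of the crash described above. Catching broadly is a deliberate trade-off here: the worker is already gone, so a failed cancellation is safe to log and ignore.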

https://github.com/github/c2c-actions-support/issues/3423

@TingluoHuang TingluoHuang requested a review from a team as a code owner January 30, 2024 18:51
@TingluoHuang TingluoHuang force-pushed the users/tihuang/fixdispatcher branch from 91ee4ed to 10788df Compare January 30, 2024 19:23
@TingluoHuang TingluoHuang merged commit 5268d74 into main Jan 30, 2024
10 checks passed
@TingluoHuang TingluoHuang deleted the users/tihuang/fixdispatcher branch January 30, 2024 20:24
2 participants