Pause and unpause the scheduler from the viz #2145
Merged
Description
Add a pause button to the scheduler that stops get_work from returning jobs. The button sits on the right-hand side of the header and changes appearance while the scheduler is paused.
Note that workers without keep-alive set will eventually exit while the scheduler is paused, because they stop receiving work.
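For readers unfamiliar with the idea, here is a minimal sketch of the pause behaviour described above. It is a toy model, not the code in this PR: ToyScheduler, add_task, pause, unpause and is_paused are hypothetical names chosen for illustration, and only get_work is a name taken from the description.

```python
from collections import deque


class ToyScheduler:
    """Hypothetical stand-in for a scheduler; not the actual implementation."""

    def __init__(self):
        self._pending = deque()  # task ids waiting to be handed out
        self._paused = False

    def add_task(self, task_id):
        self._pending.append(task_id)

    def pause(self):
        # Stop handing out new work; tasks already running are untouched.
        self._paused = True

    def unpause(self):
        self._paused = False

    def is_paused(self):
        return self._paused

    def get_work(self, worker_id):
        # While paused, every request comes back empty, even if tasks are pending.
        if self._paused or not self._pending:
            return None
        return self._pending.popleft()


if __name__ == "__main__":
    scheduler = ToyScheduler()
    scheduler.add_task("task-a")

    scheduler.pause()
    print(scheduler.get_work("worker-1"))  # nothing is handed out while paused

    scheduler.unpause()
    print(scheduler.get_work("worker-1"))  # the pending task is handed out again
```

In this sketch a paused scheduler is indistinguishable, from the worker's side, from a scheduler with no pending work, which is why a worker that exits when idle would also exit during a pause.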
Motivation and Context
Sometimes it is necessary to stop all jobs, usually because of a pipeline issue or because you want to deploy a new scheduler. Until now there hasn't been a nice way to do this. This commit adds a pause toggle to the visualiser header bar. With one click, the scheduler stops giving out new tasks and allows currently running tasks to finish normally. With another click, the scheduler starts giving out jobs again.
Have you tested this? If so, how?
I've been using this in production for about half a year now. I also tested the visualiser from this PR locally and there are unit tests for the scheduler side.