Don't block worker while evaluating a policy #354
Merged
The Nomad Autoscaler v0.0.x had a single goroutine evaluating policies. When we introduced multiple `check`s in the v0.1.0 release, we tried to parallelize as much work as possible to keep the goroutine as fast as possible.

This created a race condition where the worker and the checks could be stuck waiting on each other.
In the v0.2.0 release we introduced the EvalBroker and Workers, so parallelization can now be provided at a higher level (per policy).
This PR turns the policy evaluation process into a linear execution to prevent the Worker from getting stuck. It also introduces checks for `ctx.Done()` before potentially long-running operations happen (namely, around plugin calls).

Future work will include a heartbeat mechanism to detect and recreate workers that get stuck. Once that is in place, we can start parallelizing the `check` executions again.

Closes #218 #303 #343