Autoscaling of the high availability service #2639
Goals
From the technical point of view, we would like to achieve the following goals:
Implementation
I suggest using an additional autoscaling service that can horizontally autoscale both Kubernetes deployments and Kubernetes nodes in order to reach some predefined target utilization. The following key points give a more in-depth understanding of the approach.
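As a minimal sketch of the core computation such a service could perform, the snippet below uses the classic target-tracking formula (the same shape as the one used by the Kubernetes Horizontal Pod Autoscaler). This is an illustrative assumption, not the issue's final design; the function name and the min/max bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Target-tracking scaling: grow or shrink the replica count
    proportionally to how far current utilization is from the target,
    then clamp the result to the configured [min, max] bounds."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))
```

For example, two replicas at 90% utilization against a 60% target would be scaled to three replicas, while four replicas at 15% utilization would be scaled down, but never below the configured minimum.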
Algorithm
The following autoscaling algorithm can be used by the autoscaling service.
Configuration
The following settings shall be configured for the autoscaling service to work:
Questions
The following autoscaler parameters were checked:
Background
As separate Cloud Pipeline deployments may periodically face large workload peaks, it would be useful to implement autoscaling of the system nodes (HA service), allowing the service to scale up and down according to the actual workload.
Approach
We shall monitor the state of the system API instances (at least their RAM and/or CPU consumption).
The HA service shall have a minimum number of instances required to run.
If the consumption exceeds some predefined threshold for some period of time, new instances shall be launched to cover the system's needs (i.e. the HA service shall be scaled up).
If the workload subsides, the additional instances shall be stopped (i.e. the HA service shall be scaled down, but to no fewer than the predefined minimum number of instances).
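The threshold-with-duration behavior described above can be sketched as follows. This is a simplified illustration under assumed defaults; the class name, thresholds, and window size are hypothetical, and a real implementation would read them from system preferences.

```python
from collections import deque


class ScalingDecider:
    """Sketch of the approach above: scale up only if utilization stays
    above the upper threshold for a sustained window of samples; scale
    down when it stays below the lower threshold, but never go below the
    configured minimum number of instances."""

    def __init__(self, up_threshold: float = 0.8, down_threshold: float = 0.3,
                 window: int = 5, min_instances: int = 2):
        self.up_threshold = up_threshold
        self.down_threshold = down_threshold
        self.window = window
        self.min_instances = min_instances
        # Keep only the most recent `window` utilization samples.
        self.samples = deque(maxlen=window)

    def decide(self, current_instances: int, utilization: float) -> int:
        """Record one utilization sample and return the new instance count."""
        self.samples.append(utilization)
        if len(self.samples) < self.window:
            return current_instances  # not enough history yet
        if all(u > self.up_threshold for u in self.samples):
            return current_instances + 1  # sustained overload: scale up
        if all(u < self.down_threshold for u in self.samples):
            # workload subsided: scale down, respecting the minimum
            return max(self.min_instances, current_instances - 1)
        return current_instances
```

Requiring the whole window to agree before acting is one simple way to avoid flapping on short spikes; a real service might instead use an averaged metric or a cooldown period.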
I suppose that the described behavior shall be managed by some new system preferences, e.g.:
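For illustration, such preferences could look like the following. All key names and default values here are hypothetical and are not taken from the actual Cloud Pipeline preference catalog.

```python
# Hypothetical preference keys and defaults for the HA autoscaling behavior;
# the real names would be defined when the feature is implemented.
HA_AUTOSCALING_PREFERENCES = {
    "system.ha.autoscaling.enabled": True,              # turn the feature on/off
    "system.ha.autoscaling.min.instances": 2,           # never scale below this
    "system.ha.autoscaling.scale.up.threshold": 0.8,    # utilization ratio to scale up
    "system.ha.autoscaling.scale.down.threshold": 0.3,  # utilization ratio to scale down
    "system.ha.autoscaling.monitoring.period.sec": 60,  # how long a threshold must hold
}
```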
Additionally