- A Kubernetes controller that repairs `NotReady` nodes by replacing them with fresh nodes. It does this by manipulating the cluster's `AutoScalingGroups`.
- Currently supports only the AWS cloud provider.
- This component is no longer used by Gardener and is no longer maintained. It has been archived in the gardener-attic.
- Control loop for each Auto Scaling Group configured for a shoot cluster:
    - Identify `Nodes` which have been `NotReady` for a configurable amount of time (~10 minutes).
    - Create new nodes and wait until they are `Ready`.
    - Cordon and drain all `NotReady` nodes.
    - Delete the `NotReady` nodes.
- Apply this approach for each ASG in a shoot cluster one by one.
- For a given ASG, create excess `Nodes` in parallel but cordon, drain and delete `Nodes` one by one.
- If the ASG does not have sufficient capacity for the excess `Nodes`, first delete the `NotReady` nodes and then create new ones.
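
The ordering described above, including the fallback for an ASG without spare capacity, might be planned along these lines. This is a minimal sketch under stated assumptions: `planRepair` and its `spare` parameter (how many extra nodes the ASG can still hold) are hypothetical, the plan is just a list of human-readable steps, and the all-or-nothing capacity check is a simplification that the original design does not spell out.

```go
package main

import "fmt"

// planRepair returns the ordered actions for one ASG.
// notReady holds the names of the NotReady nodes; spare is the number of
// additional nodes the ASG can still hold (an assumption of this sketch).
func planRepair(notReady []string, spare int) []string {
	var plan []string
	if len(notReady) == 0 {
		return plan
	}
	if spare >= len(notReady) {
		// Enough capacity: create replacements in parallel first,
		// then cordon, drain and delete the old nodes one by one.
		plan = append(plan, fmt.Sprintf("create %d nodes in parallel and wait until Ready", len(notReady)))
		for _, n := range notReady {
			plan = append(plan, "cordon "+n, "drain "+n, "delete "+n)
		}
		return plan
	}
	// Not enough capacity: delete the NotReady nodes first, then create new ones.
	for _, n := range notReady {
		plan = append(plan, "delete "+n)
	}
	plan = append(plan, fmt.Sprintf("create %d nodes", len(notReady)))
	return plan
}

func main() {
	fmt.Println("with spare capacity:")
	for _, step := range planRepair([]string{"node-b", "node-d"}, 2) {
		fmt.Println("  " + step)
	}
	fmt.Println("without spare capacity:")
	for _, step := range planRepair([]string{"node-b", "node-d"}, 0) {
		fmt.Println("  " + step)
	}
}
```

Creating replacements before draining keeps workloads schedulable while the old nodes are removed; the delete-first path gives up that guarantee only when the ASG cannot grow.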
| Command | Implication |
|---|---|
| `make compile` | Build the Go code locally |
| `make release` | Deploy image into Gcloud |
Use `deploy/kubernetes/deployment.yaml` to deploy auto-node-repair into the cluster. Refer to this file for more details.