
EKS Auto Drain

Gracefully drain EKS Worker Nodes whenever a node is terminated by an Auto Scaling Group or a Spot termination.

Deployable Lambda function with CloudWatch Event Rules and an IAM Role, enabled by adding a Lifecycle hook to any Auto Scaling Group in the same AWS Region.

Overview
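When a Node's instance is marked for termination, a CloudWatch Event invokes the Lambda function, which cordons and drains the Node before the Lifecycle hook allows termination to proceed. A sketch of the event patterns the deployed Event Rules are assumed to match (the exact definitions live in the SAM template):

ASG termination via the Lifecycle hook:

  { "source": ["aws.autoscaling"], "detail-type": ["EC2 Instance-terminate Lifecycle Action"] }

Spot interruption notice:

  { "source": ["aws.ec2"], "detail-type": ["EC2 Spot Instance Interruption Warning"] }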

Deploy

Deployment of the Lambda, IAM Role and CloudWatch Event Rules can be simplified with the SAM CLI and Docker.

SAM CLI installation varies by OS; on Linux and macOS it can be installed with Homebrew:

brew update
brew upgrade
brew tap aws/tap
brew install aws-sam-cli
sam --version

Build, Package, and Deploy using SAM

  • Clone this repository
git clone https://github.com/dkeightley/eks-auto-drain.git
cd eks-auto-drain
  • Optional: set your AWS region and create an S3 bucket
export AWS_DEFAULT_REGION=<region name>
aws s3 mb s3://<bucket name>
  • Build, package and deploy the project with SAM
sam build --use-container
sam package --output-template-file packaged.yaml --s3-bucket <bucket name>
sam deploy --template-file packaged.yaml --stack-name eks-auto-drain --capabilities CAPABILITY_IAM

Configure

To provide RBAC permissions for the drain, an RBAC group with the required permissions is needed. Once added, the Lambda execution role can be mapped to this group in the aws-auth ConfigMap of each EKS Cluster.

Deploy the RBAC ClusterRole and ClusterRoleBinding for each Cluster

kubectl apply -f rbac/
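The manifests in rbac/ grant the permissions and bind them to the group referenced in the aws-auth mapping below. A minimal sketch of what such manifests typically look like, assuming the group name eks-auto-drain-lambda used in the example mapping (the authoritative versions are the files in rbac/):

# Sketch only - the authoritative manifests are in rbac/
# Assumed permissions needed to cordon a Node and evict its Pods
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: eks-auto-drain-lambda
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "patch"]      # patch cordons the Node
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]                    # evictions perform the drain
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eks-auto-drain-lambda
subjects:
  - kind: Group
    name: eks-auto-drain-lambda
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: eks-auto-drain-lambda
  apiGroup: rbac.authorization.k8s.io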

Obtain the Lambda execution Role

aws cloudformation describe-stacks --stack-name eks-auto-drain --query 'Stacks[0].Outputs[0].OutputValue'
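If preferred, the ARN can be captured in a shell variable for the next step (this assumes, as above, that the execution Role is the stack's first output):

LAMBDA_ROLE_ARN=$(aws cloudformation describe-stacks --stack-name eks-auto-drain \
  --query 'Stacks[0].Outputs[0].OutputValue' --output text)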

Add a mapping for the Role to the RBAC group in each Cluster

Use an imperative command, such as kubectl edit, to modify the ConfigMap and avoid merge conflicts

kubectl edit -n kube-system configmap aws-auth

Example:

mapRoles: |
    - groups:
      - eks-auto-drain-lambda
      rolearn: <Lambda execution Role>
      username: eks-auto-drain-lambda

Add a Lifecycle hook to each Auto Scaling Group for the Nodes in each Cluster

Note: a heartbeat timeout of 300s is used here; adjust as needed. It serves as the overall grace period before the Node termination continues.

Run the command below for each ASG, or use the provided script to loop through several at once

aws autoscaling put-lifecycle-hook --lifecycle-hook-name eks-auto-drain --lifecycle-transition "autoscaling:EC2_INSTANCE_TERMINATING" --heartbeat-timeout 300 --default-result CONTINUE --auto-scaling-group-name <auto scaling group name>

OR

./put-lifecycle-hook.sh asg1 asg2 asg3
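put-lifecycle-hook.sh is assumed to be a thin wrapper that loops over the ASG names passed as arguments; a rough sketch of what it does:

#!/bin/bash
# Sketch of the assumed behaviour of put-lifecycle-hook.sh: add the hook to each ASG argument
for asg in "$@"; do
  aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name eks-auto-drain \
    --lifecycle-transition "autoscaling:EC2_INSTANCE_TERMINATING" \
    --heartbeat-timeout 300 \
    --default-result CONTINUE \
    --auto-scaling-group-name "$asg"
done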

Test

Testing by terminating an instance

Obtain a list of instances in a Cluster:

kubectl get nodes -o=custom-columns=NAME:.metadata.name,INSTANCE:.spec.providerID
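The INSTANCE column shows each Node's providerID in the form aws:///<az>/<instance id>. As an illustration (not part of the project), the instance id alone can be extracted with:

kubectl get node <node name> -o jsonpath='{.spec.providerID}' | awk -F/ '{print $NF}'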

Terminate an instance in an ASG

aws autoscaling terminate-instance-in-auto-scaling-group --no-should-decrement-desired-capacity --instance-id <instance id>

The Node should be cordoned and drained of all Pods before termination. The Lambda function logs show the progress:

sam logs --name LambdaFunction --stack-name eks-auto-drain --tail

Local testing with the SAM CLI

The provided event.json contains an invalid instance id, so the invocation will fail; replace it with a valid instance id from your cluster to see the drain occur.

sam local invoke -e misc/event.json
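misc/event.json is assumed to follow the standard shape of an EC2 Instance-terminate Lifecycle Action event, roughly like the sketch below; EC2InstanceId is the field to replace with a real instance from your cluster:

{
  "detail-type": "EC2 Instance-terminate Lifecycle Action",
  "source": "aws.autoscaling",
  "region": "<region name>",
  "detail": {
    "LifecycleHookName": "eks-auto-drain",
    "AutoScalingGroupName": "<auto scaling group name>",
    "LifecycleActionToken": "<token>",
    "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
    "EC2InstanceId": "<instance id>"
  }
}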

Cleanup

kubectl delete -f rbac/
aws cloudformation delete-stack --stack-name eks-auto-drain
./delete-lifecycle-hook.sh asg1 asg2 asg3

TODO

  • VPC support for private access
