-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter managed non-ASG nodes by tag #669
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as you address Brandon's comment (add back the old names for the config values and mark them as deprecated), I am good with you merging this.
I recommend we use a minor version bump when we release (i.e., v1.17.0
), since this may potentially break customers who are using ASG without propagating tags to their instances.
7fa3eb0
to
25df067
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think localstack doesn't like that there's not a Value when specifying the tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested these changes on EKS and verified that they work as expected: NTH no longer calls ASG and does not throw errors when ASG permissions are revoken from its IAM role. NTH checks the tag of the node that receives the ITN, even when that node is not associated with an ASG.
Here's how I tested:
- Create an EKS cluster with a managed node group
- Create an SQS queue and EventBridge rules using the CloudFormation template in this repo
- Modify the node IAM role to add the following inline policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:CompleteLifecycleAction",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeTags",
"sqs:DeleteMessage",
"sqs:ReceiveMessage"
],
"Resource": "*"
}
]
}
- Install NTH v1.16.5 into the cluster in Queue Processor mode
curl -L https://github.com/aws/aws-node-termination-handler/releases/download/v1.16.5/all-resources-queue-processor.yaml -o nth-1.16.5.yaml
# Open the YAML file and update QUEUE_URL value
kubectl apply -f ./nth-1.16.5.yaml
- Launch an EC2 instance:
simple-ec2 launch --capacity-type spot --tags aws-node-termination-handler/managed=
- Send that instance a Spot ITN:
ec2-spot-interrupter --interactive
- Verify that NTH receives the message from the queue but does not act on it
Now to test it with the new code.
- Modify the node IAM role to remove permissions to ASG from the inline policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"sqs:DeleteMessage",
"sqs:ReceiveMessage"
],
"Resource": "*"
}
]
}
- Build NTH locally with these changes (
make build-docker-images
), push that container to a private ECR repo, and install that version of NTH into the EKS cluster
make build-docker-images
docker tag amazon/aws-node-termination-handler:<TAG_OF_LATEST_BUILD> <ECR_REPO_URL>:v1.16.5-gc7b6f88-linux-amd64
docker push <ECR_REPO_URL>:v1.16.5-gc7b6f88-linux-amd64
cp nth-1.16.5.yaml nth-my-build.yaml
# Edit nth-my-build.yaml to point to the ECR image we just pushed
kubectl apply -f ./nth-my-build.yaml
- Launch and EC2 instance and sent it a Spot ITN, as before
- Verify that NTH received the message and attempts to act on it, and that NTH emits no error messages about lack of permissions to access ASG. ✅
Issue #, if available: #654
Description of changes:
check-asg-tag-before-draining
andmanaged-asg-tag
withcheck-tag-before-draining
andmanaged-tag
, updated docs to mark them as deprecated, and removed them from chartsmanagedAsgTag/managedTag
on an instance wouldn't be checked correctly before deferring to ASG lookupBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.