label nodes with the name of the autoscaling group they belong to (if they belong to one) #884

Open
tkellen opened this issue Mar 24, 2024 · 13 comments
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@tkellen

tkellen commented Mar 24, 2024

What would you like to be added:
It would be extremely helpful for operators if nodes were labelled with the autoscaling group they belong to (if there is an autoscaling group associated with a given node). I think node.kubernetes.io/auto-scaling-group-name would be appropriate for this.

Why is this needed:
Alleviate the need to cross-reference autoscaling group instances in AWS with node names in kubectl.

I would be happy to submit a PR if this would be considered for acceptance. Let me know!

/kind feature

k8s-ci-robot added the kind/feature label on Mar 24, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mmerkes
Contributor

mmerkes commented Apr 16, 2024

I don't think this is unreasonable as long as we don't need to add any additional API calls, and it looks like aws:autoscaling:groupName is set on the instances by ASG, so we can just read the tag and apply the label with no additional calls. Thoughts @cartermckinnon?
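The tag-to-label step described above can be sketched in Go roughly as follows. This is an illustration only, not the cloud provider's code: `asgLabel` is a hypothetical helper, the tags are assumed to have already been flattened into a plain map, and the label key is passed in because the thread has not settled on one at this point.

```go
package main

import "fmt"

// asgNameTag is the tag key mentioned in the thread that ASG sets on its instances.
const asgNameTag = "aws:autoscaling:groupName"

// asgLabel is a hypothetical helper. It returns a single-entry label map when
// the instance tags carry an Auto Scaling group name, and nil when they do
// not (the instance is not part of a group).
func asgLabel(tags map[string]string, labelKey string) map[string]string {
	name, ok := tags[asgNameTag]
	if !ok || name == "" {
		return nil
	}
	return map[string]string{labelKey: name}
}

func main() {
	// The label key below is the one proposed in the opening comment.
	const key = "node.kubernetes.io/auto-scaling-group-name"

	inGroup := map[string]string{"Name": "worker-1", asgNameTag: "workers-a"}
	standalone := map[string]string{"Name": "bastion"}

	fmt.Println(asgLabel(inGroup, key))
	fmt.Println(asgLabel(standalone, key))
}
```

Returning nil for untagged instances keeps the "if they belong to one" condition from the issue title: nodes outside any group simply get no label.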

@cartermckinnon
Contributor

cartermckinnon commented Apr 16, 2024

Sounds reasonable, looks like we'll get the tags from ec2:DescribeInstances. ASG guarantees that the tag is added at instance creation, not after launch: https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-tagging.html#tag-lifecycle

Took a look at the code paths and it's a bit of a mess. We should reuse the ec2:DescribeInstances response in our InstancesV2.InstanceMetadata implementation, since we have a few redundant calls today. But we can pass this new label with the AdditionalLabels field.

I think we'll want to use a label under the *.k8s.aws namespace, not node.kubernetes.io
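A rough Go sketch of that idea, building the metadata (including the extra label) from one already-fetched instance description, might look like the following. Everything here is a stand-in: `describedInstance` and `instanceMetadata` are hypothetical local types that mirror only a few fields so the example compiles without external modules, the provider ID format is an assumption, and `asgLabelKey` is a placeholder under the `k8s.aws` namespace rather than an agreed name.

```go
package main

import "fmt"

// describedInstance is a hypothetical, trimmed-down stand-in for one instance
// from an ec2:DescribeInstances response.
type describedInstance struct {
	ID   string
	Type string
	Zone string
	Tags map[string]string
}

// instanceMetadata mirrors a few fields of the cloud-provider InstanceMetadata
// struct, including AdditionalLabels, so the sketch needs no external modules.
type instanceMetadata struct {
	ProviderID       string
	InstanceType     string
	Zone             string
	AdditionalLabels map[string]string
}

// asgLabelKey is a placeholder; the thread has not fixed the final key.
const asgLabelKey = "node.k8s.aws/auto-scaling-group-name"

// metadataFromInstance fills every field from a single described instance, so
// no second API call is needed just to obtain the tags.
func metadataFromInstance(inst describedInstance) instanceMetadata {
	md := instanceMetadata{
		// Assumed provider ID layout: aws:///<zone>/<instance-id>.
		ProviderID:   fmt.Sprintf("aws:///%s/%s", inst.Zone, inst.ID),
		InstanceType: inst.Type,
		Zone:         inst.Zone,
	}
	if name := inst.Tags["aws:autoscaling:groupName"]; name != "" {
		md.AdditionalLabels = map[string]string{asgLabelKey: name}
	}
	return md
}

func main() {
	md := metadataFromInstance(describedInstance{
		ID:   "i-0123456789abcdef0",
		Type: "m5.large",
		Zone: "us-west-2a",
		Tags: map[string]string{"aws:autoscaling:groupName": "workers-a"},
	})
	fmt.Printf("%+v\n", md)
}
```

The point of the shape is that the tag lookup rides along with data the provider already has in hand, which matches the "no additional calls" condition raised earlier in the thread.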

@tkellen
Author

tkellen commented Apr 16, 2024

🙏🏻🙏🏻 the complexity reduction this would bring to my k8s node group upgrade script would be massive 🤞🏻.

@mmerkes
Contributor

mmerkes commented Apr 16, 2024

+1 to what Carter is saying. There's currently one well known label in the cloud provider: topology.k8s.aws/zone-id. Something like node.k8s.aws/auto-scaling-group-name might be the one.

@tkellen
Author

tkellen commented Apr 17, 2024

Is this something you're looking for an outside contributor to implement (not yet sure how I would test it or I would have made an attempt already) or should I sit tight and let y'all do your thing?

@cartermckinnon
Contributor

I'll put something together, I'd like to do some cleanup anyway 😄

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jul 16, 2024
@tkellen
Author

tkellen commented Jul 16, 2024

I am seeing the following labels applied to new nodes:
eks.amazonaws.com/nodegroup
eks.amazonaws.com/sourceLaunchTemplateId
eks.amazonaws.com/sourceLaunchTemplateVersion

Out of curiosity, where was this implemented? I don't see it in this repo?

@mmerkes
Contributor

mmerkes commented Jul 16, 2024

> I am seeing the following labels applied to new nodes:
> eks.amazonaws.com/nodegroup
> eks.amazonaws.com/sourceLaunchTemplateId
> eks.amazonaws.com/sourceLaunchTemplateVersion
>
> Out of curiosity, where was this implemented? I don't see it in this repo?

That's not part of the cloud provider. You must be using EKS Managed Node Groups, which applies those labels.

@tkellen
Author

tkellen commented Jul 16, 2024

Ah! That makes sense, you're absolutely right. So many clusters. So much config 😵‍💫.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Aug 15, 2024
@tkellen
Author

tkellen commented Aug 15, 2024

/remove-lifecycle-rotten

checking in again to see if y'all are desirous of a contribution for this? some small guidance on testing and the cleanup you'd hoped to implement would be enough for me to dig in, I imagine.

Projects
None yet
Development

No branches or pull requests

5 participants