Set provider ID for AWS nodes #951

Closed

Conversation

@vixus0 vixus0 commented Mar 18, 2021

Sets the provider ID for controller and worker nodes on AWS to aws:///<az>/<instance id>.

  • Use afterburn.service on Fedora and coreos-metadata.service on Flatcar to write instance metadata to file.
  • Read instance metadata file in kubelet.service.
  • Set --provider-id in kubelet arguments.
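The three steps above can be sketched outside of systemd. In the PR itself the composition is done by kubelet.service reading the metadata file; the Python below only illustrates the logic. The variable names AFTERBURN_AWS_AVAILABILITY_ZONE and AFTERBURN_AWS_INSTANCE_ID are assumed to be the attributes afterburn writes on AWS (coreos-metadata on Flatcar is expected to use a different prefix), and both helper functions are hypothetical.

```python
from typing import Dict


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, as found in a systemd EnvironmentFile."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def aws_provider_id(
    env: Dict[str, str],
    az_key: str = "AFTERBURN_AWS_AVAILABILITY_ZONE",  # assumed attribute name
    id_key: str = "AFTERBURN_AWS_INSTANCE_ID",        # assumed attribute name
) -> str:
    """Build the provider ID in the aws:///<az>/<instance id> format."""
    return f"aws:///{env[az_key]}/{env[id_key]}"


# Sample metadata file contents (illustrative values only).
sample = """\
AFTERBURN_AWS_AVAILABILITY_ZONE=us-east-1a
AFTERBURN_AWS_INSTANCE_ID=i-01234abcdef
"""

if __name__ == "__main__":
    print(aws_provider_id(parse_env_file(sample)))
```

A missing key raises KeyError, which mirrors the reliability concern with metadata: if the file is absent or incomplete, it is better to fail loudly than to start the kubelet with a malformed provider ID.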

Testing

  • Provisioned two clusters on AWS, one each of Fedora and Flatcar.
  • Checked kubectl describe node had a correct value for ProviderID in the format aws:///<az>/<instance id> for both controllers and workers.
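The manual check above could be automated roughly as follows. This is a sketch, not what was run for the PR: it assumes the input is the JSON printed by kubectl get nodes -o json, reads each node's spec.providerID, and flags nodes whose value does not match the expected format. The regular expression and the function name are illustrative assumptions.

```python
import json
import re
from typing import List

# Assumed shape: aws:///<az>/<instance id>, e.g. aws:///us-east-1a/i-01234abcdef
PROVIDER_ID_RE = re.compile(r"^aws:///[a-z0-9-]+/i-[0-9a-f]+$")


def nodes_with_bad_provider_id(nodes_json: str) -> List[str]:
    """Return the names of nodes whose providerID is missing or malformed."""
    bad: List[str] = []
    for node in json.loads(nodes_json)["items"]:
        name = node["metadata"]["name"]
        provider_id = node.get("spec", {}).get("providerID", "")
        if not PROVIDER_ID_RE.match(provider_id):
            bad.append(name)
    return bad


if __name__ == "__main__":
    import sys

    # Usage (assumed): kubectl get nodes -o json | python check_provider_id.py
    failures = nodes_with_bad_provider_id(sys.stdin.read())
    for name in failures:
        print(f"node {name}: unexpected or missing providerID")
    sys.exit(1 if failures else 0)
```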

Rationale

Autoscaling can be pretty essential for many cluster operators.
Without a provider ID, it is impossible (as far as I know) to use tools such as cluster-autoscaler.

I understand this change isn't ideal as it makes use of the instance metadata service only available on AWS EC2 instances.
However, given there is an aws subdirectory, we already have some explicit notion of which cloud provider we're running against.

Alternative options

  • Autoscale some other way, but nothing comes readily to mind.
  • Set hostname in kubelet to the value of hostname --fqdn.
    This would maybe provide the option for running with an external cloud-controller-manager as it will match the node name to the EC2 instance and set providerID accordingly.
    But then we might need a way to set --cloud-provider=external as well, not sure.

@vixus0 vixus0 changed the title Set kubelet provider-id for AWS workers Set provider ID for AWS workers Mar 18, 2021
@dghubble
Member

I think the relevant link from the cluster-autoscaler project is here with their needs. I'm open to setting provider-id for other unrelated reasons (on controllers too).

However, this should not need an additional systemd unit. Fedora CoreOS has afterburn and Flatcar Linux has a fork as well. At present, the only platform modules that need afterburn enabled are the DigitalOcean ones, so you can look at them for examples. There is some potential reliability concern, since metadata can be a new flake point in bringup.

@vixus0
Contributor Author

vixus0 commented Mar 18, 2021

Yes, the only bits missing from Typhoon that I could see were the following:

If you're managing your own kubelets, they need to be started with the --provider-id flag. The provider id has the format aws:///<availability-zone>/<instance-id>, e.g. aws:///us-east-1a/i-01234abcdef.

There's also the requirement for adding extra tags to the auto scaling group but that can be done with a local-exec provisioner.
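As a rough illustration of the tagging step, the sketch below builds the AWS CLI command that a local-exec provisioner could run. It assumes the cluster-autoscaler auto-discovery tag keys k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster name>; the tag values, the function name, and the example group and cluster names are placeholders, not taken from this PR.

```python
from typing import List


def asg_tag_command(asg_name: str, cluster_name: str) -> List[str]:
    """Build an `aws autoscaling create-or-update-tags` argument list."""
    # Assumed auto-discovery tag keys; the values here are placeholders.
    keys = [
        "k8s.io/cluster-autoscaler/enabled",
        f"k8s.io/cluster-autoscaler/{cluster_name}",
    ]
    tags = [
        f"ResourceId={asg_name},ResourceType=auto-scaling-group,"
        f"Key={key},Value=true,PropagateAtLaunch=false"
        for key in keys
    ]
    return ["aws", "autoscaling", "create-or-update-tags", "--tags", *tags]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(asg_tag_command("example-workers", "example-cluster")))
```

Building the argument list separately from running it keeps the command easy to inspect before it is wired into a provisioner.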

I'll look into using afterburn and then I'll update the PR, thanks for the pointer.

* Use afterburn on Fedora CoreOS and coreos-metadata on Flatcar to fetch
  instance metadata on AWS controller and worker nodes.
* Set --provider-id flag on kubelet from instance metadata.
@vixus0
Contributor Author

vixus0 commented Mar 19, 2021

After spending some time faffing around with Fedora's recent SSH key issues I've managed to successfully test the changes using afterburn and coreos-metadata.

Testing:

  • Provisioned two clusters on AWS, one each of Fedora and Flatcar.
  • Checked kubectl describe node had a correct value for ProviderID in the format aws:///<az>/<instance id> for both controllers and workers.

@vixus0 vixus0 changed the title Set provider ID for AWS workers Set provider ID for AWS nodes Mar 19, 2021
dghubble pushed a commit that referenced this pull request Mar 19, 2021
* Set the Kubelet `--provider-id` on AWS based on metadata from
Fedora CoreOS afterburn or Flatcar Linux coreos-metadata
* Based on #951
dghubble-robot pushed a commit to poseidon/terraform-aws-kubernetes that referenced this pull request Mar 19, 2021
* Set the Kubelet `--provider-id` on AWS based on metadata from
Fedora CoreOS afterburn or Flatcar Linux coreos-metadata
* Based on poseidon/typhoon#951
@dghubble
Member

Rebased and merged as 507c646, thanks Anshul

@dghubble dghubble closed this Mar 19, 2021