[bug] script allows passing incorrect kubelet arguments #1791

Open
AlexNabokikh opened this issue May 10, 2024 · 1 comment

Comments

@AlexNabokikh

Hey!

Issue

Recently, the company I work for had an incident caused by incorrect arguments being passed to the kubelet via --kubelet-extra-args in our EKS Terraform configuration.
These arguments are passed to bootstrap.sh by the Terraform provider, and it seems that the script accepts incorrect arguments but doesn't check afterwards whether the kubelet has started.
Nodes with incorrect kubelet arguments cannot start the kubelet and therefore cannot join the cluster. Despite this, EKS does not consider such nodes unhealthy.

Proposed solution

Add checks to determine whether the kubelet has started or not.

Very roughly:

if systemctl is-active --quiet kubelet; then
  log "INFO: kubelet service is active and running."
else
  log "ERROR: kubelet service failed to start."
  exit 1  # Exit if kubelet did not start successfully
fi
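
A slightly more defensive variant, as a sketch only: depending on the unit's service type, systemd may report a unit as active as soon as the process has been spawned, so a kubelet that exits on a bad flag could still pass a single `is-active` check made right after start. The function below polls a few times and only succeeds once the unit has been active for several consecutive polls. The name `wait_for_unit_stable` and its default values are hypothetical and not part of bootstrap.sh.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_unit_stable and its defaults are hypothetical,
# not an existing bootstrap.sh helper.

# Succeeds (returns 0) once `systemctl is-active` has reported the unit as
# active for `required` consecutive polls; returns 1 if that does not happen
# within `attempts` polls.
wait_for_unit_stable() {
  local unit="$1"
  local attempts="${2:-30}"   # total number of polls before giving up
  local required="${3:-5}"    # consecutive active polls needed to succeed
  local interval="${4:-2}"    # seconds to sleep between polls
  local streak=0
  local i

  for (( i = 0; i < attempts; i++ )); do
    if systemctl is-active --quiet "$unit"; then
      streak=$(( streak + 1 ))
      if (( streak >= required )); then
        return 0
      fi
    else
      # The unit is inactive, failed, or restarting: start counting again.
      streak=0
    fi
    sleep "$interval"
  done
  return 1
}

# Possible use at the end of bootstrap.sh (assumes its existing log function):
#   if ! wait_for_unit_stable kubelet; then
#     log "ERROR: kubelet service failed to start."
#     exit 1
#   fi
```

Requiring a streak rather than a single successful check is a design choice meant to catch a kubelet that keeps restarting; the poll count and interval would need tuning against real node startup times.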
@cartermckinnon
Member

I think it'd be a worthwhile improvement to exit bootstrap.sh with a non-zero code if the kubelet unit doesn't become active, but that won't do much in isolation because the outcome of cloud-init user data isn't reported to EC2 (or EKS). Feel free to open a PR 👍 How an orchestrator (EKS managed nodes, Karpenter, cluster-autoscaler, etc.) handles this situation is a separate issue.
