Enable logs/exec for EKS 1.19+ #120

Open
2 tasks done
adrienjt opened this issue Sep 17, 2021 · 4 comments · Fixed by #139
Labels
bug Something isn't working

Comments

@adrienjt
Contributor

adrienjt commented Sep 17, 2021

EKS stopped signing node server certs with the certificates.k8s.io/v1beta1 API, but would sign them with the v1 API? (Other distributions, e.g., AKS and GKE, continue to sign node server certs with the certificates.k8s.io/v1beta1 API up to and including 1.21, so EKS's behavior is surprising.)

You can continue to request that a CSR to is signed for a non-node server cert, webhooks (for example, with the certificates.k8s.io/v1beta1 API). [sic]

https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-1.19

  • Hotfix solution: disable logs/exec if waiting for the cert to be signed times out, effectively dropping support for logs/exec on EKS 1.19+.
  • ASAP: upgrade k8s dependencies to at least 1.19 to send CSRs with certificates.k8s.io/v1 on Kubernetes 1.19+ clusters, but continue to send them with v1beta1 on 1.17-1.18, because v1 was only introduced in 1.19. This will also help us support 1.22, where v1beta1 is unavailable.
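The version switch in the second bullet could look roughly like the sketch below. It is only an illustration: csrAPIVersion is a hypothetical helper, not a function in Admiralty, and it assumes the major and minor strings come from the API server's version endpoint, where some managed distributions append a "+" to the minor version.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// csrAPIVersion (hypothetical helper) picks the certificates.k8s.io version
// to use when sending CSRs: v1 on Kubernetes 1.19+, v1beta1 before that.
func csrAPIVersion(major, minor string) (string, error) {
	maj, err := strconv.Atoi(major)
	if err != nil {
		return "", fmt.Errorf("parsing major version %q: %w", major, err)
	}
	// Tolerate a trailing "+" in the minor version, e.g. "19+".
	mnr, err := strconv.Atoi(strings.TrimSuffix(minor, "+"))
	if err != nil {
		return "", fmt.Errorf("parsing minor version %q: %w", minor, err)
	}
	if maj > 1 || (maj == 1 && mnr >= 19) {
		return "certificates.k8s.io/v1", nil
	}
	return "certificates.k8s.io/v1beta1", nil
}

func main() {
	for _, minor := range []string{"17", "18", "19+", "22"} {
		v, err := csrAPIVersion("1", minor)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("1.%s -> %s\n", minor, v)
	}
}
```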
@adrienjt adrienjt added the bug Something isn't working label Sep 17, 2021
@adrienjt adrienjt changed the title from "Admiralty hangs at startup on EKS 1.19+" to "Enable logs/exec for EKS 1.19+" Sep 29, 2021
@adrienjt
Contributor Author

fixed by #139

@adrienjt
Contributor Author

This is more complicated than I thought.

By design EKS does not issue certificates for CSRs with signerName "kubernetes.io/kubelet-serving" unless the CSR was actually requested by a kubelet. EKS's custom signer validates this by checking that the requested SANs for CSRs with signerName kubernetes.io/kubelet-serving match an actual EC2 instance's IPs/DNS names. In other words, EKS does not issue certificates for CSRs with signerName kubernetes.io/kubelet-serving posing as kubelets, it only issues certificates for CSRs with signerName kubernetes.io/kubelet-serving for actual kubelets.

aws/containers-roadmap#1604 (comment)

We must use that signer because this is the only one trusted by the API server when it calls logs/exec endpoints.

Currently, Admiralty creates and self-approves a CSR for the Admiralty controller-manager/agent pod IP, and the Node object representing each target specifies that IP as its address. The pod exposes logs/exec endpoints on port 10250, the default kubelet port.

Here's a trick I thought of: we could expose those endpoints with hostPort on a port other than the default (because the default is already taken by the kubelet of the node hosting the Admiralty pod), and change the virtual node objects to use the Admiralty pod's hosting node IP as their address (status.addresses) and the chosen non-default port (status.daemonEndpoints.kubeletEndpoint.Port). The EKS control plane should sign the CSR because the IP is that of an actual EC2 instance. We'd want to make sure that security groups allow traffic from the API server to the virtual kubelet port.

Let's first confirm whether the EKS control plane would sign such a CSR...

cc @fakeburst

@adrienjt adrienjt reopened this Apr 15, 2022
@fakeburst

@adrienjt I'll try this approach in my environment

@flowinh2o

flowinh2o commented Jun 6, 2024

Any update on this? I am running an EKS cluster, version 1.29, and seeing the pod logs/exec error in the controller logs:

main.go:329] timed out waiting for virtual kubelet serving certificate to be signed, pod logs/exec won't be supported

I am running the Admiralty helm chart version 0.16.0. Thank you
