This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Remove enable_tls_bootstrap flag #851

Open
surajssd opened this issue Aug 25, 2020 · 5 comments
Labels
area/kubernetes Core Kubernetes stuff area/stability Stability kind/cleanup Cleanup issues size/s Issues which likely require up to a couple of work hours technical-debt Technical debt-related issues

Comments

@surajssd
Member

Once it is decided to deprecate the static kubelet kubeconfig and TLS bootstrap becomes the default, remove this variable. While doing that, remove the following code:

    kubelet: Remove all static certs
    
    This commit removes all the certs generated statically for kubelet.
    These certs are no longer needed with bootstrap kubelet.
    

diff --git a/assets/lokomotive-kubernetes/bootkube/assets.tf b/assets/lokomotive-kubernetes/bootkube/assets.tf
index 49077916d..5f608f488 100644
--- a/assets/lokomotive-kubernetes/bootkube/assets.tf
+++ b/assets/lokomotive-kubernetes/bootkube/assets.tf
@@ -123,12 +123,6 @@ resource "template_dir" "kubelet" {
   destination_dir = "${var.asset_dir}/charts/kube-system/kubelet"
 }
 
-# Generated kubeconfig for Kubelets
-resource "local_file" "kubeconfig-kubelet" {
-  content  = data.template_file.kubeconfig-kubelet.rendered
-  filename = "${var.asset_dir}/auth/kubeconfig-kubelet"
-}
-
 # Generated admin kubeconfig (bootkube requires it be at auth/kubeconfig)
 # https://github.com/kubernetes-incubator/bootkube/blob/master/pkg/bootkube/bootkube.go#L42
 resource "local_file" "kubeconfig-admin" {
@@ -142,17 +136,6 @@ resource "local_file" "kubeconfig-admin-named" {
   filename = "${var.asset_dir}/auth/${var.cluster_name}-config"
 }
 
-data "template_file" "kubeconfig-kubelet" {
-  template = file("${path.module}/resources/kubeconfig-kubelet")
-
-  vars = {
-    ca_cert      = base64encode(tls_self_signed_cert.kube-ca.cert_pem)
-    kubelet_cert = base64encode(tls_locally_signed_cert.kubelet.cert_pem)
-    kubelet_key  = base64encode(tls_private_key.kubelet.private_key_pem)
-    server       = format("https://%s:%s", var.api_servers[0], var.external_apiserver_port)
-  }
-}
-
 # If var.api_servers_external isn't set, use var.api_servers.
 # This is for supporting separate API server URLs for external clients in a backward-compatible way.
 # The use of split() and join() here is because Terraform's conditional operator ('?') cannot be
diff --git a/assets/lokomotive-kubernetes/bootkube/outputs.tf b/assets/lokomotive-kubernetes/bootkube/outputs.tf
index f70ddd49a..f19cd5aeb 100644
--- a/assets/lokomotive-kubernetes/bootkube/outputs.tf
+++ b/assets/lokomotive-kubernetes/bootkube/outputs.tf
@@ -2,11 +2,6 @@ output "cluster_dns_service_ip" {
   value = cidrhost(var.service_cidr, 10)
 }
 
-// Generated kubeconfig for Kubelets (i.e. lower privilege than admin)
-output "kubeconfig-kubelet" {
-  value = data.template_file.kubeconfig-kubelet.rendered
-}
-
 // Generated kubeconfig for admins (i.e. human super-user)
 output "kubeconfig-admin" {
   value = data.template_file.kubeconfig-admin.rendered
@@ -50,14 +45,6 @@ output "ca_cert" {
   value = base64encode(tls_self_signed_cert.kube-ca.cert_pem)
 }
 
-output "kubelet_cert" {
-  value = base64encode(tls_locally_signed_cert.kubelet.cert_pem)
-}
-
-output "kubelet_key" {
-  value = base64encode(tls_private_key.kubelet.private_key_pem)
-}
-
 output "server" {
   value = format("https://%s:%s", var.api_servers[0], var.external_apiserver_port)
 }
diff --git a/assets/lokomotive-kubernetes/bootkube/resources/kubeconfig-kubelet b/assets/lokomotive-kubernetes/bootkube/resources/kubeconfig-kubelet
deleted file mode 100644
index 5d8fb0c3c..000000000
--- a/assets/lokomotive-kubernetes/bootkube/resources/kubeconfig-kubelet
+++ /dev/null
@@ -1,16 +0,0 @@
-apiVersion: v1
-kind: Config
-clusters:
-- name: local
-  cluster:
-    server: ${server}
-    certificate-authority-data: ${ca_cert}
-users:
-- name: kubelet
-  user:
-    client-certificate-data: ${kubelet_cert}
-    client-key-data: ${kubelet_key}
-contexts:
-- context:
-    cluster: local
-    user: kubelet
diff --git a/assets/lokomotive-kubernetes/bootkube/tls-k8s.tf b/assets/lokomotive-kubernetes/bootkube/tls-k8s.tf
index b7fcd259b..61e461707 100644
--- a/assets/lokomotive-kubernetes/bootkube/tls-k8s.tf
+++ b/assets/lokomotive-kubernetes/bootkube/tls-k8s.tf
@@ -148,47 +148,3 @@ resource "local_file" "service-account-crt" {
   content  = tls_private_key.service-account.public_key_pem
   filename = "${var.asset_dir}/tls/service-account.pub"
 }
-
-# Kubelet
-
-resource "tls_private_key" "kubelet" {
-  algorithm = "RSA"
-  rsa_bits  = "2048"
-}
-
-resource "tls_cert_request" "kubelet" {
-  key_algorithm   = tls_private_key.kubelet.algorithm
-  private_key_pem = tls_private_key.kubelet.private_key_pem
-
-  subject {
-    common_name  = "kubelet"
-    organization = "system:nodes"
-  }
-}
-
-resource "tls_locally_signed_cert" "kubelet" {
-  cert_request_pem = tls_cert_request.kubelet.cert_request_pem
-
-  ca_key_algorithm   = tls_self_signed_cert.kube-ca.key_algorithm
-  ca_private_key_pem = tls_private_key.kube-ca.private_key_pem
-  ca_cert_pem        = tls_self_signed_cert.kube-ca.cert_pem
-
-  validity_period_hours = var.certs_validity_period_hours
-
-  allowed_uses = [
-    "key_encipherment",
-    "digital_signature",
-    "server_auth",
-    "client_auth",
-  ]
-}
-
-resource "local_file" "kubelet-key" {
-  content  = tls_private_key.kubelet.private_key_pem
-  filename = "${var.asset_dir}/tls/kubelet.key"
-}
-
-resource "local_file" "kubelet-crt" {
-  content  = tls_locally_signed_cert.kubelet.cert_pem
-  filename = "${var.asset_dir}/tls/kubelet.crt"
-}
@surajssd surajssd added area/kubernetes Core Kubernetes stuff area/stability Stability kind/cleanup Cleanup issues labels Aug 25, 2020
@surajssd
Member Author

The deprecation period is yet to be decided; to be discussed with the whole team.

Also, this can happen only after this PR is merged: #618.

@invidian
Member

invidian commented Sep 7, 2020

As part of this, we can also consider keeping the old RBAC rules for at least one release. Then TLS bootstrapping can be re-enabled on the cluster and existing nodes will still be functional. That allows the user to replace nodes one by one, enabling a smooth transition to TLS bootstrapping without needing to re-create the cluster. In the release after that, we can remove those things.

@invidian
Member

invidian commented Sep 8, 2020

As part of this, we can also consider keeping the old RBAC rules for at least one release. Then TLS bootstrapping can be re-enabled on the cluster and existing nodes will still be functional. That allows the user to replace nodes one by one, enabling a smooth transition to TLS bootstrapping without needing to re-create the cluster. In the release after that, we can remove those things.

So I looked at how this works in practice; here are my findings:

  • verifyCluster() should probably be called after upgrading system components: New worker pools cannot be added with TLS bootstrap enabled #916
  • if we enable TLS bootstrapping but do not update the self-hosted kubelet, we create an issue: new nodes which join via TLS bootstrapping get a self-hosted kubelet scheduled which does not use bootstrapping, so it will use the bootstrap kubeconfig from the host as a regular kubeconfig, which won't work.
  • we cannot simply update the self-hosted kubelet, because of the runc issue (Enable self-hosted kubelet updates by default #110). However, if one uses Flatcar Alpha, this can be performed, and with the patch below, old kubeconfig files can be used to migrate to TLS bootstrapping just fine.
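The distinction in the second point maps to two kubelet flags: the kubelet uses --bootstrap-kubeconfig only to obtain its first client certificate, then writes a regular kubeconfig to the --kubeconfig path. A kubelet that does not bootstrap but is pointed at the bootstrap file would authenticate with bootstrap-token credentials only, which is insufficient for normal node operation. Roughly (paths are illustrative):

```
kubelet \
  --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig-bootstrap \
  --kubeconfig=/var/lib/kubelet/kubeconfig
```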

Patch to keep existing kubeconfig files working and allow them to be used to migrate to TLS bootstrapping:

diff --git a/assets/charts/control-plane/kubernetes/templates/bootstrap-cluster-role-binding.yaml b/assets/charts/control-plane/kubernetes/templates/bootstrap-cluster-role-binding.yaml
index c6197364..9af5d8a8 100644
--- a/assets/charts/control-plane/kubernetes/templates/bootstrap-cluster-role-binding.yaml
+++ b/assets/charts/control-plane/kubernetes/templates/bootstrap-cluster-role-binding.yaml
@@ -7,6 +7,9 @@ subjects:
 - kind: Group
   name: system:bootstrappers
   apiGroup: rbac.authorization.k8s.io
+- kind: Group
+  name: system:nodes
+  apiGroup: rbac.authorization.k8s.io
 roleRef:
   kind: ClusterRole
   name: system:node-bootstrapper
@@ -21,6 +24,9 @@ subjects:
 - kind: Group
   name: system:bootstrappers
   apiGroup: rbac.authorization.k8s.io
+- kind: Group
+  name: system:nodes
+  apiGroup: rbac.authorization.k8s.io
 roleRef:
   kind: ClusterRole
   name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
diff --git a/assets/charts/control-plane/kubernetes/templates/kubelet-nodes-cluster-role-binding.yaml b/assets/charts/control-plane/kubernetes/templates/kubelet-nodes-cluster-role-binding.yaml
index 26c32c97..5dfcc170 100644
--- a/assets/charts/control-plane/kubernetes/templates/kubelet-nodes-cluster-role-binding.yaml
+++ b/assets/charts/control-plane/kubernetes/templates/kubelet-nodes-cluster-role-binding.yaml
@@ -1,4 +1,3 @@
-{{- if not .Values.kubelet.enableTLSBootstrap }}
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding
 metadata:
@@ -11,4 +10,3 @@ subjects:
 - kind: Group
   name: system:nodes
   apiGroup: rbac.authorization.k8s.io
-{{- end }}

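For clarity, with this patch applied the rendered bootstrap ClusterRoleBinding grants system:node-bootstrapper to both groups, roughly like the following (the metadata name is illustrative; the actual name comes from the chart):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-bootstrapper   # illustrative; actual name comes from the chart
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io
```

This way, existing nodes (members of system:nodes via their old certificates) retain the permissions needed to request new certificates during the migration.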
@invidian invidian added the proposed/next-sprint Issues proposed for next sprint label Sep 23, 2020
@invidian
Member

As the runc update has now hit the stable channel, we can gracefully proceed with this.

@invidian invidian added the technical-debt Technical debt-related issues label Sep 23, 2020
@surajssd surajssd added this to the v0.5.0 milestone Sep 23, 2020
@surajssd surajssd removed the proposed/next-sprint Issues proposed for next sprint label Sep 23, 2020
@surajssd surajssd modified the milestones: v0.5.0, v0.6.0 Oct 12, 2020
@surajssd surajssd added the proposed/next-sprint Issues proposed for next sprint label Oct 14, 2020
@invidian invidian added size/s Issues which likely require up to a couple of work hours blocked Blocked by other issue labels Nov 3, 2020
@iaguis iaguis removed the proposed/next-sprint Issues proposed for next sprint label Nov 4, 2020
@invidian invidian removed the blocked Blocked by other issue label Nov 23, 2020
@invidian
Member

Not sure why this was marked as blocked. I think we should be able to proceed with that.

@invidian invidian modified the milestones: v0.6.0, v0.7.0 Nov 25, 2020
@surajssd surajssd modified the milestones: v0.7.0, 0.8.0 Feb 24, 2021
@iaguis iaguis modified the milestones: v0.8.0, v0.9.0 Jun 23, 2021
@iaguis iaguis removed this from the v0.9.0 milestone Aug 12, 2021