Skip to content

Commit

Permalink
Replace excluded namespace label with included
Browse files Browse the repository at this point in the history
Part of 9
Next PR would be updating the label concept doc.

Missing `excluded-namespace` label on system namespaces would break the
entire cluster. Instead of always trying hard to make sure the
`excluded-namespace` label are set, this PR replaces it with included
namespace concept (`included-namespace` label added by HC reconciler on
all namespaces except those listed as excluded in the HNC container
args).

Remove the `excluded-namespace` label. Add the `included-namespace` label.
Update HC reonciler to always set the `included-namespace` label to true
on non-excluded namespaces and remove it on excluded namespaces. Update
namespace webhook to block adding/updating the new label improperly.

Update the Makefile to remove the `excluded-namespace` label setting.

Tested by `make test` and `make test-e2e`.
  • Loading branch information
yiqigao217 committed May 14, 2021
1 parent 467d883 commit d11d38e
Show file tree
Hide file tree
Showing 8 changed files with 171 additions and 118 deletions.
39 changes: 5 additions & 34 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -180,23 +180,11 @@ controller-gen:

# Deploy controller in the configured Kubernetes cluster in ~/.kube/config.
#
# We only delete the deployment and the validatingwebhookconfiguration if they
# exist before applying the manifest, because
# We only delete the deployment if it exists before applying the manifest, because
# a) deleting the CRDs will cause all the existing CRs to be wiped away;
# b) if not deleting the deployment, a new image won't be pulled unless the tag changes;
# c) if not deleting the validatingwebhookconfiguration, we cannot label
# namespaces to exclude them if the HNC pod is already in a bad state.
#
# Then we ensure the system namespaces are excluded before we deploy HNC. This
# step is critical because if the HNC pod is ever in a bad state, the object
# webhook service would not respond and would stop everything in the system
# namespaces, such as "kube-system", thus breaking the whole cluster.
# b) if not deleting the deployment, a new image won't be pulled unless the tag changes.
deploy: docker-push kubectl manifests
-kubectl -n hnc-system delete deployment hnc-controller-manager
-kubectl delete validatingwebhookconfiguration hnc-validating-webhook-configuration
-kubectl label ns kube-node-lease hnc.x-k8s.io/excluded-namespace=true --overwrite
-kubectl label ns kube-public hnc.x-k8s.io/excluded-namespace=true --overwrite
-kubectl label ns kube-system hnc.x-k8s.io/excluded-namespace=true --overwrite
kubectl apply -f manifests/${HNC_IMG_NAME}.yaml

deploy-watch:
Expand Down Expand Up @@ -263,7 +251,7 @@ kind-deploy:
# HNC_FOCUS=772 make test-e2e # only runs the test for issue 772
# HNC_FOCUS=Quickstart make test-e2e # Runs all tests in the "Quickstart" Describe block
.PHONY: test-e2e
test-e2e: warn-hnc-repair exclude-system-namespaces
test-e2e: warn-hnc-repair
go clean -testcache
ifndef HNC_FOCUS
go test -v -timeout 0 ./test/e2e/...
Expand All @@ -274,7 +262,7 @@ endif
# This batch test will run e2e tests N times on the current cluster the user
# deployed (either kind or a kubernetes cluster), e.g. "make test-e2e-batch N=10"
.PHONY: test-e2e-batch
test-e2e-batch: warn-hnc-repair exclude-system-namespaces
test-e2e-batch: warn-hnc-repair
number=1 ; while [[ $$number -le $N ]] ; do \
echo $$number ; \
((number = number + 1)) ; \
Expand All @@ -285,27 +273,10 @@ test-e2e-batch: warn-hnc-repair exclude-system-namespaces
# This will *only* run a small number of tests (specifically, the quickstarts). You can run this when you're fairly confident that
# everything will work, but be sure to watch for the postsubmits and periodic tests to ensure they pass too.
.PHONY: test-smoke
test-smoke: exclude-system-namespaces
test-smoke:
go clean --testcache
go test -v -timeout 0 ./test/e2e/... -args --ginkgo.focus "Quickstart"

# exclude-system-namespaces is called before we run any e2e tests. We do ensure
# the system namespaces are excluded in the "deploy" target. However, we need to
# do it here too in case users install HNC by applying manifests. Ensuring the
# system namespaces excluded is critical, because otherwise when HNC pod is in a
# bad state, the whole cluster will break.
exclude-system-namespaces:
@echo
@echo "Ensuring all system namespaces are excluded from HNC..."
@kubectl label ns hnc-system hnc.x-k8s.io/excluded-namespace=true --overwrite
@kubectl label ns kube-node-lease hnc.x-k8s.io/excluded-namespace=true --overwrite
@kubectl label ns kube-public hnc.x-k8s.io/excluded-namespace=true --overwrite
@kubectl label ns kube-system hnc.x-k8s.io/excluded-namespace=true --overwrite
@echo
@echo "If these tests fail due to the webhook not being ready, wait 30s and try again. Note that webhooks can take up to 30s to become ready"
@echo "after HNC is first deployed in a cluster."
@echo

warn-hnc-repair:
@echo "************************************************************"
ifndef HNC_REPAIR
Expand Down
6 changes: 3 additions & 3 deletions api/v1alpha2/hierarchy_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,9 +44,9 @@ const (
// about this standard label when we invented our own).
LabelManagedByApps = "app.kubernetes.io/managed-by"

// LabelExcludedNamespace is the label added by users on the namespaces that
// should be excluded from our validators, e.g. "kube-system".
LabelExcludedNamespace = MetaGroup + "/excluded-namespace"
// LabelIncludedNamespace is the label added by HNC on the namespaces that
// should be enforced by our validators.
LabelIncludedNamespace = MetaGroup + "/included-namespace"
)

const (
Expand Down
6 changes: 0 additions & 6 deletions config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,6 @@ kind: Namespace
metadata:
labels:
control-plane: controller-manager
# Add excluded namespace label on `hnc-system` namespace by default because
# without this label when installing HNC, there will be a deadlock that
# the object webhook fails close so the cert-rotator cannot create/update
# `hnc-webhook-server-cert` secret object for VWHConfiguration, thus the
# webhooks will never be ready.
hnc.x-k8s.io/excluded-namespace: "true"
name: system
---
apiVersion: apps/v1
Expand Down
20 changes: 10 additions & 10 deletions config/webhook/webhook_patch.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,19 +7,19 @@ webhooks:
- name: objects.hnc.x-k8s.io
timeoutSeconds: 2
sideEffects: None
# We only filter out excluded namespaces from this object validator so that
# when HNC (webhook service specifically) is down, operations in the excluded
# We only apply this object validator on non-excluded namespaces, which have
# the "included-namespace" label set by the HC reconciler, so that when HNC
# (webhook service specifically) is down, operations in the excluded
# namespaces won't be affected. Validators on HNC CRs are not filtered because
# they are supposed to prevent abuse of HNC CRs in excluded namespaces.
# Namespace validator is not filtered to prevent abuse of excluded namespace.
# Unfortunately, this means that when HNC is down, we will block updates on all
# namespaces, even "excluded" ones, but anyone who can update namespaces like
# `kube-system` should likely be able to delete the VWHConfiguration to make
# the updates.
# Namespace validator is not filtered to prevent abuse of the included-namespace
# label on excluded namespaces. Unfortunately, this means that when HNC is
# down, we will block updates on all namespaces, even "excluded" ones, but
# anyone who can update namespaces like `kube-system` should likely be able to
# delete the VWHConfiguration to make the updates.
namespaceSelector:
matchExpressions:
- key: hnc.x-k8s.io/excluded-namespace
operator: DoesNotExist
matchLabels:
hnc.x-k8s.io/included-namespace: "true"
rules:
# This overwrites the rules specified in the object validator to patch object
# scope of `namespaced` since there's no kubebuilder marker for `scope`.
Expand Down
68 changes: 54 additions & 14 deletions internal/reconcilers/hierarchy_config.go
Original file line number Diff line number Diff line change
Expand Up @@ -88,13 +88,9 @@ func (r *HierarchyConfigReconciler) Reconcile(ctx context.Context, req ctrl.Requ
ns := req.NamespacedName.Namespace
log := loggerWithRID(r.Log).WithValues("ns", ns)

// Always delete hierarchyconfiguration (and any other HNC CRs) in the
// excluded namespaces and early exit.
// Early exit if it's an excluded namespace.
if config.ExcludedNamespaces[ns] {
// Since singletons in the excluded namespaces are never synced by HNC, there
// are no finalizers on the singletons that we can delete them without
// removing the finalizers first.
return ctrl.Result{}, r.deleteSingletonIfExists(ctx, log, ns)
return ctrl.Result{}, r.handleExcludedNamespace(ctx, log, ns)
}

stats.StartHierConfigReconcile()
Expand All @@ -103,6 +99,35 @@ func (r *HierarchyConfigReconciler) Reconcile(ctx context.Context, req ctrl.Requ
return ctrl.Result{}, r.reconcile(ctx, log, ns)
}

func (r *HierarchyConfigReconciler) handleExcludedNamespace(ctx context.Context, log logr.Logger, nm string) error {
// Get the namespace. Early exist if the namespace doesn't exist or is purged.
// If so, there must be no namespace label or HC instance to delete.
nsInst, err := r.getNamespace(ctx, nm)
if err != nil {
if apierrors.IsNotFound(err) {
return nil
}
return err
}

// Remove the "included-namespace" label on excluded namespace if it exists.
origNS := nsInst.DeepCopy()
r.removeIncludedNamespaceLabel(log, nsInst)
if _, err := r.writeNamespace(ctx, log, origNS, nsInst); err != nil {
return err
}

// Always delete hierarchyconfiguration (and any other HNC CRs) in the
// excluded namespaces. Note: since singletons in the excluded namespaces are
// never synced by HNC, there are no finalizers on the singletons that we can
// delete them without removing the finalizers first.
if err := r.deleteSingletonIfExists(ctx, log, nm); err != nil {
return err
}

return nil
}

func (r *HierarchyConfigReconciler) reconcile(ctx context.Context, log logr.Logger, nm string) error {
// Load the namespace and make a copy
nsInst, err := r.getNamespace(ctx, nm)
Expand All @@ -117,9 +142,9 @@ func (r *HierarchyConfigReconciler) reconcile(ctx context.Context, log logr.Logg
}
origNS := nsInst.DeepCopy()

// Remove excluded namespace label if there's one on the namespace since
// the excluded namespaces should be already skipped earlier.
r.removeExcludedNamespaceLabel(log, nsInst)
// Add the "included-namespace" label to the namespace if it doesn't exist. The
// excluded namespaces should be already skipped earlier.
r.addIncludedNamespaceLabel(log, nsInst)

// Get singleton from apiserver. If it doesn't exist, initialize one.
inst, deletingCRD, err := r.getSingleton(ctx, nm)
Expand Down Expand Up @@ -178,14 +203,29 @@ func (r *HierarchyConfigReconciler) onMissingNamespace(log logr.Logger, nm strin
}
}

// removeExcludedNamespaceLabel removes the `hnc.x-k8s.io/excluded-namespace`
// removeIncludedNamespaceLabel removes the `hnc.x-k8s.io/included-namespace`
// label from the namespace.
func (r *HierarchyConfigReconciler) removeExcludedNamespaceLabel(log logr.Logger, nsInst *corev1.Namespace) {
func (r *HierarchyConfigReconciler) removeIncludedNamespaceLabel(log logr.Logger, nsInst *corev1.Namespace) {
lbs := nsInst.Labels
if _, found := lbs[api.LabelExcludedNamespace]; found {
log.Info("Illegal excluded-namespace label found; removing")
delete(lbs, api.LabelExcludedNamespace)
if _, found := lbs[api.LabelIncludedNamespace]; found {
log.Info("Illegal included-namespace label found; removing")
delete(lbs, api.LabelIncludedNamespace)
nsInst.SetLabels(lbs)
}
}

// addIncludedNamespaceLabel adds the `hnc.x-k8s.io/included-namespace` label to the
// namespace.
func (r *HierarchyConfigReconciler) addIncludedNamespaceLabel(log logr.Logger, nsInst *corev1.Namespace) {
lbs := nsInst.Labels
if lbs == nil {
lbs = make(map[string]string)
}
if v, found := lbs[api.LabelIncludedNamespace]; found && v == "true" {
return
}
log.Info("Adding included-namespace label")
lbs[api.LabelIncludedNamespace] = "true"
nsInst.SetLabels(lbs)
}

Expand Down
35 changes: 20 additions & 15 deletions internal/reconcilers/hierarchy_config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -381,26 +381,31 @@ var _ = Describe("Hierarchy", func() {
Eventually(getLabel(ctx, nms[5], nms[0]+api.LabelTreeDepthSuffix)).Should(Equal("3"))
})

It("should remove excluded namespace labels from non-excluded namespaces", func() {
It("should remove included-namespace namespace labels from excluded namespaces", func() {
config.ExcludedNamespaces = map[string]bool{"kube-system": true}
l := map[string]string{api.LabelExcludedNamespace: "true"}

// Set excluded namespace labels on foo and bar. We are not verifying the
// labels are added here because our reconciler will remove it and the
// verification would be flaky.
fooName := createNSWithLabel(ctx, "foo", l)
barName := createNSWithLabel(ctx, "bar", l)

kubeSystem := getNamespace(ctx, "kube-system")

// Add additional label "other:other" to verify the labels are updated.
l := map[string]string{api.LabelIncludedNamespace: "true", "other": "other"}
kubeSystem.SetLabels(l)
updateNamespace(ctx, kubeSystem)
Eventually(getLabel(ctx, "kube-system", api.LabelExcludedNamespace)).Should(Equal("true"))
// Verify the labels are updated on the namespace.
Eventually(getLabel(ctx, "kube-system", "other")).Should(Equal("other"))
// Verify the included-namespace label is removed by the HC reconciler.
Eventually(getLabel(ctx, "kube-system", api.LabelIncludedNamespace)).Should(Equal(""))
})

It("should set included-namespace namespace labels properly on non-excluded namespaces", func() {
// Create a namespace without any labels.
fooName := createNS(ctx, "foo")
// Verify the label is eventually added by the HC reconciler.
Eventually(getLabel(ctx, fooName, api.LabelIncludedNamespace)).Should(Equal("true"))

// Verify the excluded namespace label is removed from both foo and bar, but
// is not removed from kube-system.
Eventually(getLabel(ctx, fooName, api.LabelExcludedNamespace)).Should(Equal(""))
Eventually(getLabel(ctx, barName, api.LabelExcludedNamespace)).Should(Equal(""))
Eventually(getLabel(ctx, "kube-system", api.LabelExcludedNamespace)).Should(Equal("true"))
l := map[string]string{api.LabelIncludedNamespace: "false"}
// Create a namespace with the label with a wrong value.
barName := createNSWithLabel(ctx, "bar", l)
// Verify the label is eventually updated to have the right value.
Eventually(getLabel(ctx, barName, api.LabelIncludedNamespace)).Should(Equal("true"))
})
})

Expand Down
56 changes: 41 additions & 15 deletions internal/validators/namespace.go
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ func (v *Namespace) handle(req *nsRequest) admission.Response {

switch req.op {
case k8sadm.Create:
if rsp := v.illegalExcludedNamespaceLabel(req); !rsp.Allowed {
if rsp := v.illegalIncludedNamespaceLabel(req); !rsp.Allowed {
return rsp
}
// This check only applies to the Create operation since namespace name
Expand All @@ -85,7 +85,7 @@ func (v *Namespace) handle(req *nsRequest) admission.Response {
return rsp
}
case k8sadm.Update:
if rsp := v.illegalExcludedNamespaceLabel(req); !rsp.Allowed {
if rsp := v.illegalIncludedNamespaceLabel(req); !rsp.Allowed {
return rsp
}
// This check only applies to the Update operation. Creating a namespace
Expand All @@ -106,20 +106,46 @@ func (v *Namespace) handle(req *nsRequest) admission.Response {
return allow("")
}

func (v *Namespace) illegalExcludedNamespaceLabel(req *nsRequest) admission.Response {
for l := range req.ns.Labels {
if l == api.LabelExcludedNamespace && !config.ExcludedNamespaces[req.ns.Name] {
// Note: this only blocks the request if it has a newly added illegal
// excluded-namespace label because existing illegal excluded-namespace
// label should have already been removed by our reconciler. For example,
// even when the VWHConfiguration is removed, adding the label to a non-
// excluded namespace would pass but the label is immediately removed; when
// the VWHConfiguration is there but the reconcilers are down, any request
// gets denied anyway.
msg := fmt.Sprintf("You cannot exclude this namespace using the %q label. See https://github.com/kubernetes-sigs/hierarchical-namespaces/blob/master/docs/user-guide/concepts.md#excluded-namespace-label for detail.", api.LabelExcludedNamespace)
return deny(metav1.StatusReasonForbidden, msg)
}
// illegalIncludedNamespaceLabel checks if there's any illegal use of the
// included-namespace label on namespaces.
// TODO these validating rules will be removed and converted into a mutating
// webhook. See https://github.com/kubernetes-sigs/hierarchical-namespaces/issues/37
func (v *Namespace) illegalIncludedNamespaceLabel(req *nsRequest) admission.Response {
labelValue, hasLabel := req.ns.Labels[api.LabelIncludedNamespace]
isExcluded := config.ExcludedNamespaces[req.ns.Name]

// An excluded namespace should not have the included-namespace label.
//
// Note: this only blocks the request if the excluded namespace is newly
// created or is updated with a newly added included-namespace label because
// existing illegal included-namespace label should have already been removed
// by our reconciler. For example, even when the VWHConfiguration is removed,
// adding the label to an excluded namespace would pass but the label is
// immediately removed; when the VWHConfiguration is there but the reconcilers
// are down, any request gets denied anyway.
if isExcluded && hasLabel {
// Todo add label concept and add `See https://github.com/kubernetes-sigs/hierarchical-namespaces/blob/master/docs/user-guide/concepts.md#included-namespace-label for detail`
// to the message below. Github issue - https://github.com/kubernetes-sigs/hierarchical-namespaces/issues/9
msg := fmt.Sprintf("You cannot enforce webhook rules on this excluded namespace using the %q label.", api.LabelIncludedNamespace)
return deny(metav1.StatusReasonForbidden, msg)
}

// An included-namespace should not have the included-namespace label with a
// value other than "true" at any time. We will allow updating an included
// namespace by removing its included-namespace label, since people or system
// may apply namespace manifests (without the label) automatically and it
// doesn't make sense to block their attempts after the first time.
//
// Note: this only blocks the request if the included namespace is just
// created or updated with a wrong value for the included-namespace label,
// because our reconciler should already set it correctly before the request.
if !isExcluded && hasLabel && labelValue != "true" {
// Todo add label concept and add `See https://github.com/kubernetes-sigs/hierarchical-namespaces/blob/master/docs/user-guide/concepts.md#included-namespace-label for detail`
// to the message below. Github issue - https://github.com/kubernetes-sigs/hierarchical-namespaces/issues/9
msg := fmt.Sprintf("You cannot unset the value of the %q label. Only HNC can edit this label on namespaces.", api.LabelIncludedNamespace)
return deny(metav1.StatusReasonForbidden, msg)
}

return allow("")
}

Expand Down
Loading

0 comments on commit d11d38e

Please sign in to comment.