Skip to content
This repository has been archived by the owner on Jun 26, 2023. It is now read-only.

HNC: Deleting a subnamespace after creating resources in the parent namespace results in error #1130

Closed
tomersolomon1 opened this issue Sep 23, 2020 · 8 comments
Milestone

Comments

@tomersolomon1
Copy link

tomersolomon1 commented Sep 23, 2020

The following sequence doesn't work:

  1. Creating a subnamespace by creating a SubnamespaceAnchor resource (the name of the parent ns is hnc-try, and the name of the subnamespace is hnc-try-sub-1).
  2. Creating some RBAC resources in the parent namespace (role + rolebinding).
  3. Deleting the SubnamespaceAnchor resource.

The subnamespace gets stuck in the middle of deletion, and I see the following log:

{"level":"info","ts":1600856582.4537065,"logger":"validators.Object","msg":"Handled","nm":"","resource":{"group":"rbac.authorization.k8s.io","version":"v1","resource":"roles"},"nnm":"hnc-try-sub-1","op":"DELETE","allowed":false,"code":403,"reason":"Forbidden","message":"Cannot delete object propagated from namespace \"hnc-try\""}

By adding some logging to the validator, I got that the username which issued the AdmissionRequest is system:serviceaccount:kube-system:namespace-controller. However, this user isn't exempted from the webhook, and only system:serviceaccounts:hnc-system is exempted. So the AdmissionRequest isn't automatically allowed, and then it is rejected as the resource has the label hnc.x-k8s.io/inheritedFrom: hnc-try (hnc-try is the name of the parent namespace).

I would suggest to exclude the same service accounts as atlasian excludes:
https://github.com/atlassian/voyager/blob/11ade5a733008770a7df4b12aee2399dc2af5d12/pkg/admission/server.go#L61

Environment

Kind version:

kind v0.7.0 go1.13.6 linux/amd64

Kubernetes version:

Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2020-01-14T00:09:19Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}

Operator version - v0.5.2

@tomersolomon1 tomersolomon1 changed the title Deleting a subnamespace after creating resources in the parent namespace results in error HNC: Deleting a subnamespace after creating resources in the parent namespace results in error Sep 23, 2020
@adrianludwin
Copy link
Contributor

Thanks for the bug report. I'm looking at this now at high priority (I have an idea what the cause is) and hopefully can push out a release with a fix tonight or tomorrow.

@adrianludwin adrianludwin added this to the hnc-v0.5.3 milestone Sep 23, 2020
@adrianludwin
Copy link
Contributor

adrianludwin commented Sep 23, 2020

Whoops, I missed the fact that you identified exactly what the cause is :) That seems like a good solution, I'll go with that. Thanks!

@adrianludwin
Copy link
Contributor

This doesn't appear to affect full namespaces. The reason seems to be that as the K8s controller deletes all the objects in the namespace, it also deletes the HierarchyConfiguration object, which orphans the namespace and hence deletes all propagated objects.

But for subnamespaces, even though the HierarchyConfiguration object is deleted, HNC knows right until the end that the subnamespace is a child of the parent, and so it never allows the object to be deleted.

@adrianludwin
Copy link
Contributor

Reproduced this with a simple e2e test

@adrianludwin
Copy link
Contributor

A short-term workaround is to remove the subnamespaceOf annotation in the subnamespace prior to deleting it directly. However, this requires higher privileges than should be required to deal with namespaces.

adrianludwin added a commit to adrianludwin/multi-tenancy that referenced this issue Sep 23, 2020
See issue kubernetes-retired#1130. The HNC object validator was preventing the K8s
namespace controller from cleaning out a namespace to allow it to be
deleted. This change lets HNC understand that when a namespace is being
deleted, we should ignore the subnamespaceOf annotation. This allows the
HierarchyConfiguration object to be reset to its initial state (like a
full namespace would when it's being deleted), orphaning the namespace
and allowing all its propagated objects to be removed.

Tested: added new tests and ran them multiple times while watching the
logs. The subnamespace test failed without this change and passed every
time with it, with the logs showing the expected behaviour.
@adrianludwin
Copy link
Contributor

I've found a cleaner and more minimal fix for v0.5 (#1139), triggered by the fact that this only affected subnamespaces and full namespaces. Basically, we were preventing subnamespaces from being orphaned as K8s deleted their hierarchical configuration; by allowing them to be orphaned, all objects are removed naturally.

With that said, it's generally a good idea not to fight the K8s controllers (it's so annoying when you can't finish deleting a namespace), so I've filed #1140 to address this.

adrianludwin added a commit to adrianludwin/multi-tenancy that referenced this issue Sep 23, 2020
This is the cherrypick from v0.5 to master. The original commit message
is shown below.

Tested: new e2e tests pass; made a slight change to make them more
robust.

---

See issue kubernetes-retired#1130. The HNC object validator was preventing the K8s
namespace controller from cleaning out a namespace to allow it to be
deleted. This change lets HNC understand that when a namespace is being
deleted, we should ignore the subnamespaceOf annotation. This allows the
HierarchyConfiguration object to be reset to its initial state (like a
full namespace would when it's being deleted), orphaning the namespace
and allowing all its propagated objects to be removed.

Tested: added new tests and ran them multiple times while watching the
logs. The subnamespace test failed without this change and passed every
time with it, with the logs showing the expected behaviour.
@adrianludwin
Copy link
Contributor

Fixed in v0.5 and master.

/close

@k8s-ci-robot
Copy link
Contributor

@adrianludwin: Closing this issue.

In response to this:

Fixed in v0.5 and master.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants