-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for Helm CRD support #64
Conversation
fc81b66
to
2b28dfd
Compare
proposals/crd-support.md
Outdated
This is an outline for how CRDs can be handled in Helm. There are at least two ways to do this currently, but both have issues. With the implementation suggested here, CRDs are getting special treatment to make sure that they are installed correctly and are updated and deleted in a safe way. CRDs will be managed as part of a release, but will be orphaned rather than deleted when a release is removed. Similar, CRDs will be adopted into a release if they already exist. | ||
|
||
## Motivation | ||
The current ways to handle CRDs in Helm have issues and these are known to have caused real problems for users. The goal here is to handle CRDs in a way that is intuitive and flexible for chart developers, as well as easy to use and safe for helm users. The goal is that this could be introduced in Helm 2 as well as make it into Helm 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a documentation somewhere that points to the current ways to handle CRDs in Helm? If not, can that be included here for completeness sake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a link to an issue that has a great description of the current ways to handle CRDs with Helm and the problems with them.
2b28dfd
to
5ebf680
Compare
proposals/crd-support.md
Outdated
CRDs need to be handled correctly through all the phases of a Helm release: | ||
|
||
### Install | ||
* CRDs should be the first resource processed during installation. CRDs must be installed into the cluster before any instances of the CRD. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be: "CRDs must be installed into ... before any instances of the CR"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a typo. I will update.
proposals/crd-support.md
Outdated
* If a chart contains one or more CRDs, the helm install process should for each of them, check if the CRD is already installed. | ||
* If it is, Helm should check if the installed CRD is identical to the one contained in the chart. | ||
* If it is, the install process can skip installing the CRD and just continue. | ||
* If it is different, the install process should fail, unless the user has explicitly confirmed through a command line flag that the CRD in the chart should overwrite the existing one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be qualified to accommodate the situation of different version CRDs installed side-by-side in a cluster? Overwrite option will preclude that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that once versioned CRDs are a thing, it should be allowed to have multiple version of the same CRD installed at the same time. Overwrite would only apply if a chart is trying to install a CRD where the version matches the installed one. I will update the PR.
* If the CRD was already installed, then the existing CRD will be adopted into the release. | ||
|
||
### Delete | ||
* When a chart is deleted, all CRDs will be orphaned, unless the user has explicitly through a command line flag decided that it should be deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just observing that the delete flag will need to be specified per CRD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, it would require the user to specify which CRDs should be deleted in the situation where a chart contains more than one. The same applies to updates to CRDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a CLI flag to delete all CRDs and not just those specified? It would be a nice shortcut for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am NOT a fan of having to specify individual CRD names to delete. It's tedious, and requires the user to know very deeply what is happening behind the scenes. Considering that there are Kubernetes tools that install more than a dozen CRDs, this can get pretty onerous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thought here was that this would provide a way for users to delete CRDs with a chart, but that the default would be to orphan the CRDs which is the same as the current crd_install hook. The issue here is that there are almost no scenarios when it will be safe to delete a CRD, so deleting any CRD will require some kind of confirmation from the user.
I guess it would be possible to find some way for Helm to count the number of releases that use a specific CRD, and then know that the CRD can be deleted when the last release is deleted. But it seems pretty complicated and it also wouldn't be able to track any usage of a CRD that doesn't go through Helm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRDs are an admin task. You need high level access to even install them. Something many Helm users may not have as corporate environments are more locked down. The person installing an app/CRD and maybe later uninstalling it may likely not have cluster wide context.
I wonder if an admin for the cluster would have enough context to the workloads running in the cluster to know when to delete a CRD.
Is this lack of context, especially in large clusters, a CRD shortcoming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @mortent that automating deletes using a Garbage collection like counting approach might be possible, but seems too complicated.
@mattfarina Isn't the 'context' that you refer to essentially the same thing as the various helm releases in which CRD is being used (from @mortent's comment).
How to use that context (whether programmatically by 'helm delete' or by admin before calling 'helm delete'), that seems to be the main problem. I would not categorize that being a shortcoming of CRD.
if you use the crd-install hook, then you can only install CRDs, and not update them at any point during a chart lifecycle. Also, prior to Helm 2.12, if you installed chart with a crd-install hook that did not have one previously, it deleted the CRDs. Therefore, removing the crd-install hook, so that CRDs are again managed by the Helm charts. Added a `agones.crd.install` parameter, in case someone wants to subchart this chart, then can set this to false, and copy the Agones CRDs into their own charts to be included in the right place for their chart lifecycle. Also, since we have a `agones.crd` config section, moved `agones.enableHelmCleanupHooks` into `agones.crds.cleanupOnDelete` Unfortunately, with this back and forth on the crd-install hook, if you are using the Helm chart, you will need to do a full Agones `helm delete --purge` and cleanup any remaining CRDs to upgrade. More context on helm + crds: - helm/helm#4697 - istio/istio#9604 - istio/istio#7688 - helm/community#64 - helm/helm#4863 - helm/helm#4709
if you use the crd-install hook, then you can only install CRDs, and not update them at any point during a chart lifecycle. Also, prior to Helm 2.12, if you installed chart with a crd-install hook that did not have one previously, it deleted the CRDs. Therefore, removing the crd-install hook, so that CRDs are again managed by the Helm charts. Added a `agones.crd.install` parameter, in case someone wants to subchart this chart, then can set this to false, and copy the Agones CRDs into their own charts to be included in the right place for their chart lifecycle. Also, since we have a `agones.crd` config section, moved `agones.enableHelmCleanupHooks` into `agones.crds.cleanupOnDelete` Unfortunately, with this back and forth on the crd-install hook, if you are using the Helm chart, you will need to do a full Agones `helm delete --purge` and cleanup any remaining CRDs to upgrade. More context on helm + crds: - helm/helm#4697 - istio/istio#9604 - istio/istio#7688 - helm/community#64 - helm/helm#4863 - helm/helm#4709
This is a problem that Helm is going to solve going forward, but for now if you use the crd-install hook, then you can only install CRDs, and not update them at any point during a chart lifecycle. Also, prior to Helm 2.12, if you installed chart with a crd-install hook that did not have one previously, it deleted the CRDs. Therefore, removing the crd-install hook, so that CRDs are again managed by the Helm charts. Added a `agones.crd.install` parameter, in case someone wants to subchart this chart, then can set this to false, and copy the Agones CRDs into their own charts to be included in the right place for their chart lifecycle. Also, since we have a `agones.crd` config section, moved `agones.enableHelmCleanupHooks` into `agones.crds.cleanupOnDelete` Unfortunately, with this back and forth on the crd-install hook, if you are using the Helm chart, you will need to do a full Agones `helm delete --purge` and cleanup any remaining CRDs to upgrade. More context on helm + crds: - helm/helm#4697 - istio/istio#9604 - istio/istio#7688 - helm/community#64 - helm/helm#4863 - helm/helm#4709
This is a problem that Helm is going to solve going forward, but for now if you use the crd-install hook, then you can only install CRDs, and not update them at any point during a chart lifecycle. Also, prior to Helm 2.12, if you installed chart with a crd-install hook that did not have one previously, it deleted the CRDs. Therefore, removing the crd-install hook, so that CRDs are again managed by the Helm charts. Added a `agones.crd.install` parameter, in case someone wants to subchart this chart, then can set this to false, and copy the Agones CRDs into their own charts to be included in the right place for their chart lifecycle. Also, since we have a `agones.crd` config section, moved `agones.enableHelmCleanupHooks` into `agones.crds.cleanupOnDelete` Unfortunately, with this back and forth on the crd-install hook, if you are using the Helm chart, you will need to do a full Agones `helm delete --purge` and cleanup any remaining CRDs to upgrade. More context on helm + crds: - helm/helm#4697 - istio/istio#9604 - istio/istio#7688 - helm/community#64 - helm/helm#4863 - helm/helm#4709
Signed-off-by: Morten Torkildsen <mortent@google.com>
5ebf680
to
de9a0d7
Compare
* If a chart contains one or more CRDs, the helm install process should for each of them, check if the CRD is already installed. | ||
* If it is, Helm should check if the installed CRD is identical to the one contained in the chart. | ||
* If it is, the install process can skip installing the CRD and just continue. | ||
* If it is different, the install process should fail, unless the user has explicitly confirmed through a command line flag that the CRD in the chart should overwrite the existing one. If a CRD is versioned, it should allow multiple different versions of the CRD to coexist in the cluster. Overwrite will only apply if a chart has a CRD with a version that already exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is too hard a requirement. There could be times when a crd update is purely additive. New features are being added to the crd. This won't break the interface and is sometimes preferable to making a whole new api version.
Either helm should detect this is an additive only operation an allow it by default, or have some other annotation that says that upgrading is ok. Maybe a minor version annotation or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that if there is a way to detect whether a change to a CRD would break compatibility, we could relax this requirement. But I don't think a change that is additive can always be considered safe. For example, adding a new required field in a CRD would prevent old clients from creating new CRs.
So CRDs are currently unversioned. Versioned CRDs are coming and once that happens, we should be able to keep multiple versions at the same time and in almost all cases, a change to the CRD will be a new version. Thus, old clients can keep using an old version while new clients use a new version.
* If a CRD is already installed in the cluster, Helm should check if the CRD in the updated chart is in any way different than the existing one | ||
* If it is not, then no changes are needed and the update can continue. | ||
* If the CRD in the updated chart is not identical to the existing CRD, the update process needs explicit approval from the user before updating the CRD. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, update is more complex than it appears. There's no way to infer from a CRD whether it is compatible or incompatible.
"explicit approval" implies a channel back to the user, which will be present in the CLI, but not in the Helm Controller. I think you'd have to go with an explicit decision to force all to upgrade or to not.
Again, I don't think it is tenable to require the user to decide for each CRD whether to update it. Again, it requires the user to know things that users tend not to know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is possible if the developer of the crd adds enough metadata to allow helm to know what the right thing to do is. Maybe some kind of annotation.
* If the CRD in the updated chart is not identical to the existing CRD, the update process needs explicit approval from the user before updating the CRD. | ||
|
||
### Rollback | ||
Rollback would follow the same rules as update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, if there is automatic addition of entries to the crd, then rollback would still need to be manual so the user knows to deal with the fallout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there could be situations where an upgrade of a CRD is safe but the rollback would not be safe.
* If a chart contains one or more CRDs, the helm install process should for each of them, check if the CRD is already installed. | ||
* If it is, Helm should check if the installed CRD is identical to the one contained in the chart. | ||
* If it is, the install process can skip installing the CRD and just continue. | ||
* If it is different, the install process should fail, unless the user has explicitly confirmed through a command line flag that the CRD in the chart should overwrite the existing one. If a CRD is versioned, it should allow multiple different versions of the CRD to coexist in the cluster. Overwrite will only apply if a chart has a CRD with a version that already exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How far down the rabbit hole do we go with rejection? For example, what if a field description changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a way to detect breaking changes to CRDs so we could let some changes go through without user confirmation, that would be great. I'm not sure how easy that is in practice though.
There also isn't any way to know which of the two versions are the "latest" one.
* If it is different, the install process should fail, unless the user has explicitly confirmed through a command line flag that the CRD in the chart should overwrite the existing one. If a CRD is versioned, it should allow multiple different versions of the CRD to coexist in the cluster. Overwrite will only apply if a chart has a CRD with a version that already exists. | ||
* If it does not already exist, the CRD will be installed. | ||
* All CRDs in the chart will be tracked as part of a release when the chart is installed | ||
* If the CRD was already installed, then the existing CRD will be adopted into the release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if two charts install a CRD? The instance of the first would track it in an release. The second chart would install and adopt it into the release. It would then be part of multiple releases?
Should we instead treat it as a global object (which is how k8s sees it)? What benefit do we get from tracking it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the implications of adopting a CRD. It seems like this could get very confusing if a chart installs multiple CRDs. Imagine this case:
Chart A installs CRD X and CRD Y
Chart B Installs CRD X and CRD Z
According to your proposal, now both A and B "own" CRD X. And there is no way to delete CRD Y or CRD Z except by forcing a deletion of all CRDs.
So assume you do a helm delete --delete-crds
on Chart B. Now you've deleted CRD X out from under Chart A. But if you don't do it that way, there is no way to delete CRD Y.
But requiring users to explicitly indicate which CRDs they want to uninstall requires users to get REALLY familiar with their charts (which, IMO, is unfair).
Now, if you skip the adoption part, then only by uninstalling Chart A can you delete CRD X. That is at least somewhat clear, though it only sidesteps part of the problem, as deleting A with --delete-crds
would render B useless. (And yet, still, no way to delete CRD Y from A).
Adopting ANY object is very dangerous, and I suggest avoiding it as an automated behavior simply because it is counterintuitive. (Consider if your OS package manager decided to adopt parts of a program that were installed elsewhere)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see your point here. The thought behind adopting an already existing CRD into a release was to have a way to keep track of which resources (including CRDs) that a release has created/depends on. CRDs are different than other resources, so it does get a little bit confusing.
I think the alternative to trying to keep track of which releases uses each CRD, is the path taken with the crd_install hook where CRDs are installed and "forgotten" by the release. It would have the drawback that CRDs will be left in your cluster after any charts that used it has been deleted, but that might be the most likely outcome even with this proposal.
Tracking CRDs only as part of the release that initially created it seems impractical and confusing.
* If the CRD was already installed, then the existing CRD will be adopted into the release. | ||
|
||
### Delete | ||
* When a chart is deleted, all CRDs will be orphaned, unless the user has explicitly through a command line flag decided that it should be deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a CLI flag to delete all CRDs and not just those specified? It would be a nice shortcut for testing.
|
||
### Update | ||
* Any new CRDs included in a new version of a chart will be handled in the same way that CRDs are handled in the install flow. | ||
* Any CRDs deleted in a new version of a chart will be handled in the same way that CRDs are handled in the delete flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting CRDs is complicated. For example, if another chart has a CR for the CRD and the CRD is deleted... the CRs in the other chart will delete. Doing the "update dance" will not be easy and people should have a deprecation period.
It would be nice, at least for a cluster admin, to see how many CRs are using a CRD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, deleting a CRD is very dangerous unless you have a good understanding of every possible use of the CRD.
Looking up all CRs using a CRD is easy with just kubectl get <name of crd>
, so maybe there can be some support in Helm to help users safely delete CRDs.
|
||
|
||
### Validation | ||
Validation is harder when there are CRDs in the chart since validation of CRs will fail if the CRD has not already been installed. The simple solution here is to keep the current approach where CRDs are installed before doing validation. As a separate effort, we could see if there is a way to do client-side validation of the chart without actually installing the CRD. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was how the hook system worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this doesn't make any changes to how validation works compared to the crd_install hook. Being able to do more validation even when a chart uses CRDs would be helpful, but separate from this discussion.
|
||
|
||
### Hooks | ||
Hooks should run at the same time as now. Pre hooks will run after validation, but before anything is installed. Post hooks will run after all changes have been applied to the cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I don't see how this in any way differs from the current hook scenario, except that it places burden on the user instead of on the chart author, which I think is incorrect behavior
Can we get a list of the situations that an end user needs solved? What are the problem situations they run into today? From this information we can look at things we can layer in that solve those types of situations. For example:
We want to make a simple solution for those who operate and install charts. Considering the experience here. |
Also, I'd like to see at least SOME attempt at merely fixing what's broken with the hook system instead of introducing a specialized code path. |
### Install | ||
* CRDs should be the first resource processed during installation. CRDs must be installed into the cluster before any CRs belonging to the CRD. | ||
* If a chart contains one or more CRDs, the helm install process should for each of them, check if the CRD is already installed. | ||
* If it is, Helm should check if the installed CRD is identical to the one contained in the chart. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working with the install case and attempting to figure out the best way to determine if something is identical. Ideally, we can do idempotent installs. To do a comparison between what we have from the chart and what the server knows I'm noticing we would need to:
- drop the status from the response from the server as well as some fields (e.g.,
metadata.uuid
) - Walk the whole object and do a comparison because ordering is different so you can't hash
There are still issues because:
- Something may add a field that is ok for an end user to also add. For example, in a test I just ran something added to the
spec.additionalPrinterColumns
that was not in the local copy (no mutating webhooks installed either). Since this is in beta there is a good chance these changes may continue to happen. From experience we know that some changes continue to happen even after GA. - How do we handle mutating webhooks that can alter objects when they come in
- Some tools apply their own annotations. Installing cert-manater tells you to use kubectl to add the CRDs. kubectl in tern adds some configuration (
last-applied-configuration
) as an annotation
How can we identify if the CRD has changed with so many outside sources having touched it? I'm open to suggestions.
This is an outline for how CRDs can be handled in Helm. There are at least two ways to do this currently, but both have issues. With the implementation suggested here, CRDs are getting special treatment to make sure that they are installed correctly and are updated and deleted in a safe way. CRDs will be managed as part of a release, but will be orphaned rather than deleted when a release is removed. Similar, CRDs will be adopted into a release if they already exist. | ||
|
||
## Motivation | ||
The current ways to handle CRDs in Helm have issues and these are known to have caused real problems for users (more details in [#4863](https://github.com/helm/helm/issues/4863)). The goal here is to handle CRDs in a way that is intuitive and flexible for chart developers, as well as easy to use and safe for helm users. The goal is that this could be introduced in Helm 2 as well as make it into Helm 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to provide summary of the issues here itself? That way, this proposal will be self-contained. We can point the reader to 4863 to find out more details about the issues.
Hi @mortent! Helm 3 has since been released, and this conversation is being carried forward in helm/helm#5871. Therefore I'm going ahead and closing this. Please feel free to carry on this conversation over in that ticket, as there's been some ongoing discussion since this proposal was first brought up. Thanks! |
Signed-off-by: Morten Torkildsen mortent@google.com