This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Prevent redundant Chart releases during helm-operator startup #1040

Closed
tamarakaufler opened this issue Apr 10, 2018 · 5 comments

Comments

@tamarakaufler
Contributor

Problem

Currently, when the helm-operator restarts, every Chart release in the cluster is deployed again, possibly redundantly.

Acceptance criteria

When the helm-operator starts, only necessary release upgrades should be performed.

@stefanprodan
Member

stefanprodan commented Jul 4, 2018

@squaremo for Helm charts with hooks this is a big deal. For example, on charts that use pre-install and post-install hooks to run migrations, clustering init, etc., this behaviour could break things.

@squaremo squaremo added bug and removed enhancement labels Jul 12, 2018
@squaremo
Member

@stefanprodan I don't think anyone should be relying on Helm's hooks being run at exactly the right times, or exactly the right number of times, irrespective of flux or the flux helm operator.

In any case, we should at least try to avoid doing extra work and making the situation worse. After #1240 is merged, the helm op will

  1. install when a FluxHelmRelease is created and upgrade when a FluxHelmRelease is changed
  2. upgrade when the chart is changed in git
  3. periodically,
    a) install when there's a FluxHelmRelease without a corresponding release
    b) upgrade when a dry-run with the current git contents differs from the current contents

1. is driven by the cluster observer (in operator.go); i.e., it reacts to the resources changing, and effectively runs straight away. When starting, any existing FluxHelmRelease resources are treated as new, and will be updated -- it's likely this that accounts for the redundant updates.

One way to avoid it would be to run the observer-driven releases through the same reconciliation code as 3), i.e., to install if missing and upgrade only if it would make a difference.
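That "install if missing, upgrade only if it would make a difference" reconciliation could be reduced to a small decision function. A minimal sketch, with hypothetical names (the real logic lives across operator.go and chartsync.go and talks to Helm):

```go
package main

import "fmt"

// Action is what the reconciler decides to do for one FluxHelmRelease.
type Action int

const (
	Skip    Action = iota // release exists and nothing would change
	Install               // no release exists for the resource
	Upgrade               // release exists but an upgrade would change it
)

// decide implements the suggested reconciliation: releaseExists reports
// whether Helm already has a release for the resource, and dryRunDiffers
// whether a dry-run upgrade with the current chart/values would produce
// different output from what is deployed.
func decide(releaseExists, dryRunDiffers bool) Action {
	if !releaseExists {
		return Install
	}
	if dryRunDiffers {
		return Upgrade
	}
	return Skip
}

func main() {
	// On a clean restart with nothing changed, every existing release
	// lands in the Skip branch instead of being re-released.
	fmt.Println(decide(true, false) == Skip) // prints "true"
}
```

Running the observer-driven (Add/Update handler) path through the same function as the periodic sync would make a restart a no-op when neither the resources nor git have changed.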

@squaremo
Member

Yep; here's a log from helm-operator as it starts up, after nothing in the cluster or in git has changed:

ERROR: logging before flag.Parse: W0725 12:31:01.919884       5 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
ts=2018-07-25T12:31:01.930298043Z caller=helm.go:81 component=helm info="Helm client set up"
ts=2018-07-25T12:31:01.930358303Z caller=main.go:178 component=helm-operator info="Attempting to clone repo ..." url=ssh://git@github.com/squaremo/flux-helm-test
ts=2018-07-25T12:31:07.041475861Z caller=main.go:186 component=helm-operator info="Repo cloned" url=ssh://git@github.com/squaremo/flux-helm-test
ts=2018-07-25T12:31:07.041661579Z caller=chartsync.go:96 component=chartsync info="Starting charts sync loop"
ts=2018-07-25T12:31:07.042540352Z caller=operator.go:100 component=operator info="Setting up event handlers"
ts=2018-07-25T12:31:07.042650283Z caller=operator.go:122 component=operator info="Event handlers set up"
ts=2018-07-25T12:31:07.042926983Z caller=operator.go:135 component=operator info="Starting operator"
ts=2018-07-25T12:31:07.043104097Z caller=operator.go:137 component=operator info="Waiting for informer caches to sync"
ts=2018-07-25T12:31:07.051707066Z caller=operator.go:105 component=operator info="CREATING release"
ts=2018-07-25T12:31:07.054372798Z caller=operator.go:106 component=operator info="Custom Resource driven release install"
ts=2018-07-25T12:31:07.056908181Z caller=operator.go:105 component=operator info="CREATING release"
ts=2018-07-25T12:31:07.057048415Z caller=operator.go:106 component=operator info="Custom Resource driven release install"
ts=2018-07-25T12:31:07.057610683Z caller=operator.go:105 component=operator info="CREATING release"
ts=2018-07-25T12:31:07.057770786Z caller=operator.go:106 component=operator info="Custom Resource driven release install"
ts=2018-07-25T12:31:07.062780021Z caller=chartsync.go:125 component=chartsync info="no new commits on branch" branch=master head=9d29fc2ffcd264526b64399c69fc2f32a018d531
ts=2018-07-25T12:31:07.143406966Z caller=operator.go:142 component=operator info="Informer caches synced"
ts=2018-07-25T12:31:07.143650506Z caller=operator.go:144 component=operator info="Starting workers"
ts=2018-07-25T12:31:07.14383015Z caller=operator.go:170 component=operator debug="Processing next work queue job ..."
ts=2018-07-25T12:31:07.1440401Z caller=operator.go:173 component=operator debug="PROCESSING item [\"default/frontend\"]"
ts=2018-07-25T12:31:07.144152403Z caller=operator.go:230 component=operator debug="Starting to sync cache key default/frontend"
ts=2018-07-25T12:31:07.144849733Z caller=operator.go:170 component=operator debug="Processing next work queue job ..."
ts=2018-07-25T12:31:07.1456772Z caller=operator.go:173 component=operator debug="PROCESSING item [\"default/mariadb\"]"
ts=2018-07-25T12:31:07.14615505Z caller=operator.go:230 component=operator debug="Starting to sync cache key default/mariadb"
ts=2018-07-25T12:31:07.197915908Z caller=release.go:142 component=release info="releaseName= default-frontend, action=UPDATE, install options: {DryRun:false ReuseName:false}"
ts=2018-07-25T12:31:07.211273286Z caller=release.go:142 component=release info="releaseName= default-mariadb, action=UPDATE, install options: {DryRun:false ReuseName:false}"
ERROR: logging before flag.Parse: I0725 12:31:07.889620       5 event.go:221] Event(v1.ObjectReference{Kind:"FluxHelmRelease", Namespace:"default", Name:"frontend", UID:"9d2a9b1e-8ff3-11e8-a77a-b09e655d7541", APIVersion:"helm.integrations.flux.weave.works/v1alpha2", ResourceVersion:"189071", FieldPath:""}): type: 'Normal' reason: 'ChartSynced' Chart managed by FluxHelmRelease processed successfully
ts=2018-07-25T12:31:07.892836751Z caller=operator.go:215 component=operator info="Successfully synced 'default/frontend'"
ts=2018-07-25T12:31:07.893040627Z caller=operator.go:170 component=operator debug="Processing next work queue job ..."
ts=2018-07-25T12:31:07.893122896Z caller=operator.go:173 component=operator debug="PROCESSING item [\"default/backend\"]"
ts=2018-07-25T12:31:07.893264063Z caller=operator.go:230 component=operator debug="Starting to sync cache key default/backend"
ts=2018-07-25T12:31:07.930948852Z caller=operator.go:215 component=operator info="Successfully synced 'default/mariadb'"
ts=2018-07-25T12:31:07.930993177Z caller=operator.go:170 component=operator debug="Processing next work queue job ..."
ERROR: logging before flag.Parse: I0725 12:31:07.931240       5 event.go:221] Event(v1.ObjectReference{Kind:"FluxHelmRelease", Namespace:"default", Name:"mariadb", UID:"84233afa-8e9e-11e8-a77a-b09e655d7541", APIVersion:"helm.integrations.flux.weave.works/v1alpha2", ResourceVersion:"172375", FieldPath:""}): type: 'Normal' reason: 'ChartSynced' Chart managed by FluxHelmRelease processed successfully
ts=2018-07-25T12:31:07.938724807Z caller=release.go:142 component=release info="releaseName= backend, action=UPDATE, install options: {DryRun:false ReuseName:false}"
ERROR: logging before flag.Parse: I0725 12:31:08.889915       5 event.go:221] Event(v1.ObjectReference{Kind:"FluxHelmRelease", Namespace:"default", Name:"backend", UID:"9d295266-8ff3-11e8-a77a-b09e655d7541", APIVersion:"helm.integrations.flux.weave.works/v1alpha2", ResourceVersion:"189357", FieldPath:""}): type: 'Normal' reason: 'ChartSynced' Chart managed by FluxHelmRelease processed successfully
ts=2018-07-25T12:31:08.896665723Z caller=operator.go:215 component=operator info="Successfully synced 'default/backend'"
ts=2018-07-25T12:31:08.896975009Z caller=operator.go:170 component=operator debug="Processing next work queue job ..."

There are a number of releases done there that aren't strictly necessary, triggered by the Add handler being called.

@stefanprodan
Member

I think the CR object has a timestamp for when it was last updated, maybe we could use that at startup to avoid this.

@squaremo
Member

One way to avoid it would be to run the observer-driven releases through the same reconciliation code as 3), i.e., to install if missing and upgrade only if it would make a difference.

After #1254, helm-op always checks to see whether a release will change anything. That avoids the redundant releases mentioned here.
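In essence, that check renders the chart with a dry-run and compares the result against what is currently deployed, only performing a real upgrade when they differ. A simplified sketch of the comparison step (manifests as plain strings; the real operator obtains them from Helm's dry-run response and the deployed release content):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// changed reports whether a dry-run upgrade would alter the deployed
// release, by comparing manifest digests. Equal digests mean the release
// can be skipped.
func changed(deployedManifest, dryRunManifest string) bool {
	return sha256.Sum256([]byte(deployedManifest)) !=
		sha256.Sum256([]byte(dryRunManifest))
}

func main() {
	cur := "apiVersion: v1\nkind: Service\n"
	fmt.Println(changed(cur, cur))              // prints "false"
	fmt.Println(changed(cur, cur+"# edited\n")) // prints "true"
}
```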
