Continuous Delivery via OLM #742

Closed
dgoodwin opened this issue Mar 6, 2019 · 3 comments

dgoodwin commented Mar 6, 2019

We were working on a system to continuously deliver Hive builds to stage, and ultimately to prod, environments. It would look as follows:

  1. Job runs on every merge to master.
  2. Checkout hive.git master.
  3. Build an image and publish it to the registry with the git shorthash as the version (e.g. quay.io/dgoodwin/hive:5f9f2d45).
  4. Run script to generate operator bundle:
  5. Copy CRD definitions in.
  6. Generate CSV from a template:
    1. Extract RBAC rules from our normal Role yaml files and embed.
    2. Extract Deployment spec from our normal Deployment yaml file and embed.
    3. Set version to git shorthash.
    4. Replace image with new image built above.
    5. Update CSV name to hive-operator-shorthash.
    6. Update package file to reference above CSV name.
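
The CSV templating steps above can be sketched as follows. This is a minimal illustration, not the actual Hive script: it assumes the template, Role rules, and Deployment spec have already been loaded from yaml into plain dicts, and the function names, the `service_account` value, and the deployment name are hypothetical. The `spec.install.spec.permissions` / `deployments` layout and the package file's `channels[].currentCSV` field follow the usual OLM shapes.

```python
import copy


def render_csv(template, shorthash, image, rules, deployment_spec,
               service_account="hive-operator"):  # hypothetical account name
    """Return a filled-in copy of a CSV template for one build."""
    csv = copy.deepcopy(template)
    csv["metadata"]["name"] = f"hive-operator-{shorthash}"
    csv["spec"]["version"] = shorthash

    # Point every container in the embedded Deployment at the new image.
    spec = copy.deepcopy(deployment_spec)
    for container in spec["template"]["spec"]["containers"]:
        container["image"] = image

    # Embed the RBAC rules and the Deployment spec in the install strategy.
    install = csv["spec"]["install"]["spec"]
    install["permissions"] = [
        {"serviceAccountName": service_account, "rules": copy.deepcopy(rules)}
    ]
    install["deployments"] = [{"name": "hive-operator", "spec": spec}]
    return csv


def point_package_at(package, channel, csv_name):
    """Return a copy of the package file with `channel` referencing csv_name."""
    package = copy.deepcopy(package)
    for entry in package["channels"]:
        if entry["name"] == channel:
            entry["currentCSV"] = csv_name
    return package
```

Both functions copy their inputs, so the same template can be reused for every build in a job.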

We would then want the stage environment to automatically install new releases as they became available. Prod would possibly be manually triggered, but it's worth noting that not every iteration that goes to stage would go to prod.

At build time, the job does not know anything about past CSVs or what version it is replacing.

It looks like there are several problems we're going to hit with this.

  1. To upgrade, we need all previous CSV versions in the bundle. Under the above scheme there is no persistent data store for CSVs.
  2. We do not know the shorthash of the previous build, which is more state that would need to be maintained somehow.
  3. If a CSV can only reference one version it replaces, and not all versions go to prod, it sounds like prod might easily become un-upgradable. Stage went from A -> B -> C -> D, prod is still on A, and D is deemed ready for prod, but it replaces C, not A.
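
The third problem can be made concrete with a small model of the "replaces" chain. This is only a sketch of the single-replaces relationship as described above, not OLM's actual resolver; `upgrade_path` and the map layout are hypothetical.

```python
def upgrade_path(replaces, installed, target):
    """Follow 'replaces' links back from target until reaching installed.

    `replaces` maps each CSV name to the single CSV name it replaces.
    Returns the hops from installed to target, or None if no chain exists.
    """
    path = [target]
    seen = {target}
    current = target
    while current != installed:
        current = replaces.get(current)
        if current is None or current in seen:
            return None  # chain ended (or looped) before reaching installed
        seen.add(current)
        path.append(current)
    return list(reversed(path))


# Every intermediate CSV is known: prod on A can reach D through B and C.
full = {"B": "A", "C": "B", "D": "C"}
print(upgrade_path(full, "A", "D"))

# Only D is known and it replaces C: there is no chain back to A.
only_latest = {"D": "C"}
print(upgrade_path(only_latest, "A", "D"))
```

In this model, prod is only upgradable while every CSV between its installed version and the target is still known, which is why the lack of a persistent store for past CSVs matters.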

Is what we're trying to accomplish possible today? Are there better approaches or workarounds?


dgoodwin commented Mar 7, 2019

Thinking about this some more, we could use the registry itself as the persistent data store.

During CD jobs we could:

  • generate a new CSV and copy CRDs into a new bundle dir.
  • look up the latest bundle release.
    • we will need to bump it for publishing.
    • parallel processes may be trying to do this. When the second one goes to publish its new bundle, it will fail because the first has already claimed the new bundle version number. This should kill the job and cause a retry.
    • the bundle release is available with a simple GET https://quay.io/cnr/api/v1/packages/dgoodwin/hive-operator. The latest appears to be the last one in the list.
  • download the latest bundle yaml.
  • extract package data from the latest bundle.
  • set the new CSV's "replaces" to the current CSV in the package data for our channel.
  • extract all past CSVs from the latest bundle and copy them to the new bundle dir.
  • extract any CRDs in the latest bundle that are no longer present in the new bundle. Presumably old CRDs must remain in the bundle for old CSVs (an edge case).
  • publish the new bundle dir, now with all past CSVs, each referencing the correct "replaces".
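
The merge part of this flow could look roughly like the sketch below. It assumes the latest bundle has already been downloaded and parsed into a dict; the `{"package", "csvs", "crds"}` layout and both function names are hypothetical, and `next_release` assumes release numbers are plain `X.Y.Z` strings, which the thread does not confirm.

```python
import copy


def update_bundle(latest, new_csv, new_crds, channel):
    """Return a new bundle: the latest bundle plus one new CSV on `channel`."""
    bundle = copy.deepcopy(latest)
    chan = next(c for c in bundle["package"]["channels"] if c["name"] == channel)

    # The new CSV replaces whatever the channel currently points at.
    csv = copy.deepcopy(new_csv)
    csv["spec"]["replaces"] = chan["currentCSV"]
    name = csv["metadata"]["name"]

    # Past CSVs are carried over untouched; the new one is added on top.
    bundle["csvs"][name] = csv
    chan["currentCSV"] = name

    # New CRDs win; CRDs found only in the latest bundle stay for old CSVs.
    bundle["crds"].update(copy.deepcopy(new_crds))
    return bundle


def next_release(releases):
    """Bump the patch number of the highest 'X.Y.Z' release (assumed format)."""
    major, minor, patch = max(tuple(int(p) for p in r.split(".")) for r in releases)
    return f"{major}.{minor}.{patch + 1}"
```

If two jobs compute the same `next_release` value, the second publish would be expected to fail as described above, and the job would simply rerun against the newer bundle.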

Would any part of this functionality fit in operator-courier? Some kind of operator-courier update-bundle command?


ecordell commented Mar 7, 2019

Thanks for writing this up @dgoodwin!

Based on your requirements/environment, there might be several reasonable options:

  • As you outlined, generate a new operator bundle (CSV, CRDs) for your latest version
  • Put that bundle into a catalog by itself
  • Update staging with that small catalog
  • Remove previous staging version and install latest from the new catalog

or, if you have kubectl access to your staging cluster:

  • Same as above, but pull the hash of the currently running CSV and reference it in the replaces field
  • Don't need to remove the old version, publishing the new small catalog will update the staging operator
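
For the kubectl variant, one way to find the value for the replaces field is to parse the output of `kubectl get clusterserviceversions -n <namespace> -o json`. The sketch below only handles the parsing; the function name, the name prefix, and the choice to treat a CSV in the `Succeeded` phase as "currently running" are assumptions for illustration.

```python
import json


def running_csv_name(csv_list_json, prefix):
    """Pick the single Succeeded CSV whose name starts with `prefix`.

    `csv_list_json` is the text printed by
    `kubectl get clusterserviceversions -n <namespace> -o json`.
    """
    items = json.loads(csv_list_json).get("items", [])
    names = [
        item["metadata"]["name"]
        for item in items
        if item["metadata"]["name"].startswith(prefix)
        and item.get("status", {}).get("phase") == "Succeeded"
    ]
    if len(names) != 1:
        raise ValueError(f"expected exactly one running CSV, found {names}")
    return names[0]
```

The returned name can then be written into the new CSV's replaces field before the small catalog is published. Raising on zero or multiple matches keeps the job from guessing which version it is replacing.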

or, if you want to keep a record of everything you've done:

  • Once you have a bundle, add it to a catalog and push the catalog to an image registry
  • During CI, pull down the previous version of that catalog image, and add a layer to it to add your latest version on top (you'll hit layer limits unless you do squashing as well, but if you use quay you can have quay do that for you)
  • Push the new catalog back to the registry and reference it in your staging cluster. OLM will update the operator and you'll have a catalog with old versions.
  • Depending on how frequently you're doing this, you may want to prune older versions. You could aggressively prune everything but (latest - 1) each time you build a catalog.
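
The "everything but (latest - 1)" pruning can be sketched as a walk back along replaces links from the newest CSV. The catalog is modeled here as a hypothetical dict of CSV name to a dict with a `replaces` key; a real catalog image would need its own tooling.

```python
def prune_catalog(csvs, head, keep=2):
    """Keep `head` and the CSVs it transitively replaces, up to `keep` entries.

    `csvs` maps CSV name -> {"replaces": <name or None>, ...}.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    kept = {}
    current = head
    while current is not None and current in csvs and len(kept) < keep:
        kept[current] = csvs[current]
        current = csvs[current].get("replaces")
    return kept


catalog = {
    "A": {"replaces": None},
    "B": {"replaces": "A"},
    "C": {"replaces": "B"},
    "D": {"replaces": "C"},
}
print(list(prune_catalog(catalog, "D")))  # ['D', 'C']
```

Note that after pruning, the oldest kept entry (C here) still names a CSV that is no longer in the catalog, so an environment that is more than one version behind would have no chain to follow.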

As your second comment suggests, you can do similar things by looping courier / appregistry into your workflow, but you will need marketplace running for that.

> Would any part of this functionality fit in operator-courier? Some kind of operator-courier update-bundle command?

Maybe - it also overlaps a little with operator-sdk's olm tooling.

@dgoodwin

We ended up building a registry image in the CI job that maps each version to the next. Closing for now.
