Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade terraform-provider-ct to v0.3.0 to provide ignition schema v2.2. #275

Closed
wants to merge 1 commit into from

Conversation

cskarby
Copy link

@cskarby cskarby commented Aug 10, 2018

Warning

It is not safe to upgrade the terraform-provider-ct plugin between
minor versions on existing managed clusters. Minor version upgrades of this
plugin will re-provision your cluster because the Ignition config output has
changed.

However Typhoon explicitly encourages using a replacement strategy rather than inplace upgrades, which could make it acceptable to upgrade the terraform-provider-ct plugin with a proper warning in documentation.

Changes

Upgraded documentation. The user must upgrade the plugin manually by overwriting the old plugin.

Testing

terraform-provider-ct v0.3.0 is upgraded to using container-linux-config-transpiler v0.8.0. Ran the config transpiler on the raw cl templates, and got the following warnings. The errors in bare-metal is due to the transpiler not understanding the terraform template language (the plugin passes the result from a rendered template to the transpiler.)

./aws/container-linux/kubernetes/cl/controller.yaml.tmpl
warning at line 126, column 7
mode unspecified for file, defaulting to 0644

./aws/container-linux/kubernetes/workers/cl/worker.yaml.tmpl
warning at line 96, column 7
mode unspecified for file, defaulting to 0644

./bare-metal/container-linux/kubernetes/cl/controller.yaml.tmpl
error: yaml: unmarshal errors:
  line 167: cannot unmarshal !!str `${netwo...` into types.Networkd
Failed to parse config

./bare-metal/container-linux/kubernetes/cl/install.yaml.tmpl
./bare-metal/container-linux/kubernetes/cl/worker.yaml.tmpl
error: yaml: unmarshal errors:
  line 100: cannot unmarshal !!str `${netwo...` into types.Networkd
Failed to parse config

./digital-ocean/container-linux/kubernetes/cl/controller.yaml.tmpl
warning at line 132, column 7
mode unspecified for file, defaulting to 0644

./digital-ocean/container-linux/kubernetes/cl/worker.yaml.tmpl
warning at line 102, column 7
mode unspecified for file, defaulting to 0644

./google-cloud/container-linux/kubernetes/cl/controller.yaml.tmpl
warning at line 127, column 7
mode unspecified for file, defaulting to 0644

./google-cloud/container-linux/kubernetes/workers/cl/worker.yaml.tmpl
warning at line 97, column 7
mode unspecified for file, defaulting to 0644

Removing the terraform variable also renders the rest of the bare-metal cl templates without errors.

./bare-metal/container-linux/kubernetes/cl/controller.yaml.tmpl
warning at line 133, column 7
mode unspecified for file, defaulting to 0644

./bare-metal/container-linux/kubernetes/cl/worker.yaml.tmpl
warning at line 94, column 7
mode unspecified for file, defaulting to 0644

WARNING: It is not safe to upgrade the terraform-provider-ct plugin between
minor versions on existing managed clusters. Minor version upgrades of this
plugin will re-provision your cluster because the Ignition config output has
changed.
@dghubble
Copy link
Member

dghubble commented Aug 11, 2018

Thanks for the time you put into this. However, this isn't safe or complete.

I wrote a full warning for plugin versioning with terraform-provider-ct a while ago. Clusters created when using the v0.2.x plugins should continue using the v0.2.x plugins, those created using the v0.3.x plugins should continue using the v0.3.x version. terraform-provider-ct is an unofficial plugin. As such, Terraform treats it globally - each cluster isn't allowed to have its own version of the dependency (as you'd expect with say the aws provider). Use of blue/green deployment does not save you from this problem.

Here, you're updating a single cluster. That's fine. Its the simple case. But users who have existing clusters created using the v0.2.x plugin, if they update to v0.3.0, that means running terraform apply again for any of those clusters managed on the same system is a change. It violates an important invariant - if cluster's definition hasn't changed (i.e. no change to source=..., no change to fields) a terraform apply should be a NoOp. A global plugin version change causes a change to all managed clusters. And it means if they did have a legitimate change (e.g. add new workers) they can't apply it until that cluster has been recreated to use the new plugin. The potential for serious impact and incidents here is large both from the diff's and the way it prevents making legitimate changes until those are resolved. Its a non-starter in my mind.

Ultimately, this will require a much better migration story (here was the plan just for v0.2.0 to v0.2.1). Likely, in upcoming releases, bare-metal clusters will ignore post-creation alterations to user-data. Then enough time must pass that most existing clusters are on a version with that change. Then we'll be able to write docs on updating the global plugin without affecting all (sufficiently new) managed clusters. Alternately, Terraform allows custom plugins to be treated non-globally like official plugins, but that's pretty unlikely.

Regarding the templating issue - templates should be fully evaluated before being used by ct_config. I think this is an artifact of how you're testing, since it should be the case when using the module.

However: If you're a new user, you may feel free to start out using terraform-provider-ct v0.3.0 though. It doesn't contain much that's new, but its not harmful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants