Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/tfvars: Pull from []Machine instead of InstallConfig #792

Merged
merged 1 commit into from
Feb 7, 2019

Conversation

wking
Copy link
Member

@wking wking commented Dec 5, 2018

The previous implementation pulled from the install-config, but that missed downstream changes like AWS instance type defaults being applied in pkg/asset/machines. With this commit, we pull that information from the cluster-API types, since they're the last touch point for that data. The remaining information still comes from the InstallConfig, but I've split it into explicit arguments to avoid confusion about where data is coming from when InstallConfig's machine pools, etc. overlap with clusterapi.Machine fields.

This is a WIP because there are some FIXMEs around cluster-API support for some of the settings we're currently configuring in InstallConfig machine pools.

Fixes #671.

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 5, 2018
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 5, 2018
pkg/tfvars/aws/aws.go Outdated Show resolved Hide resolved
pkg/tfvars/tfvars.go Outdated Show resolved Hide resolved
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 17, 2018
@wking wking force-pushed the tfvars-from-machines branch from 270da62 to 931b281 Compare December 17, 2018 09:13
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 17, 2018
@wking
Copy link
Member Author

wking commented Dec 17, 2018

Rebased onto master with 270da62 -> 931b281.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 24, 2018
@wking wking force-pushed the tfvars-from-machines branch from 931b281 to 43284f7 Compare January 3, 2019 08:35
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 3, 2019
@ashcrow
Copy link
Member

ashcrow commented Jan 4, 2019

/retest

@crawford crawford added this to the Freeze milestone Jan 10, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 10, 2019
@wking wking removed this from the Freeze milestone Jan 12, 2019
@wking wking force-pushed the tfvars-from-machines branch from 43284f7 to 3230622 Compare January 14, 2019 19:35
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 14, 2019
@wking wking force-pushed the tfvars-from-machines branch from 6d449c5 to 6bfc172 Compare February 7, 2019 05:42
@openshift-ci-robot openshift-ci-robot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 7, 2019
The previous implementation pulled from the install-config, but that
missed downstream changes like AWS instance type defaults being
applied in pkg/asset/machines.  With this commit, we pull that
information from the cluster-API types, since they're the last touch
point for that data.  Some remaining information still comes from the
InstallConfig, but I've split it into explicit arguments to avoid
confusion about where data is coming from when InstallConfig's machine
pools, etc. overlap with clusterapi.Machine fields.

Unfortunately, Master.Machines is not loadable.  This commit fixes the
"Go defaults applied to Master.Machines do not affect Terraform" issue
(which is good), but it will not fix the "I ran create manifests and
edited openshift/99_openshift-cluster-api_master-machines.yaml"
because that is loaded back in downstream of the Master asset.  We'll
address that in follow-up work.

This commit also splits the platform-specific Terraform variable
generation into separate functions generating separate files.  I've
used *.auto.tfvars filenames because Terraform loads those
automatically from the current directory [1].  But that only helps
folks who are trying to run our generated Terraform by hand; as
described in d19cad5 (destroy/bootstrap: explicit pass
`disable-bootstrap.tfvars` for libvirt, 2018-12-13, openshift#900), the
installer runs outside the Terraform directory and needs to pass this
through to Terraform explicitly.

The code copying the platform-specific Terraform variables file down
into the temporary directory for bootstrap deletion has an IsNotExist
check.  At the moment, all of our platforms except for 'none' generate
a platform-specific Terraform variables file.  But we're starting to
move towards "treat platforms you don't recognize as no-ops" in most
locations (e.g. [2]), so we have the check to make "I guess there
wasn't a platform-specific Terraform variables file for this platform"
non-fatal.

I'd prefer if FileList could be an internal property (.files?), but we
currently require public fields for reloading from disk during
multiple-invocation creation [3].

[1]: https://www.terraform.io/docs/configuration/variables.html#variable-files
[2]: https://github.com/openshift/installer/pull/1112/files#diff-6532666297c6115f7774f93ebab396c4R156
[3]: openshift#792 (comment)
@wking wking force-pushed the tfvars-from-machines branch from 6bfc172 to 3326b6b Compare February 7, 2019 05:46
@wking
Copy link
Member Author

wking commented Feb 7, 2019

Rebased around #1175 with 6d449c5 -> 3326b6b. I'm looking forward to being able to drop MachinesDeprecated. @abhinavdahiya, can you take another look?

@abhinavdahiya
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 7, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking
Copy link
Member Author

wking commented Feb 7, 2019

e2e-aws:

Flaky tests:

[Conformance][Area:Networking][Feature:Router] The HAProxy router should serve the correct routes when scoped to a single namespace and label set [Suite:openshift/conformance/parallel/minimal]

Failing tests:

[Conformance][Area:Networking][Feature:Router] The HAProxy router should serve routes that were created from an ingress [Suite:openshift/conformance/parallel/minimal]

/retest

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Feb 7, 2019

@wking: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-openstack 3326b6b link /test e2e-openstack
ci/prow/e2e-libvirt 3326b6b link /test e2e-libvirt

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 193a3f2 into openshift:master Feb 7, 2019
wking added a commit to wking/openshift-installer that referenced this pull request Feb 11, 2019
Since 3326b6b (pkg/tfvars: Pull from []Machine instead of
InstallConfig, 2018-11-28, openshift#792), we've been consuming structured
master configurations to generate Terraform variables.  But before
this commit, the Openshift asset was the WriteableAsset responsible
for the master configs, giving you the following issue:

1. openshift-install create manifests
  i. Master asset populated with structured data
  ii. Openshift asset pulls in Master, flattens to YAML, and writes to
      disk.

2. openshift-install create cluster
  i. Openshift asset reads master YAML from disk.
  ii. TerraformVariables asset pulls in Master, but it's empty.
  iii. Panic [1].

With this commit, responsibility for reading and writing master
manifests to disk gets shifted to the Master asset itself.

I've dropped the structured Master properties (Machines and
MachineDeprecated) in favor of new methods on the asset.  There are
three possible paths towards recovering the asset that we feed in to
TerraformVariables:

a. After rendering it with Generate.  This information had been
   structured before this commit, so no problems here.
b. After reading from the disk.  This information could be
   unmarshalled in Master.Load(), but...
c. After loading it from the state file.  There are no hooks for
   custom unmarshalling in this pathway, so while the generic YAML
   unmarshal would give us a structured machine object, it would not
   unmarshal the RawExtension that holds the provider spec.

The lack of custom-unmarshal hooks for (c) means there's no reliable
way to use Master properties to feed TerraformVariables structured
provider information.  With this commit, we use Master methods
instead.  That's just as efficient for the (b) and (c) cases, and
while it's an uneccessary (de)serialize cycle for (a), that's not too
expensive.

Unpacking the YAML data into the appropriate structures is a bit
hairy.  I'm not familiar with this code though, maybe there's a better
way.  It will help once we complete the shift to the OpenShift
machine-API types started in cf60daa (Pivot aws from cluster.k8s.io
into machine.openshift.io, 2019-02-01, openshift#1175).

I've also taken the opportunity to drop the list.  It saves us a few
lines of code for (de)serializing, and there's no upside to collecting
the Machine configs together in a single file.

The "glob, but not the master files" logic in the Openshift loader is
also a bit ugly.  Moving forward, I expect us to push the remaining
Openshift assets out into their own assets as well, which would allow
us to tighten down on the wildcard there.

There's also a bit of dancing to ensure our Machine filenames are
ordered by increasing index.  I'm padding the filenames with %02d (for
example) to get ...-01.yaml, etc. so they come back in the right order
on load (filepath.Glob sorts its output [2]).  To avoid hard-coding a
pad size, I format the largest index, measure its length, and use that
length to construct padFormat.

[1]: openshift#1205
[2]: https://github.com/golang/go/blob/go1.11.5/src/path/filepath/match.go#L325
@wking wking deleted the tfvars-from-machines branch February 13, 2019 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants