
Terraform can't handle GKE issue_client_certificate w/ K8S version 1.12 #3369

Closed
orkenstein opened this issue Apr 3, 2019 · 13 comments · Fixed by GoogleCloudPlatform/magic-modules#1855

@orkenstein

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.11.13
+ provider.external v1.1.0
+ provider.google v2.3.0
+ provider.google-beta v2.3.0
+ provider.kubernetes v1.5.2
+ provider.null v2.1.0
+ provider.random v2.1.0

Affected Resource(s)

google_container_cluster

Terraform Configuration Files

resource "google_container_cluster" "dev" {

  name                     = "dev"
  min_master_version       = "1.12.6-gke.7"
  location                 = "${var.region}"
  remove_default_node_pool = true
  initial_node_count       = 1
  network                  = "${var.network}"

  addons_config {
    network_policy_config {
      disabled = true
    }
  }

  ip_allocation_policy {
    use_ip_aliases = true
  }

  master_auth {
    username = "${var.username}"
    password = "${var.password}"
  }

  provisioner "local-exec" {
    command = "gcloud container clusters get-credentials ${self.name} --region ${self.location}"
  }
}

Debug Output

https://gist.github.com/orkenstein/f68f6a437d2e5057e5d798508f851c66

Panic Output

Nope

Expected Behavior

Cluster should not be changed

Actual Behavior

-/+ module.gke.google_container_cluster.dev (new resource required)

Steps to Reproduce

  1. Create a cluster resource
  2. Terraform apply
  3. Notice cluster changes requested

Important Factoids

References

@ghost ghost added the bug label Apr 3, 2019
@joaosousafranco

I am having exactly the same issue. I believe it is related to the following:

@Remz-Jay

Remz-Jay commented Apr 3, 2019

Important detail seems to be that this only happens to recently created 1.12.6-gke.7 clusters.

Our preexisting 1.11.7-gke.12 clusters (one of which has afterwards been upgraded to said 1.12.6-gke.7) are (thankfully) not recreated.

@orkenstein
Author

Important detail seems to be that this only happens to recently created 1.12.6-gke.7 clusters.

Our preexisting 1.11.7-gke.12 clusters (one of which has afterwards been upgraded to said 1.12.6-gke.7) are (thankfully) not recreated.

I've tried to switch to latest, because of this: Azure/AKS#273

@rileykarson rileykarson self-assigned this Apr 3, 2019
@leg100

leg100 commented Apr 3, 2019

This could well be related to #2183

@rileykarson
Collaborator

rileykarson commented Apr 3, 2019

@joaosousafranco: Those lines have been present for 10 months, so I don't think it's them.

GKE's administration API (i.e. the GCP API the Google provider uses, not the Kubernetes API) behaves differently depending on which Kubernetes version is used at creation time. This is technically not a breaking change on their end, but it is super frustrating for API consumers like Terraform, where we aren't able to encode the rules of a whole other control plane's versioning system well.

K8S version 1.12, which was released recently, was particularly bad about this, as many defaults were changed, such as the issuance of client certificates. That's the likely cause of the issue here; can you share what fields GKE is attempting to change, @orkenstein?

@orkenstein
Author

@rileykarson these lines:

-/+ module.gke.google_container_cluster.dev (new resource required)
      id:                                                   "dev" => <computed> (forces new resource)
...
      master_auth.#:                                        "1" => "1"
      master_auth.0.client_certificate:                     "" => <computed>
      master_auth.0.client_certificate_config.#:            "1" => "0" (forces new resource)
      master_auth.0.client_key:                             <sensitive> => <computed> (attribute changed)
...

and these:

  ~ module.gke.google_container_cluster.dev
      network:                                                                      "projects/<project>/global/networks/default" => "default"
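
For the network diff above, one possible mitigation (a sketch, assuming var.network holds the short network name and that the provider's self-link comparison suppresses the diff; the data source name "cluster" is illustrative) is to reference the network through a data source, so the configured value matches the form the API stores:

data "google_compute_network" "cluster" {
  name = "${var.network}"
}

resource "google_container_cluster" "dev" {
  # ...other arguments as in the original config...
  network = "${data.google_compute_network.cluster.self_link}"
}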

@roidelapluie

Is there a workaround besides the following?

lifecycle {
  ignore_changes = ["master_auth", "network"]
}

@rileykarson rileykarson changed the title Terraform wants to update google_container_cluster on every apply Terraform can't handle GKE issue_client_certificate w/ K8S version 1.12 Apr 26, 2019
@rileykarson
Collaborator

There are two workarounds for the issue where Terraform is causing a recreate. The recreate is a problem with how the provider computes a diff on master_auth.client_certificate_config. As identified above, the following works (it causes Terraform to ignore diffs on master_auth and its children entirely):

lifecycle {
  ignore_changes = ["master_auth"]
}

Alternatively, if you've specified a master_auth block, you can explicitly set issue_client_certificate to false, e.g.:

master_auth {
  client_certificate_config {
    issue_client_certificate = false
  }
}

Finally, if you use a min_master_version of 1.11.X or lower, Terraform should work as intended.
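
Applied to the original configuration at the top of this issue, the second workaround might look like this (a sketch; the username and password attributes are carried over from the report, and only the client_certificate_config block is new):

master_auth {
  username = "${var.username}"
  password = "${var.password}"

  client_certificate_config {
    issue_client_certificate = false
  }
}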


To provide more context on why this and #3240 are happening: as stated before, GKE published K8S 1.12, which changed the behaviour of the GKE API. Both issues are caused by their changes to the issue_client_certificate field, and strictly speaking these are not breaking changes, because the old behaviour still works if you use an older K8S version. That means everything should work fine using 1.11.X or earlier.

This is a case that Terraform providers aren't great at handling. Provider schema typing is defined in code, and we need to make a code change and a new provider release to solve this. A solution to either this issue or #3240 needs to solve the other at the same time; otherwise we're just going to have to make more similar changes, possibly breaking users again, so I'm consolidating both issues here.

When implementing this feature initially, because of how Terraform's diff engine behaves, we had to add shims at various levels to make this value appear correct for users at plan time. That included how we interpreted null values in the API.

The change in defaults means that the meaning of the null value has changed, and that's what caused #3240: the provider is currently only able to send requests with a null value (previously an implicit enablement of client certs) or an explicit disablement.

In addition, the provider considers enabled values (issue_client_certificate = true) equivalent to nulls right now, which complicates things. That's part of the reason for this issue (#3369): that shim was sufficient to solve block-level diffs while the old defaults held, but the 1.12 change broke the assumption.

We'll attempt to massage Terraform so that clusters created with either pre- or post-1.12 versions act sensibly, while preserving behaviour for 1.11 users. At first glance, it's likely we'll end up setting a Terraform-specific default that always enables client certs, which we'll flip to disabled in version 3.0.0 of the provider.
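
In configuration terms, the planned 3.0.0 default flip would mean that clusters which still need client certificates have to opt in explicitly (a hypothetical sketch of what that opt-in could look like, using the existing issue_client_certificate field):

master_auth {
  client_certificate_config {
    issue_client_certificate = true
  }
}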

@rileykarson
Collaborator

This fix will be released in 2.8.0 around Tuesday.
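
Once 2.8.0 is available, a version constraint on the provider block will pick up the fix (a sketch; adjust the constraint as needed):

provider "google" {
  version = "~> 2.8"
}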

@matti

matti commented Jun 2, 2019

It would be better not to close issues before the fix is released.

@ghost

ghost commented Jun 21, 2019

2.8.0 does not seem to resolve the problem. Still getting the error.

@rileykarson
Collaborator

@james-knott can you file a new issue including a config and debug logs?

@ghost

ghost commented Jul 1, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Jul 1, 2019