
Terraform can't handle GKE issue_client_certificate w/ K8S version 1.12 #3369

Closed
orkenstein opened this issue Apr 3, 2019 · 13 comments · Fixed by GoogleCloudPlatform/magic-modules#1855

@orkenstein

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.11.13
+ provider.external v1.1.0
+ provider.google v2.3.0
+ provider.google-beta v2.3.0
+ provider.kubernetes v1.5.2
+ provider.null v2.1.0
+ provider.random v2.1.0

Affected Resource(s)

google_container_cluster

Terraform Configuration Files

resource "google_container_cluster" "dev" {

  name                     = "dev"
  min_master_version       = "1.12.6-gke.7"
  location                 = "${var.region}"
  remove_default_node_pool = true
  initial_node_count       = 1
  network                  = "${var.network}"

  addons_config {
    network_policy_config {
      disabled = true
    }
  }

  ip_allocation_policy {
    use_ip_aliases = true
  }

  master_auth {
    username = "${var.username}"
    password = "${var.password}"
  }

  provisioner "local-exec" {
    command = "gcloud container clusters get-credentials ${self.name} --region ${self.location}"
  }
}

Debug Output

https://gist.github.com/orkenstein/f68f6a437d2e5057e5d798508f851c66

Panic Output

Nope

Expected Behavior

Cluster should not be changed

Actual Behavior

-/+ module.gke.google_container_cluster.dev (new resource required)

Steps to Reproduce

  1. Create a cluster resource
  2. Terraform apply
  3. Notice cluster changes requested

Important Factoids

References

@ghost ghost added the bug label Apr 3, 2019
@joaosousafranco

I am having exactly the same issue. I believe it is related to the following:

@Remz-Jay

Remz-Jay commented Apr 3, 2019

Important detail seems to be that this only happens to recently created 1.12.6-gke.7 clusters.

Our preexisting 1.11.7-gke.12 clusters (one of which has afterwards been upgraded to said 1.12.6-gke.7) are (thankfully) not recreated.

@orkenstein
Author

Important detail seems to be that this only happens to recently created 1.12.6-gke.7 clusters.

Our preexisting 1.11.7-gke.12 clusters (one of which has afterwards been upgraded to said 1.12.6-gke.7) are (thankfully) not recreated.

I've tried to switch to latest, because of this: Azure/AKS#273

@rileykarson rileykarson self-assigned this Apr 3, 2019
@leg100

leg100 commented Apr 3, 2019

This could well be related to #2183

@rileykarson
Collaborator

rileykarson commented Apr 3, 2019

@joaosousafranco: Those lines have been present for 10 months, so I don't think it's them.

GKE's administration API (i.e. the GCP API the Google provider uses, not the Kubernetes API) behaves differently depending on which Kubernetes version is used at creation time. This is technically not a breaking change on their end, but it is super frustrating for API consumers like Terraform, where we aren't able to encode the rules of a whole other control plane's versioning system well.

K8S version 1.12, which was released recently, was particularly bad about this, as many defaults were changed, such as the issuance of client certificates. That's the likely cause of the issue here; can you share what fields GKE is attempting to change, @orkenstein?

@orkenstein
Author

@rileykarson these lines:

-/+ module.gke.google_container_cluster.dev (new resource required)
      id:                                                   "dev" => <computed> (forces new resource)
...
      master_auth.#:                                        "1" => "1"
      master_auth.0.client_certificate:                     "" => <computed>
      master_auth.0.client_certificate_config.#:            "1" => "0" (forces new resource)
      master_auth.0.client_key:                             <sensitive> => <computed> (attribute changed)
...

and these:

  ~ module.gke.google_container_cluster.dev
      network:                                                                      "projects/<project>/global/networks/default" => "default"
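
For the network diff above, one possible mitigation (a sketch, assuming var.network holds the short network name and that the provider's self-link comparison suppresses the diff; the data source name "cluster" is illustrative) is to reference the network through a data source, so the configured value matches the form the API stores:

data "google_compute_network" "cluster" {
  name = "${var.network}"
}

resource "google_container_cluster" "dev" {
  # ...other arguments as in the original config...
  network = "${data.google_compute_network.cluster.self_link}"
}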

@roidelapluie

Is there a workaround besides the following?

lifecycle {
  ignore_changes = ["master_auth", "network"]
}

@rileykarson rileykarson changed the title Terraform wants to update google_container_cluster on every apply Terraform can't handle GKE issue_client_certificate w/ K8S version 1.12 Apr 26, 2019
@rileykarson
Collaborator

There are two workarounds for the issue where Terraform is causing a recreate. The recreate is a problem with how the provider computes a diff on master_auth.client_certificate_config. As identified above, the following works (it causes Terraform to ignore diffs on master_auth and its children entirely):

lifecycle {
  ignore_changes = ["master_auth"]
}

Alternatively, if you've specified a master_auth block, you can explicitly set issue_client_certificate to false, e.g.:

master_auth {
  client_certificate_config {
    issue_client_certificate = false
  }
}

Finally, if you use a min_master_version of 1.11.X or lower, Terraform should work as intended.
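
Applied to the original configuration at the top of this issue, the second workaround might look like this (a sketch; the username and password attributes are carried over from the report, and only the client_certificate_config block is new):

master_auth {
  username = "${var.username}"
  password = "${var.password}"

  client_certificate_config {
    issue_client_certificate = false
  }
}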


To provide more context on why this and #3240 are happening: as stated before, GKE published K8S 1.12, which changed the behaviour of the GKE API. Both issues are caused by their changes to the issue_client_certificate field, and strictly speaking these are not breaking changes, because the old behaviour still works if you use an older K8S version. That means everything should work fine using 1.11.X or earlier.

This is a case that Terraform providers aren't great at handling. Provider schema typing is defined in code, and we need to make a code change and a new provider release to solve this. A solution to either this issue or #3240 needs to solve the other at the same time; otherwise we're just going to have to make more similar changes, possibly breaking users again, so I'm consolidating both issues here.

When implementing this feature initially, because of how Terraform's diff engine behaves, we had to add shims at various levels to make this value appear correct for users at plan time. That included how we interpreted null values in the API.

The change in defaults means that the meaning of the null value has changed, and that's what caused #3240: the provider is currently only able to send requests with a null value (previously an implicit enablement of client certs) or an explicit disablement.

In addition, the provider considers enabled values (issue_client_certificate = true) equivalent to nulls right now, which complicates things. That's part of the reason for this issue (#3369): that shim was sufficient to solve block-level diffs while the old defaults held, but the 1.12 change broke the assumption.

We'll attempt to massage Terraform so that clusters created with either pre- or post-1.12 versions act sensibly, while preserving behaviour for 1.11 users. At first glance, it's likely we'll end up setting a Terraform-specific default that always enables client certs, which we'll flip to disabled in version 3.0.0 of the provider.
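
In configuration terms, the planned 3.0.0 default flip would mean that clusters which still need client certificates have to opt in explicitly (a hypothetical sketch of what that opt-in could look like, using the existing issue_client_certificate field):

master_auth {
  client_certificate_config {
    issue_client_certificate = true
  }
}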

@rileykarson
Collaborator

This fix will be released in 2.8.0 around Tuesday.
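
Once 2.8.0 is available, a version constraint on the provider block will pick up the fix (a sketch; adjust the constraint as needed):

provider "google" {
  version = "~> 2.8"
}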

@matti

matti commented Jun 2, 2019

It would be better not to close issues before the fix is released.

@ghost

ghost commented Jun 21, 2019

2.8.0 does not seem to resolve the problem. Still getting the error.

@rileykarson
Collaborator

@james-knott can you file a new issue including a config and debug logs?

@ghost

ghost commented Jul 1, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Jul 1, 2019