Perma-diff when using self_link for network or subnetwork in google_container_cluster #1382

Closed
ghost opened this issue Apr 25, 2018 · 9 comments

@ghost

ghost commented Apr 25, 2018

This issue was originally opened by @nikitashalnov as hashicorp/terraform#17919. It was migrated here as a result of the provider split. The original body of the issue is below.


Terraform Version

terraform -v
Terraform v0.11.7
+ provider.google v1.9.0
+ provider.random v1.2.0

Terraform Configuration Files

k8s-cluster.tf

-----------OUTPUT OMITTED------------------
# Network
resource "google_compute_network" "cluster-net" {
  name                    = "cluster-net"
  project                 = "${google_project.gke-proj.project_id}"
  auto_create_subnetworks = "false"
}

# Subnet for cluster nodes
resource "google_compute_subnetwork" "nodes-subnet" {
  name          = "nodes-subnet"
  project       = "${google_project.gke-proj.project_id}"
  ip_cidr_range = "10.101.0.0/24"
  network       = "${google_compute_network.cluster-net.self_link}"
  region        = "us-east4"

  secondary_ip_range {
    range_name    = "container-range-1"
    ip_cidr_range = "172.20.0.0/16"
  }

  secondary_ip_range {
    range_name    = "service-range-1"
    ip_cidr_range = "10.200.0.0/16"
  }
}

resource "google_container_cluster" "primary" {
  project            = "${google_project.gke-proj.project_id}"
  name               = "semrush-test"
  zone               = "us-east4-a"
  initial_node_count = 3

  network    = "${google_compute_network.cluster-net.self_link}"
  subnetwork = "${google_compute_subnetwork.nodes-subnet.self_link}"

  ip_allocation_policy {
    cluster_secondary_range_name  = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.0.range_name}"
    services_secondary_range_name = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.1.range_name}"
  }

}

Expected Behavior

terraform plan should show the plan, then terraform apply should apply that plan. After that, running terraform plan on the SAME configuration files should show "No changes. Infrastructure is up-to-date."

Actual Behavior

terraform plan shows the expected plan:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + google_container_cluster.kek
      id:                      <computed>
      additional_zones.#:      <computed>
      addons_config.#:         <computed>
      cluster_ipv4_cidr:       <computed>
      enable_kubernetes_alpha: "false"
      enable_legacy_abac:      "false"
      endpoint:                <computed>
      initial_node_count:      "3"
      instance_group_urls.#:   <computed>
      logging_service:         <computed>
      master_auth.#:           <computed>
      master_version:          <computed>
      monitoring_service:      <computed>
      name:                    "semrush-test"
      network:                 "cluster-net"
      network_policy.#:        <computed>
      node_config.#:           <computed>
      node_pool.#:             <computed>
      node_version:            <computed>
      private_cluster:         "false"
      project:                 "project-id"
      region:                  <computed>
      subnetwork:              "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
      zone:                    "us-east4-a"


Plan: 1 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

terraform apply applies the plan:

Terraform will perform the following actions:

  + google_container_cluster.kek
      id:                      <computed>
      additional_zones.#:      <computed>
      addons_config.#:         <computed>
      cluster_ipv4_cidr:       <computed>
      enable_kubernetes_alpha: "false"
      enable_legacy_abac:      "false"
      endpoint:                <computed>
      initial_node_count:      "3"
      instance_group_urls.#:   <computed>
      logging_service:         <computed>
      master_auth.#:           <computed>
      master_version:          <computed>
      monitoring_service:      <computed>
      name:                    "semrush-test"
      network:                 "cluster-net"
      network_policy.#:        <computed>
      node_config.#:           <computed>
      node_pool.#:             <computed>
      node_version:            <computed>
      private_cluster:         "false"
      project:                 "project-id"
      region:                  <computed>
      subnetwork:              "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
      zone:                    "us-east4-a"


Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_container_cluster.kek: Creating...
  additional_zones.#:      "" => "<computed>"
  addons_config.#:         "" => "<computed>"
  cluster_ipv4_cidr:       "" => "<computed>"
  enable_kubernetes_alpha: "" => "false"
  enable_legacy_abac:      "" => "false"
  endpoint:                "" => "<computed>"
  initial_node_count:      "" => "3"
  instance_group_urls.#:   "" => "<computed>"
  logging_service:         "" => "<computed>"
  master_auth.#:           "" => "<computed>"
  master_version:          "" => "<computed>"
  monitoring_service:      "" => "<computed>"
  name:                    "" => "semrush-test"
  network:                 "" => "cluster-net"
  network_policy.#:        "" => "<computed>"
  node_config.#:           "" => "<computed>"
  node_pool.#:             "" => "<computed>"
  node_version:            "" => "<computed>"
  private_cluster:         "" => "false"
  project:                 "" => "project-id"
  region:                  "" => "<computed>"
  subnetwork:              "" => "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
  zone:                    "" => "us-east4-a"
google_container_cluster.kek: Still creating... (10s elapsed)
google_container_cluster.kek: Still creating... (20s elapsed)
google_container_cluster.kek: Still creating... (30s elapsed)
google_container_cluster.kek: Still creating... (40s elapsed)
google_container_cluster.kek: Still creating... (50s elapsed)
google_container_cluster.kek: Still creating... (1m0s elapsed)
google_container_cluster.kek: Still creating... (1m10s elapsed)
google_container_cluster.kek: Still creating... (1m20s elapsed)
google_container_cluster.kek: Still creating... (1m30s elapsed)
google_container_cluster.kek: Still creating... (1m40s elapsed)
google_container_cluster.kek: Still creating... (1m50s elapsed)
google_container_cluster.kek: Still creating... (2m0s elapsed)
google_container_cluster.kek: Creation complete after 2m4s (ID: semrush-test)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

terraform plan run again on the SAME configuration shows that it wants to recreate the cluster:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

-/+ google_container_cluster.kek (new resource required)
      id:                      "semrush-test" => <computed> (forces new resource)
      additional_zones.#:      "0" => <computed>
      addons_config.#:         "1" => <computed>
      cluster_ipv4_cidr:       "10.24.0.0/14" => <computed>
      enable_kubernetes_alpha: "false" => "false"
      enable_legacy_abac:      "false" => "false"
      endpoint:                "35.186.170.157" => <computed>
      initial_node_count:      "3" => "3"
      instance_group_urls.#:   "1" => <computed>
      logging_service:         "logging.googleapis.com" => <computed>
      master_auth.#:           "1" => <computed>
      master_version:          "1.8.8-gke.0" => <computed>
      monitoring_service:      "monitoring.googleapis.com" => <computed>
      name:                    "semrush-test" => "semrush-test"
      network:                 "cluster-net" => "cluster-net"
      network_policy.#:        "0" => <computed>
      node_config.#:           "1" => <computed>
      node_pool.#:             "1" => <computed>
      node_version:            "1.8.8-gke.0" => <computed>
      private_cluster:         "false" => "false"
      project:                 "project-id" => "project-id"
      region:                  "" => <computed>
      subnetwork:              "nodes-subnet" => "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet" (forces new resource)
      zone:                    "us-east4-a" => "us-east4-a"


Plan: 1 to add, 0 to change, 1 to destroy.

So after creating the GKE cluster, Terraform loses track of some properties (id, subnetwork) and wants to recreate the cluster because it thinks something has changed.

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. terraform plan

Additional Context

While experimenting with workarounds for this bug, I found that the cause is referencing the network and subnetwork properties of the GKE cluster by self_link. Referencing them by name fixes this wrong behavior, so this configuration works fine:

resource "google_container_cluster" "primary" {
  project            = "${google_project.gke-proj.project_id}"
  name               = "super-cluster-new"
  zone               = "us-east4-a"
  initial_node_count = 3

  network    = "${google_compute_network.cluster-net.name}"
  subnetwork = "${google_compute_subnetwork.nodes-subnet.name}"

  ip_allocation_policy {
    cluster_secondary_range_name  = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.0.range_name}"
    services_secondary_range_name = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.1.range_name}"
  }
}
@paddycarver paddycarver changed the title Terraform tries to recreate new just created kubernetes cluster (google provider) Perma-diff when using self_link for network or subnetwork in google_container_cluster Apr 27, 2018
@paddycarver
Contributor

As you identified, it looks like we're storing the name of the network/subnetwork, but you're specifying a self_link. As a workaround, specifying the name should make this go away. A real solution would probably be to either validate the field and throw an error if it's not just a name, accept either a name or a self_link, or suppress the diff when a self_link points at a resource with the right name.
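
For the diff-suppression option, a rough sketch (not the provider's actual implementation; the function name is made up and it only illustrates treating a bare name and a self_link ending in that name as the same value):

import (
    "strings"

    "github.com/hashicorp/terraform/helper/schema"
)

// Sketch only: suppress the diff when old and new refer to the same resource
// name, whether either value is a bare name ("cluster-net") or a full
// self_link (".../global/networks/cluster-net").
func suppressSelfLinkVsNameDiff(k, old, new string, d *schema.ResourceData) bool {
    lastPart := func(s string) string {
        parts := strings.Split(s, "/")
        return parts[len(parts)-1]
    }
    return lastPart(old) == lastPart(new)
}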

@rosbo
Contributor

rosbo commented Apr 30, 2018

We already have a DiffSuppressFunc: compareSelfLinkOrResourceName. However, this function expects the value stored in state to be a self-link.

All the other APIs return a self-link for the network and subnetwork fields; the GKE API returns only the name.

In the Read method, we should use ParseNetworkFieldValue and ParseSubnetworkFieldValue and save the self-link instead.

d.Set("network", cluster.Network)       // wrong: unlike the other APIs, cluster.Network is the name only
d.Set("subnetwork", cluster.Subnetwork) // same

@paddycarver
Contributor

I have feelings about changing this on people mid-major-version, because it's technically a breaking change, though one could argue it probably doesn't matter. But this would change the value returned for every interpolation of network or subnetwork, which seems like a value you'd want to interpolate?

I'm not saying let's leave it weird, but it seems like we could work around it for the moment and put it on the list of changes to make in 2.0.0 to address it at the root. /2¢

@rosbo rosbo added the v2 label May 1, 2018
@danawillow
Contributor

Fixed as part of #1528

@m42u

m42u commented Jun 7, 2018

Hi,

Not sure if I need to open a new issue or use this one...
I'm using v1.13.0 of the Terraform Google provider and I'm still facing this problem:

google_service_account.node: Refreshing state... (ID: projects/xxx/serviceAc....iam.gserviceaccount.com)
data.google_compute_zones.available: Refreshing state...
data.google_compute_subnetwork.gke: Refreshing state...
data.google_compute_network.gke: Refreshing state...
google_container_cluster.gke: Refreshing state... (ID: test-gke-cluster)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  ~ google_container_cluster.gke
      network:    "projects/xxx/global/networks/test-gke-network" => "https://www.googleapis.com/compute/v1/projects/xxx/global/networks/test-gke-network"
      subnetwork: "projects/xxx/regions/europe-west1/subnetworks/test-gke-network-subnet-1" => "test-gke-network-subnet-1"


Plan: 0 to add, 1 to change, 0 to destroy.

I'm using data sources to retrieve the network and subnetwork resources, but I don't think that's related.

My 2¢:

I saw that relative links are stored now, so it matches neither the name nor the self_link attribute and always triggers a change.

Thanks :)

@danawillow
Contributor

Hey @m42u, I think the issue you're experiencing is the same root cause as #988 and #1566, which we haven't been able to figure out a fix for yet.

@duxbuse

duxbuse commented Oct 29, 2018

I am having this issue with the following attributes:
id: "example-cluster" => (forces new resource)
node_pool.0.name: "primary-pool" => "default-pool" (forces new resource)

@paddycarver
Contributor

Hi @duxbuse,

I'm not sure that's an issue related to this one. Do you mind opening a new issue and filling out the issue template? We'll need a bunch more information before we can help you with that, unfortunately.

@ghost
Author

ghost commented Nov 16, 2018

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 16, 2018