
Examples are not working on master branch, neither on last release (7.0.0), "network" field not populated ? #423

Closed
theobolo opened this issue Feb 6, 2020 · 17 comments · Fixed by #434
Labels: bug (Something isn't working) · P3 (medium priority issues) · triaged (Scoped and ready for work)

@theobolo

theobolo commented Feb 6, 2020

Hello,
I'm testing the terraform-google-kubernetes-engine modules, specifically the "beta-public-cluster" module, but I'm running into an issue with the "network" field.

I'm trying something like this:

main.tf (examples/simple_regional_beta)

locals {
  cluster_type = "simple-regional-beta"
}

provider "google-beta" {
  version = "~> 3.3.0"
  region  = var.region
}

module "gke" {
  source                      = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version                     = "7.0.0"
  project_id                  = var.project_id
  name                        = "${local.cluster_type}-cluster${var.cluster_name_suffix}"
  regional                    = var.regional
  region                      = var.region
  zones                       = var.zones
  network                     = var.network
  subnetwork                  = var.subnetwork
  ip_range_pods               = var.ip_range_pods
  ip_range_services           = var.ip_range_services
  create_service_account      = var.compute_engine_service_account == "create"
  istio                       = var.istio
  cloudrun                    = var.cloudrun
  node_metadata               = var.node_metadata
  sandbox_enabled             = var.sandbox_enabled
  remove_default_node_pool    = var.remove_default_node_pool
  node_pools                  = var.node_pools
  database_encryption         = var.database_encryption
  enable_binary_authorization = var.enable_binary_authorization
  pod_security_policy_config  = var.pod_security_policy_config
  release_channel             = "UNSPECIFIED"
}

data "google_client_config" "default" {
}

variables.tf (examples/simple_regional_beta)

variable "project_id" {
  description = "The project ID to host the cluster in"
  default     = "fleeters-cloud"
}

variable "cluster_name_suffix" {
  description = "A suffix to append to the default cluster name"
  default     = ""
}

variable "region" {
  description = "The region to host the cluster in"
  default     = "europe-west1"
}

variable "network" {
  description = "The VPC network to host the cluster in"
  default     = "network-test"
}

variable "subnetwork" {
  description = "The subnetwork to host the cluster in"
  default     = "subnetwork-test"
}

variable "ip_range_pods" {
  description = "The secondary ip range to use for pods"
  default     = "pods-ip-range"
}

variable "ip_range_services" {
  description = "The secondary ip range to use for services"
  default     = "services-ip-range"
}

variable "compute_engine_service_account" {
  description = "Service account to associate to the nodes in the cluster"
  default     = "someServiceAccount"
}

variable "istio" {
  description = "Boolean to enable / disable Istio"
  default     = true
}

variable "cloudrun" {
  description = "Boolean to enable / disable CloudRun"
  default     = true
}

variable "node_metadata" {
  description = "Specifies how node metadata is exposed to the workload running on the node"
  default     = "SECURE"
  type        = string
}

variable "sandbox_enabled" {
  type        = bool
  description = "(Beta) Enable GKE Sandbox (Do not forget to set `image_type` = `COS_CONTAINERD` and `node_version` = `1.12.7-gke.17` or later to use it)."
  default     = false
}

variable "remove_default_node_pool" {
  type        = bool
  description = "Remove default node pool while setting up the cluster"
  default     = false
}

variable "node_pools" {
  type        = list(map(string))
  description = "List of maps containing node pools"

  default = [
    {
      name = "default-node-pool"
    },
  ]
}

variable "database_encryption" {
  description = "Application-layer Secrets Encryption settings. The object format is {state = string, key_name = string}. Valid values of state are: \"ENCRYPTED\"; \"DECRYPTED\". key_name is the name of a CloudKMS key."
  type        = list(object({ state = string, key_name = string }))
  default = [{
    state    = "DECRYPTED"
    key_name = ""
  }]
}

variable "enable_binary_authorization" {
  description = "Enable BinAuthZ Admission controller"
  default     = false
}

variable "pod_security_policy_config" {
  description = "enabled - Enable the PodSecurityPolicy controller for this cluster. If enabled, pods must be valid under a PodSecurityPolicy to be created."
  default = [{
    "enabled" = false
  }]
}

variable "zones" {
  type        = list(string)
  description = "The zones to host the cluster in (optional if regional cluster / required if zonal)"
  default     = []
}

variable "regional" {
  type        = bool
  description = "Whether is a regional cluster (zonal cluster if set false. WARNING: changing this after cluster creation is destructive!)"
  default     = true
}

When I run terraform plan:

  # module.gke.google_container_cluster.primary will be created
  + resource "google_container_cluster" "primary" {
      + cluster_ipv4_cidr           = (known after apply)
      + default_max_pods_per_node   = 110
      + enable_binary_authorization = false
      + enable_intranode_visibility = false
      + enable_kubernetes_alpha     = false
      + enable_legacy_abac          = false
      + enable_shielded_nodes       = false
      + enable_tpu                  = false
      + endpoint                    = (known after apply)
      + id                          = (known after apply)
      + instance_group_urls         = (known after apply)
      + location                    = "europe-west1"
      + logging_service             = "logging.googleapis.com/kubernetes"
      + master_version              = (known after apply)
      + min_master_version          = "1.15.8-gke.2"
      + monitoring_service          = "monitoring.googleapis.com/kubernetes"
      + name                        = "simple-regional-beta-cluster"
      + network                     = "default" > **INCORRECT** 
      + node_locations              = (known after apply)
      + node_version                = (known after apply)
      + project                     = "fleeters-cloud"
      + remove_default_node_pool    = false
      + services_ipv4_cidr          = (known after apply)
      + subnetwork                  = (known after apply)   > **INCORRECT ?** (subnetwork set in variables.tf)
      + tpu_ipv4_cidr_block         = (known after apply)
   ..........

      + ip_allocation_policy {
          + cluster_ipv4_cidr_block       = (known after apply)
          + cluster_secondary_range_name  = "pods-ip-range"  > **CORRECT** 
          + create_subnetwork             = (known after apply)
          + node_ipv4_cidr_block          = (known after apply)
          + services_ipv4_cidr_block      = (known after apply)
          + services_secondary_range_name = "services-ip-range"  > **CORRECT**
          + subnetwork_name               = (known after apply)
          + use_ip_aliases                = (known after apply)

You can see that the network field for the resource "google_container_cluster" "primary" stays at the default value.

By the way, the ip_range_pods and ip_range_services names are set correctly.
I can't figure out whether I'm missing something; I'm still new to the IaC ecosystem.

When I run this example, my cluster never starts, because it tries to use the default network:

(screenshots from the GCP console showing the failing cluster omitted)

What am I missing here? Is the module not supposed to create the network just from the name provided in variables.tf?

@theobolo theobolo changed the title Examples not working on master branch, either on last release (7.0.0), "network" field not populated ? Examples are not working on master branch, neither on last release (7.0.0), "network" field not populated ? Feb 6, 2020
@morgante
Copy link
Contributor

morgante commented Feb 6, 2020

Hi there! This is rather odd behavior, but we do some validation of the input network here: https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/master/modules/beta-public-cluster/networks.tf#L26
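
For context, the lookup in that file is essentially a data source keyed on the input name (a rough sketch from memory, not the exact module code):

data "google_compute_network" "gke_network" {
  # The module resolves the user-supplied network name through a data
  # source and passes the data source's attributes on to the cluster
  # resource, rather than using var.network directly.
  name    = var.network
  project = var.project_id
}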

To debug things further, could you:

  1. Confirm if you're attempting to connect to a Shared VPC network
  2. Confirm that the subnetwork provided is in the region you selected
  3. Show the value/data of data.google_compute_network: terraform state show module.gke.data.google_compute_network.gke_network

@symbiont-jon-bogaty

symbiont-jon-bogaty commented Feb 6, 2020

Same thing happened to me, and it gets even odder. I set up GKE with a network that was also provisioned via the Google network Terraform module; on the first run of terraform apply it attempted to use default for the network and failed because it couldn't find the pod and service CIDR ranges.

I ran the state show command and got:
data "google_compute_network" "gke_network" {}

As a sanity check I also ran:
terraform state show module.gcp-network.module.vpc.google_compute_network.network

# module.gcp-network.module.vpc.google_compute_network.network:
resource "google_compute_network" "network" {
    name                            = <<EXPECTED>>
}

The network did provision successfully.

I ran a second plan and confirmed the correct network name was populated. Running apply then created the GKE cluster correctly.

@morgante
Contributor

morgante commented Feb 6, 2020

@symbiont-jon-bogaty Were your network and GKE module provisioned in the same Terraform config? I'm wondering if that might be the issue.

@theobolo
Author

theobolo commented Feb 6, 2020

Alright! I think I've misunderstood something: the "gke" module does NOT automatically create networks just from the names passed to it; I need to create them first using something like this:

module "gcp-network" {
  source       = "terraform-google-modules/network/google"
  version      = "~> 2.0"
  project_id   = var.project_id
  network_name = var.network

  subnets = [
    {
      subnet_name   = var.subnetwork
      subnet_ip     = "10.0.0.0/17"
      subnet_region = var.region
    },
  ]

  secondary_ranges = {
    "${var.subnetwork}" = [
      {
        range_name    = var.ip_range_pods
        ip_cidr_range = "192.168.0.0/18"
      },
      {
        range_name    = var.ip_range_services
        ip_cidr_range = "192.168.64.0/18"
      },
    ]
  }
}

Right?

I tried creating a Shared VPC and a subnetwork directly in the Google Cloud console, and now terraform plan gives me something like this:

  # module.gke.google_container_cluster.primary will be created
  + resource "google_container_cluster" "primary" {
      + cluster_ipv4_cidr           = (known after apply)
      + default_max_pods_per_node   = 110
      + enable_binary_authorization = false
      + enable_intranode_visibility = false
      + enable_kubernetes_alpha     = false
      + enable_legacy_abac          = false
      + enable_shielded_nodes       = false
      + enable_tpu                  = false
      + endpoint                    = (known after apply)
      + id                          = (known after apply)
      + instance_group_urls         = (known after apply)
      + location                    = "europe-west1"
      + logging_service             = "logging.googleapis.com/kubernetes"
      + master_version              = (known after apply)
      + min_master_version          = "1.15.8-gke.2"
      + monitoring_service          = "monitoring.googleapis.com/kubernetes"
      + name                        = "simple-regional-beta-cluster"
      + network                     = "https://www.googleapis.com/compute/beta/projects/fleeters-cloud/global/networks/test-network"
      + node_locations              = [
          + "europe-west1-b",
          + "europe-west1-c",
          + "europe-west1-d",
        ]
      + node_version                = (known after apply)
      + project                     = "fleeters-cloud"
      + remove_default_node_pool    = false
      + services_ipv4_cidr          = (known after apply)
      + subnetwork                  = "https://www.googleapis.com/compute/beta/projects/fleeters-cloud/regions/europe-west1/subnetworks/test-subnetwork"
      + tpu_ipv4_cidr_block         = (known after apply)

@morgante So to answer your questions:

  • I did create a Shared VPC directly in the GCP console, and it now shows up in terraform plan
  • I did create the subnetwork in the right region, and it now also shows up in terraform plan
  • BTW the output of that command is still empty:
# module.gke.data.google_compute_network.gke_network:
data "google_compute_network" "gke_network" {}

If I'm right, then to provision a functional GKE cluster I need to do something like this:

locals {
  cluster_type = "simple-regional-beta"
}

provider "google-beta" {
  version = "~> 3.3.0"
  region  = var.region
}

module "gcp-network" {
  source       = "terraform-google-modules/network/google"
  version      = "~> 2.0"
  project_id   = var.project_id
  network_name = var.network

  subnets = [
    {
      subnet_name   = var.subnetwork
      subnet_ip     = "10.0.0.0/17"
      subnet_region = var.region
    },
  ]

  secondary_ranges = {
    "${var.subnetwork}" = [
      {
        range_name    = var.ip_range_pods
        ip_cidr_range = "192.168.0.0/18"
      },
      {
        range_name    = var.ip_range_services
        ip_cidr_range = "192.168.64.0/18"
      },
    ]
  }
}

module "gke" {
  source                      = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version                     = "7.0.0"
  project_id                  = var.project_id
  name                        = "${local.cluster_type}-cluster${var.cluster_name_suffix}"
  regional                    = var.regional
  region                      = var.region
  zones                       = var.zones
  network                     = var.network
  subnetwork                  = var.subnetwork
  ip_range_pods               = var.ip_range_pods
  ip_range_services           = var.ip_range_services
  create_service_account      = var.compute_engine_service_account == "create"
  istio                       = var.istio
  cloudrun                    = var.cloudrun
  node_metadata               = var.node_metadata
  sandbox_enabled             = var.sandbox_enabled
  remove_default_node_pool    = var.remove_default_node_pool
  node_pools                  = var.node_pools
  database_encryption         = var.database_encryption
  enable_binary_authorization = var.enable_binary_authorization
  pod_security_policy_config  = var.pod_security_policy_config
  release_channel             = "UNSPECIFIED"
}

data "google_client_config" "default" {
} 

Is that correct?

@morgante
Contributor

morgante commented Feb 6, 2020

@theobolo Ah yes I see your confusion! You are correct: you need to separately create the network outside the GKE module, then pass the name in.

One suggested change for your configuration to ensure the proper dependency order:

module "gke" {
  source                      = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version                     = "7.0.0"
  network                     = module.gcp-network.network_name
  subnetwork                  = module.gcp-network.subnets_names[0]
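  # ... other arguments unchanged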
}
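
Because network and subnetwork now reference outputs of the gcp-network module instead of raw variables, Terraform can infer the dependency and will create the network before the cluster.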

@theobolo
Author

theobolo commented Feb 6, 2020

@morgante Alright, sorry about that... :D I was very confused by the examples...

I'm currently trying a terraform apply with your recommendations, but the same thing seems to happen, since the network is created in parallel with the GKE cluster. Do you know how to handle this in a single terraform apply?

@morgante
Contributor

morgante commented Feb 6, 2020

Hmm, it should depend on the input. What does terraform plan look like for you?

@theobolo
Author

theobolo commented Feb 6, 2020

With this file:

locals {
  cluster_type = "simple-regional-beta"
}

provider "google-beta" {
  version = "~> 3.3.0"
  region  = var.region
}

module "gcp-network" {
  source       = "terraform-google-modules/network/google"
  version      = "~> 2.0"
  project_id   = var.project_id
  network_name = var.network

  subnets = [
    {
      subnet_name   = var.subnetwork
      subnet_ip     = "10.0.0.0/17"
      subnet_region = var.region
    },
  ]

  secondary_ranges = {
    "${var.subnetwork}" = [
      {
        range_name    = var.ip_range_pods
        ip_cidr_range = "192.168.0.0/18"
      },
      {
        range_name    = var.ip_range_services
        ip_cidr_range = "192.168.64.0/18"
      },
    ]
  }
}

module "gke" {
  source                      = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version                     = "7.0.0"
  project_id                  = var.project_id
  name                        = "${local.cluster_type}-cluster${var.cluster_name_suffix}"
  regional                    = var.regional
  region                      = var.region
  zones                       = var.zones
  network                     = module.gcp-network.network_name
  subnetwork                  = module.gcp-network.subnets_names[0]
  ip_range_pods               = var.ip_range_pods
  ip_range_services           = var.ip_range_services
  create_service_account      = var.compute_engine_service_account == "create"
  istio                       = var.istio
  cloudrun                    = var.cloudrun
  node_metadata               = var.node_metadata
  sandbox_enabled             = var.sandbox_enabled
  remove_default_node_pool    = var.remove_default_node_pool
  node_pools                  = var.node_pools
  database_encryption         = var.database_encryption
  enable_binary_authorization = var.enable_binary_authorization
  pod_security_policy_config  = var.pod_security_policy_config
  release_channel             = "UNSPECIFIED"
}

data "google_client_config" "default" {
}

The first terraform plan looks like the first one posted, with network = default.

BTW I did a terraform apply, which created the network, but the cluster ended up in the same loop...
Then I ran terraform plan again, and of course, this time my plan was populated with the right networks. But the check command you gave me still returns an "empty" object:

terraform state show module.gke.data.google_compute_network.gke_network

# module.gke.data.google_compute_network.gke_network:
data "google_compute_network" "gke_network" {}

terraform state show module.gcp-network.module.vpc.google_compute_network.network

# module.gcp-network.module.vpc.google_compute_network.network:
resource "google_compute_network" "network" {
    auto_create_subnetworks         = false
    delete_default_routes_on_create = false
    id                              = "projects/fleeters-cloud/global/networks/test-network"
    name                            = "test-network"
    project                         = "fleeters-cloud"
    routing_mode                    = "GLOBAL"
    self_link                       = "https://www.googleapis.com/compute/v1/projects/fleeters-cloud/global/networks/test-network"
}

So I imagine that if I run terraform apply again, my cluster should be configured correctly, but that's not ideal :/

edit: I tried terraform apply again, and it says:

PS C:\Users\Ghosty\Documents\Pitch-n-Rise\fleeters-cloud-infra\infra\gke> terraform apply -auto-approve
module.gke.random_string.cluster_service_account_suffix: Refreshing state... [id=stln]
data.google_client_config.default: Refreshing state...
module.gcp-network.module.vpc.google_compute_network.network: Refreshing state... [id=projects/fleeters-cloud/global/networks/test-network]
module.gke.data.google_container_engine_versions.region: Refreshing state...
module.gke.data.google_compute_zones.available: Refreshing state...
module.gke.data.google_client_config.default: Refreshing state...
module.gke.data.google_compute_network.gke_network: Refreshing state...
module.gcp-network.module.subnets.google_compute_subnetwork.subnetwork["europe-west1/test-subnetwork"]: Refreshing state... [id=projects/fleeters-cloud/regions/europe-west1/subnetworks/test-subnetwork]
module.gke.data.google_container_engine_versions.zone: Refreshing state...
module.gke.random_shuffle.available_zones: Refreshing state... [id=-]
module.gke.data.google_compute_subnetwork.gke_subnetwork: Refreshing state...
module.gke.google_container_cluster.primary: Creating...

Warning: External references from destroy provisioners are deprecated

  on .terraform\modules\gke\terraform-google-modules-terraform-google-kubernetes-engine-7be707a\modules\beta-public-cluster\cluster.tf line 369, in resource "null_resource" "wait_for_cluster":
 369:     command = "${path.module}/scripts/wait-for-cluster.sh ${var.project_id} ${var.name}"

Destroy-time provisioners and their connection configurations may only
reference attributes of the related resource, via 'self', 'count.index', or
'each.key'.

References to other resources during the destroy phase can cause dependency
cycles and interact poorly with create_before_destroy.

(and one more similar warning elsewhere)

Error: googleapi: Error 409: Already exists: projects/fleeters-cloud/locations/europe-west1/clusters/simple-regional-beta-cluster., alreadyExists


  on .terraform\modules\gke\terraform-google-modules-terraform-google-kubernetes-engine-7be707a\modules\beta-public-cluster\cluster.tf line 22, in resource "google_container_cluster" "primary":
  22: resource "google_container_cluster" "primary" {

@morgante
Contributor

morgante commented Feb 6, 2020

That is unfortunate. I suspect passing the network through the data source is causing issues, especially since I've noticed dependency-ordering issues with data sources elsewhere in TF 0.12.

We originally added these data sources in #16 but I'm tempted to go back to path formatting. It's ugly but simpler.
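
For reference, "path formatting" here means building the resource links directly from the inputs instead of reading them back through a data source — roughly something like this (a sketch; the local names are hypothetical):

locals {
  # Build the network/subnetwork links as plain string interpolations,
  # so they depend only on the input variables, not on an API read.
  cluster_network    = "projects/${var.project_id}/global/networks/${var.network}"
  cluster_subnetwork = "projects/${var.project_id}/regions/${var.region}/subnetworks/${var.subnetwork}"
}

Since these are plain interpolations, they add no data-source read at plan time and so avoid the ordering problem.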

@theobolo
Author

theobolo commented Feb 6, 2020

I did notice an issue where someone was describing something similar: #314

He seems to have figured it out, but he didn't say how :D

@morgante
Contributor

morgante commented Feb 6, 2020

@pratikmallya Since you originally authored #16, would you foresee any issues with switching back to basic path routing? I think it's the simplest fix for the dependency ordering we're encountering.

@gkowalski-google

@morgante I'm hitting this issue as well, is there a workaround for now?
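
Would a targeted apply to force the network to exist before the cluster be a reasonable stopgap? Something like:

terraform apply -target=module.gcp-network
terraform apply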

@pratikmallya
Contributor

pratikmallya commented Feb 12, 2020

@morgante fine to switch back. Apologies, didn't see the notification earlier.

@morgante morgante added the bug, P3, and triaged labels Feb 12, 2020
@gkowalski-google

@theobolo I was hitting this issue too while trying to use GKE master version 1.15.8-gke.3, but I am no longer seeing it while using 1.14.10-gke.17. YMMV.

"Interesting" variables at play:

  • Google provider ~> 3.7.0
  • Google beta provider ~> 3.7.0
  • Helm provider ~> 0.10.4
  • Kubernetes provider ~> 1.10.0
  • GKE master version 1.14.10-gke.17
  • GKE Terraform module ~> 7.2.0
  • GCP network Terraform module ~> 2.1.0

@morgante
Contributor

This should (hopefully) be fixed in the latest 7.3.0 release. Can you give it a shot and see if you're still having issues?
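
Picking up the fix should just be a matter of bumping the module version, e.g.:

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version = "~> 7.3.0"
  # ... other arguments unchanged
}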

@theobolo
Author

@morgante Everything works perfectly now :) Thanks a lot!

@mikamboo

mikamboo commented Feb 15, 2021

Hello guys, thanks for this issue, it saved me from a lot of headaches.

In my case I chose to create the VPC and subnets using vanilla google_compute_network / google_compute_subnetwork resources instead of the terraform-google-modules/network/google module. I find it clearer, and it works fine!

resource "google_compute_network" "vpc" {
  name                    = "${var.cluster_name}-vpc"
  auto_create_subnetworks = "false"
}

# Subnet pods
resource "google_compute_subnetwork" "subnet" {
  name               = "${var.cluster_name}-subnet"
  region             = var.region
  network            = google_compute_network.vpc.name
  ip_cidr_range      = "10.10.0.0/18"

  secondary_ip_range = [
    {
      range_name    = var.ip_range_pods_name
      ip_cidr_range = "192.168.0.0/18"
    },
    {
      range_name    = var.ip_range_services_name
      ip_cidr_range = "192.168.64.0/18"
    },
  ]
}

And my GKE module:

module "gke" {
  # ...
  network                    = google_compute_network.vpc.name
  subnetwork                 = google_compute_subnetwork.subnet.name
}
