Enable auto-upgrade in beta clusters with a release channel #682

Conversation

dpetersen
Contributor

I have been testing some changes with zonal clusters, but receive this error:

Error: error creating NodePool: googleapi: Error 400: Auto_upgrade cannot be false when release_channel REGULAR is set., badRequest

The node pools I define do have auto-upgrade enabled, but I think the default node pool (which I am deleting with remove_default_node_pool anyway) is the problem. Regional clusters already have auto-upgrade enabled by default, so I'm guessing nobody has used the specific combination of a zonal cluster and release channels with this module yet.

This is noted in the documentation as a disallowed configuration:

Note: Node auto-upgrade and auto-repair are enabled and cannot be disabled when using release channels.

It's complicated slightly because release channels still only exist in the beta Google Terraform provider, as far as I can tell. I have left the non-beta configuration untouched and added an if beta_cluster block for my changes.

Without this setting, a cluster with a default node pool will fail when
a release channel is specified. The API responds with:

Error: error creating NodePool: googleapi: Error 400: Auto_upgrade cannot be false when release_channel REGULAR is set., badRequest

This is noted in the documentation: use of a release channel requires auto-upgrade and auto-repair to be enabled:

https://cloud.google.com/kubernetes-engine/docs/concepts/release-channels#new-cluster
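
To illustrate the shape of the change described above (the "if beta_cluster" block in /autogen), a rough sketch might look like the following. The file path, template syntax, and exact expression are assumptions based on this thread, not the actual diff:

# autogen/main/cluster.tf (sketch only; path, template syntax, and expression are assumptions)
locals {
  {% if beta_cluster %}
  # Release channels require node auto-upgrade, so default it to true whenever
  # a release channel is configured, even for zonal clusters.
  default_auto_upgrade = var.regional || var.release_channel != null ? true : false
  {% else %}
  default_auto_upgrade = var.regional ? true : false
  {% endif %}
}
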
@comment-bot-dev

comment-bot-dev commented Sep 21, 2020

Thanks for the PR! 🚀
✅ Lint checks have passed.

@morgante
Contributor

Thanks for the contribution! I'm not quite sure what's happening with the tests. @bharathkkb Any chance this error is connected to your bundling work?

terraform_validate ./modules/safer-cluster 
Error verifying checksum for provider "kubernetes"
The checksum for provider distribution from the Terraform Registry
did not match the source. This may mean that the distributed files
were changed after this version was released to the Registry.
Error: unable to verify checksum

@bharathkkb
Member

@morgante this is actually what bundling aims to resolve; I will finish that up this week. This happens once in a while because we download all providers once per example, module, and fixture.

@bharathkkb
Member

/gcbrun

@bharathkkb
Member

Hi @dpetersen thanks for the PR!
You will need to run make build to propagate these changes from /autogen to the modules.

The node pools I define do have autoupgrade enabled

At what point of cluster provisioning does this error occur, i.e. during creation of google_container_cluster.primary or google_container_node_pool.pools? This can help us narrow it down.

@dpetersen
Contributor Author

Yes, I have a separate branch where I am checking in the result of make build. I have a couple of PRs that I am combining at the moment. Should I be running make build and committing that in my PR, or do you folks do that when you build the module releases?

Here's a minimal example of what fails, and how:

module "prtest" {                                                                            
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
  version = "11.1.0"                                                                         
                                                                                             
  project_id             = module.gcp-project.project_id                                     
  name                   = "prtest"                                                          
  network                = module.gke-network.vpc_network.name                               
  subnetwork             = module.gke-network.vpc_subnetworks["prtest"].subnetwork.name      
  master_ipv4_cidr_block = module.gke-network.vpc_subnetworks["prtest"].master_cidr_block    
  ip_range_pods          = module.gke-network.vpc_subnetworks["prtest"].pods_range_name      
  ip_range_services      = module.gke-network.vpc_subnetworks["prtest"].services_range_name  
                                                                                             
  regional = false                                                                           
  region   = "us-central1"                                                                   
  zones    = ["us-central1-a", "us-central1-b", "us-central1-c"]                             
                                                                                             
  release_channel = "REGULAR"                                                                
}
module.prtest.google_container_cluster.primary: Creating...
module.prtest.google_container_cluster.primary: Still creating... [10s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [20s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [30s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [40s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [50s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m0s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m10s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m20s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m30s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m40s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [1m50s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [2m0s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [2m10s elapsed]
module.prtest.google_container_cluster.primary: Still creating... [2m20s elapsed]
module.prtest.google_container_cluster.primary: Creation complete after 2m26s [id=projects/content-don-5ea5/locations/us-central1-a/clusters/prtest]
module.prtest.google_container_node_pool.pools["default-node-pool"]: Creating...

Error: error creating NodePool: googleapi: Error 400: Auto_upgrade cannot be false when release_channel REGULAR is set., badRequest

  on .terraform/modules/prtest/modules/beta-private-cluster/cluster.tf line 278, in resource "google_container_node_pool" "pools":
 278: resource "google_container_node_pool" "pools" {

The key here is that it's a zonal cluster. I'm fairly sure that a regional cluster will succeed, because of the logic around the default behavior for auto-upgrading. If I point that original code at my branch instead of the 11.1.0 version of this module, it succeeds.
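
For reference, pointing that same configuration at a branch rather than a released version just means swapping the module source for a Git source and dropping the version pin; the fork and branch names below are placeholders:

module "prtest" {
  # Placeholder fork/branch, using Terraform's generic GitHub module source.
  source = "github.com/dpetersen/terraform-google-kubernetes-engine//modules/beta-private-cluster?ref=my-branch"

  # ...same arguments as in the example above...
}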

@bharathkkb bharathkkb left a comment
Member

Should I be running make build and committing that in my PR

yeah, those should be committed

module.prtest.google_container_cluster.primary: Still creating... [2m20s elapsed]
module.prtest.google_container_cluster.primary: Creation complete after 2m26s [id=projects/content-don-5ea5/locations/us-central1-a/clusters/prtest]
module.prtest.google_container_node_pool.pools["default-node-pool"]: Creating...

This is interesting: it means the google_container_cluster has actually spun up successfully (including deleting the default pool via remove_default_node_pool) and then moves on to creating your node pool. This node pool is the "default" one created by this module, although it is managed via a google_container_node_pool. The default_auto_upgrade variable actually controls the auto_upgrade setting in that google_container_node_pool.
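
To make that relationship concrete, the module-managed pool consumes the default roughly like this (a sketch based on the names mentioned in this thread; the exact lookup keys are assumptions):

resource "google_container_node_pool" "pools" {
  # ...
  management {
    auto_repair  = lookup(each.value, "auto_repair", true)
    # Falls back to the module-wide default when the node pool map does not
    # set auto_upgrade explicitly.
    auto_upgrade = lookup(each.value, "auto_upgrade", local.default_auto_upgrade)
  }
}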

Overall this change makes sense to me.

The previous location where I placed the comment meant it appeared even in non-beta clusters, so the comment was describing behavior that wasn't actually present there.
@dpetersen
Contributor Author

dpetersen commented Sep 23, 2020

Got it, I guess I was confused earlier about which node pool was causing the problem. I have now committed the built files as well. Thanks for looking at this!

@bharathkkb bharathkkb left a comment
Member

LGTM

@bharathkkb
Member

/gcbrun

@bharathkkb bharathkkb merged commit 21f95db into terraform-google-modules:master Sep 25, 2020
AditModi added a commit to AditModi/terraform-google-kubernetes-engine that referenced this pull request Sep 26, 2020
fix: Enable auto-upgrade in beta clusters with a release channel (terraform-google-modules#682)
CPL-markus pushed a commit to WALTER-GROUP/terraform-google-kubernetes-engine that referenced this pull request Jul 15, 2024
Enable auto-upgrade in beta clusters with a release channel (terraform-google-modules#682)