Upgrading to v21.0.0 forces recreation of node pools #1268

Closed
mkjmdski opened this issue May 25, 2022 · 5 comments
Labels
bug Something isn't working docs triaged Scoped and ready for work

Comments

@mkjmdski

TL;DR

Adding enable_gcfs to the state forces Terraform to rebuild node pools

Expected behavior

Updating the module version should not rebuild node pools

Observed behavior

Terraform needs to rebuild the node pools

Terraform Configuration

source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
version = "21.0.0"

Terraform Version

Terraform v1.1.9
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v4.22.0
+ provider registry.terraform.io/hashicorp/google-beta v4.22.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.11.0
+ provider registry.terraform.io/hashicorp/null v3.1.1
+ provider registry.terraform.io/hashicorp/random v3.2.0

Additional information

No response

@mkjmdski mkjmdski added the bug Something isn't working label May 25, 2022
@bharathkkb
Member

Thanks for the report @mkjmdski
This will be fixed in 21.1.0 #1251
Could you temporarily try with the main branch?

@Flektoma

Hi, I have the same issue but caused by keepers that were added in 21.0.0 #1218

A node pool that does not specify enable_gcfs defaults to "", which adds this field to the keepers and triggers node pool recreation.

This happens when upgrading from v20.0.0 to v21.1.0 with the beta-private-cluster-update-variant module.

# module.cluster.module.gke.module.gke.random_id.name["pool2"] must be replaced
+/- resource "random_id" "name" {
      ~ b64_std     = "pool2-skU=" -> (known after apply)
      ~ b64_url     = "pool2-skU" -> (known after apply)
      ~ dec         = "pool2-45637" -> (known after apply)
      ~ hex         = "pool2-b245" -> (known after apply)
      ~ id          = "skU" -> (known after apply)
      ~ keepers     = { # forces replacement
          + "enable_gcfs"       = ""
            # (15 unchanged elements hidden)
        }
        # (2 unchanged attributes hidden)
    }

I am not sure if I can somehow override this or what the proper fix is here.
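
To see which keeper keys are responsible before applying, the plan can be inspected programmatically. The following is a minimal sketch, assuming the plan has been exported with `terraform show -json` and parsed into a dict with the usual `resource_changes` entries (`address`, `type`, `change.before`, `change.after`). The helper name `new_keeper_keys` and the sample plan data are hypothetical and only mirror the plan output above.

```python
def new_keeper_keys(plan):
    """Map each random_id address to keeper keys present after but not before.

    `plan` is assumed to be the parsed JSON plan representation; this helper
    is illustrative and not part of the module or of Terraform itself.
    """
    findings = {}
    for change in plan.get("resource_changes", []):
        if change.get("type") != "random_id":
            continue
        detail = change.get("change") or {}
        # `before` or `after` can be null in the JSON plan (create/delete).
        before = (detail.get("before") or {}).get("keepers") or {}
        after = (detail.get("after") or {}).get("keepers") or {}
        added = sorted(set(after) - set(before))
        if added:
            findings[change["address"]] = added
    return findings


# Hand-written sample shaped like the plan above; the machine_type keeper
# is a made-up placeholder for the hidden unchanged elements.
address = 'module.cluster.module.gke.module.gke.random_id.name["pool2"]'
example_plan = {
    "resource_changes": [
        {
            "address": address,
            "type": "random_id",
            "change": {
                "actions": ["delete", "create"],
                "before": {"keepers": {"machine_type": "e2-medium"}},
                "after": {"keepers": {"machine_type": "e2-medium", "enable_gcfs": ""}},
            },
        }
    ]
}

report = new_keeper_keys(example_plan)
print(report)
```

With a real plan, the input would come from `terraform plan -out=tfplan` followed by `terraform show -json tfplan`, loaded with `json.load`.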

@gorge511
Contributor

Hi, it seems there is no way to avoid node pool recreation other than updating the state file manually.

That's why @Flektoma and I developed the following jq command:

jq -a '(.resources[] | select((.module // "" | endswith("module.gke.module.gke")) and (.type == "random_id")) | .instances[].attributes.keepers) |= (. + {enable_gcfs: ""})' default.tfstate > default.tfstate.new

Please check the diff to confirm that all the changes are valid before uploading the state file back to your backend.
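
For anyone who prefers a script to a one-liner, the same edit can be sketched in Python. This is only an approximation of the jq filter above, assuming the state file has already been parsed into a dict. The function name `add_keeper`, its parameters, and the sample state are hypothetical. The keeper name is a parameter so the same sketch could be reused for a different keeper.

```python
import copy
import json


def add_keeper(state, key="enable_gcfs", value="",
               module_suffix="module.gke.module.gke"):
    """Return a patched copy of a parsed Terraform state.

    Mirrors the jq filter: for every random_id resource whose module path
    ends with `module_suffix`, set keepers[key] = value on each instance.
    The input state is left untouched so the two can be diffed afterwards.
    """
    patched = copy.deepcopy(state)
    for resource in patched.get("resources", []):
        module = resource.get("module") or ""
        if resource.get("type") != "random_id":
            continue
        if not module.endswith(module_suffix):
            continue
        for instance in resource.get("instances", []):
            attributes = instance.setdefault("attributes", {})
            keepers = attributes.get("keepers") or {}
            keepers[key] = value  # like jq's `. + {...}`, this overwrites
            attributes["keepers"] = keepers
    return patched


# Made-up sample state; machine_type is a placeholder keeper.
example_state = {
    "resources": [
        {
            "module": "module.cluster.module.gke.module.gke",
            "type": "random_id",
            "name": "name",
            "instances": [
                {"index_key": "pool2",
                 "attributes": {"keepers": {"machine_type": "e2-medium"}}}
            ],
        },
        {
            "module": "module.other",
            "type": "random_id",
            "name": "name",
            "instances": [{"attributes": {"keepers": {}}}],
        },
    ]
}

patched = add_keeper(example_state)
print(json.dumps(patched["resources"][0]["instances"][0], sort_keys=True))
```

Because the function returns a copy, the original and patched dicts can be dumped with `json.dumps(..., indent=2, sort_keys=True)` and compared to review the diff before anything is written back.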

@bharathkkb
Member

@Flektoma Unfortunately, for the update variant I don't think there is a way to do this natively other than editing the state to add the new keeper attribute. @gorge511 Would you like to add this to our upgrade guide for future users who stumble on this?

@bharathkkb bharathkkb added triaged Scoped and ready for work docs labels Jun 21, 2022
@gorge511
Contributor

@gorge511 Would you like to add this to our upgrade guide for future users who stumble on this?

@bharathkkb can you please suggest which file I should put it in? Basically, where would you, as a user, look for such information? A new file in the /docs folder? I see this more as part of a troubleshooting guide (though I didn't find one), because it is not specific to any one module version upgrade. It will be a recurring issue: it happens again with version 21.2.0, where a new keeper for the enable_secure_boot variable was added in #1277.

@mkjmdski mkjmdski closed this as completed Sep 5, 2022