Cannot upgrade cluster version from 1.16 to 1.17 #1003

Closed
randrusiak opened this issue Sep 8, 2020 · 7 comments

@randrusiak

I have issues

Hi there!

I have a problem with upgrading a cluster to the latest version of EKS. After changing cluster_version to 1.17 and running terraform apply, I got this error:

Error: Cycle: module.eks.random_pet.workers_launch_template[1] (destroy deposed 08cff672), module.eks.random_pet.workers_launch_template[0] (destroy deposed 7972b6ab), module.eks (close)

I'm wondering if this is caused by Terraform 0.13.1.

I've tried to do the upgrade with 0.12.29, but I can't refresh the state with the older version of Terraform.

I'm submitting a...

  • bug report
  • feature request
  • support request - read the FAQ first!
  • kudos, thank you, warm fuzzy

What is the current behavior?

I described it above.

If this is a bug, how to reproduce? Please include a code sample if relevant.

I just changed cluster_version to 1.17 in the following module block:

module "eks" {
  source                                             = "terraform-aws-modules/eks/aws"
  version                                            = "12.2.0"
  cluster_name                                       = var.cluster_name
  cluster_version                                    = "1.16"
  subnets                                            = module.vpc.private_subnets
  enable_irsa                                        = true # Whether to create OpenID Connect Provider for EKS to enable IRSA
  config_output_path                                 = "./"
  write_kubeconfig                                   = true
  cluster_enabled_log_types                          = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
  cluster_log_retention_in_days                      = 30
  cluster_endpoint_private_access                    = true
  cluster_endpoint_public_access_cidrs               = var.cluster_endpoint_public_access_cidrs
  worker_create_security_group                       = true
  worker_create_cluster_primary_security_group_rules = true

  tags = {
    "k8s.io/cluster-autoscaler/${var.cluster_name}" = "owned"
    "k8s.io/cluster-autoscaler/enabled"             = "true"
  }

  vpc_id = module.vpc.vpc_id

  worker_groups_launch_template = [
    {
      name                    = "generic-spot-workers-01"
      public_ip               = false
      enable_monitoring       = false
      root_volume_size        = 10
      override_instance_types = ["m5.large", "m5a.large"]
      spot_price              = "0.115" # maximum price equals the current on-demand price for an m5.large instance
      asg_max_size            = 5
      asg_desired_capacity    = 1
      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot,NodePurpose=generic"
    },
    {
      name                                     = "on-demand-workers-01"
      public_ip                                = false
      enable_monitoring                        = false
      root_volume_size                         = 10
      override_instance_types                  = ["m5.large", "m5a.large"]
      asg_max_size                             = 1
      asg_desired_capacity                     = 1
      on_demand_percentage_above_base_capacity = 0 # for testing purposes use only spot instances; after testing, set to 100
      kubelet_extra_args                       = "--node-labels=node.kubernetes.io/lifecycle=normal --register-with-taints=Lifecycle=normal:PreferNoSchedule"
    },
  ]

  map_users = [
    for key, value in merge(var.cluster_users, var.existing_cluster_users) :
    {
      userarn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/${key}"
      username = key
      groups   = value.groups
    }
  ]
}
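
To be explicit, the only change applied before running terraform apply is this single attribute; everything else stays exactly as above:

module "eks" {
  # ... all other arguments unchanged from the block above ...
  cluster_version = "1.17" # previously "1.16"
}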

Here is the plan output:

  # module.eks.aws_eks_cluster.this[0] will be updated in-place
  ~ resource "aws_eks_cluster" "this" {
        arn                       = "arn:aws:eks:eu-central-1:00000:cluster/some-cluster"
        certificate_authority     = [
            {
                data = "xxxx"
            },
        ]
        created_at                = "2020-09-08 05:20:37.78 +0000 UTC"
        enabled_cluster_log_types = [
            "api",
            "audit",
            "authenticator",
            "controllerManager",
            "scheduler",
        ]
        endpoint                  = "https://some-endpoint.gr7.eu-central-1.eks.amazonaws.com"
        id                        = "some-cluster"
        identity                  = [
            {
                oidc = [
                    {
                        issuer = "https://oidc.eks.eu-central-1.amazonaws.com/id/00000000"
                    },
                ]
            },
        ]
        name                      = "some-cluster"
        platform_version          = "eks.3"
        role_arn                  = "arn:aws:iam::00000000:role/some-cluster20200908052034122400000001"
        status                    = "ACTIVE"
        tags                      = {
            "k8s.io/cluster-autoscaler/some-cluster" = "owned"
            "k8s.io/cluster-autoscaler/enabled"        = "true"
        }
      ~ version                   = "1.16" -> "1.17"

        timeouts {
            create = "30m"
            delete = "15m"
        }

        vpc_config {
            cluster_security_group_id = "sg-0402dd9a48f3af3cb"
            endpoint_private_access   = true
            endpoint_public_access    = true
            public_access_cidrs       = [
                "0.0.0.0/0",
            ]
            security_group_ids        = [
                "sg-045bae69b359f1725",
            ]
            subnet_ids                = [
                "subnet-026e8145f409986ee",
                "subnet-0633454877ad17cf6",
                "subnet-064ef1da8021a688d",
            ]
            vpc_id                    = "vpc-0a50477dfcb7a82dc"
        }
    }

  # module.eks.aws_launch_template.workers_launch_template[0] will be updated in-place
  ~ resource "aws_launch_template" "workers_launch_template" {
        arn                     = "arn:aws:ec2:eu-central-1:xxxxx:launch-template/lt-06595302f6ba38e7a"
        default_version         = 1
        disable_api_termination = false
        ebs_optimized           = "true"
        id                      = "lt-06595302f6ba38e7a"
      ~ image_id                = "ami-0b2edbf190fe05b92" -> "ami-047e3ad49b70ed809"
        instance_type           = "m4.large"
      ~ latest_version          = 1 -> (known after apply)
        name                    = "some-cluster-generic-spot-workers-012020090805351684350000000d"
        name_prefix             = "some-cluster-generic-spot-workers-01"
        security_group_names    = []
        tags                    = {
            "k8s.io/cluster-autoscaler/some-cluster" = "owned"
            "k8s.io/cluster-autoscaler/enabled"        = "true"
        }
        user_data               = "xxx="
        vpc_security_group_ids  = []

        block_device_mappings {
            device_name = "/dev/xvda"

            ebs {
                delete_on_termination = "true"
                encrypted             = "false"
                iops                  = 0
                volume_size           = 10
                volume_type           = "gp2"
            }
        }

        credit_specification {
            cpu_credits = "standard"
        }

        iam_instance_profile {
            name = "some-cluster2020090805324091800000000b"
        }

        metadata_options {
            http_endpoint               = "enabled"
            http_put_response_hop_limit = 0
            http_tokens                 = "optional"
        }

        monitoring {
            enabled = false
        }

        network_interfaces {
            associate_public_ip_address = "false"
            delete_on_termination       = "true"
            device_index                = 0
            ipv4_address_count          = 0
            ipv4_addresses              = []
            ipv6_address_count          = 0
            ipv6_addresses              = []
            security_groups             = [
                "sg-0dbbed1cc6fbed9e1",
            ]
        }

        tag_specifications {
            resource_type = "volume"
            tags          = {
                "Name"                                     = "some-cluster-generic-spot-workers-01-eks_asg"
                "k8s.io/cluster-autoscaler/some-cluster" = "owned"
                "k8s.io/cluster-autoscaler/enabled"        = "true"
            }
        }
        tag_specifications {
            resource_type = "instance"
            tags          = {
                "Name"                                     = "some-cluster-generic-spot-workers-01-eks_asg"
                "k8s.io/cluster-autoscaler/some-cluster" = "owned"
                "k8s.io/cluster-autoscaler/enabled"        = "true"
            }
        }
    }

  # module.eks.aws_launch_template.workers_launch_template[1] will be updated in-place
  ~ resource "aws_launch_template" "workers_launch_template" {
        arn                     = "arn:aws:ec2:eu-central-1:xxxxx:launch-template/lt-025fdfba223ae7492"
        default_version         = 1
        disable_api_termination = false
        ebs_optimized           = "true"
        id                      = "lt-025fdfba223ae7492"
      ~ image_id                = "ami-0b2edbf190fe05b92" -> "ami-047e3ad49b70ed809"
        instance_type           = "m4.large"
      ~ latest_version          = 1 -> (known after apply)
        name                    = "some-cluster-on-demand-workers-012020090805351698020000000f"
        name_prefix             = "some-cluster-on-demand-workers-01"
        security_group_names    = []
        tags                    = {
            "k8s.io/cluster-autoscaler/some-cluster" = "owned"
            "k8s.io/cluster-autoscaler/enabled"        = "true"
        }
        user_data               = "xxxx="
        vpc_security_group_ids  = []

        block_device_mappings {
            device_name = "/dev/xvda"

            ebs {
                delete_on_termination = "true"
                encrypted             = "false"
                iops                  = 0
                volume_size           = 10
                volume_type           = "gp2"
            }
        }

        credit_specification {
            cpu_credits = "standard"
        }

        iam_instance_profile {
            name = "some-cluster2020090805324229910000000c"
        }

        metadata_options {
            http_endpoint               = "enabled"
            http_put_response_hop_limit = 0
            http_tokens                 = "optional"
        }

        monitoring {
            enabled = false
        }

        network_interfaces {
            associate_public_ip_address = "false"
            delete_on_termination       = "true"
            device_index                = 0
            ipv4_address_count          = 0
            ipv4_addresses              = []
            ipv6_address_count          = 0
            ipv6_addresses              = []
            security_groups             = [
                "sg-0dbbed1cc6fbed9e1",
            ]
        }

        tag_specifications {
            resource_type = "volume"
            tags          = {
                "Name"                                     = "some-cluster-on-demand-workers-01-eks_asg"
                "k8s.io/cluster-autoscaler/some-cluster" = "owned"
                "k8s.io/cluster-autoscaler/enabled"        = "true"
            }
        }
        tag_specifications {
            resource_type = "instance"
            tags          = {
                "Name"                                     = "some-cluster-on-demand-workers-01-eks_asg"
                "k8s.io/cluster-autoscaler/some-cluster" = "owned"
                "k8s.io/cluster-autoscaler/enabled"        = "true"
            }
        }
    }

  # module.eks.random_pet.workers_launch_template[0] must be replaced
+/- resource "random_pet" "workers_launch_template" {
      ~ id        = "related-bobcat" -> (known after apply)
      ~ keepers   = {
          - "lt_name" = "some-cluster-generic-spot-workers-012020090805351684350000000d-1"
        } -> (known after apply) # forces replacement
        length    = 2
        separator = "-"
    }

  # module.eks.random_pet.workers_launch_template[1] must be replaced
+/- resource "random_pet" "workers_launch_template" {
      ~ id        = "live-buck" -> (known after apply)
      ~ keepers   = {
          - "lt_name" = "some-cluster-on-demand-workers-012020090805351698020000000f-1"
        } -> (known after apply) # forces replacement
        length    = 2
        separator = "-"
    }

What's the expected behavior?

I expect to be able to upgrade the cluster version without any errors.

Are you able to fix this problem and submit a PR? Link here if you have already.

Unfortunately, I'm not.

Environment details

  • Affected module version: 12.2.0
  • OS: Ubuntu 20.04
  • Terraform version: 0.13.1

Any other relevant info

@dpiddockcmp
Contributor

Terraform 0.13 is giving us lots of problems. This is a slightly different error from the one we've seen before when changes to the launch template were required: #939

I'm not sure how module.eks ends up depending on the random_pets.

@randrusiak
Author

So should I revert to the older version of Terraform? In general, using an old version is not a problem for me, but I don't know how to roll the state back to the older version. Do you know how to do that?

@randrusiak
Author

randrusiak commented Sep 11, 2020

For everyone who has a similar problem: don't try downgrading the remote state according to the documentation (https://support.hashicorp.com/hc/en-us/articles/360001147287-Downgrading-Terraform); in my case it didn't work.
If you have versioning enabled on your remote backend, just restore an earlier (pre-upgrade) version of the state file instead; it will save you time :)
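
For example, assuming an S3 backend with versioning enabled (bucket and key names below are placeholders), an earlier state version can be restored with the AWS CLI:

# List the stored versions of the state object
aws s3api list-object-versions \
  --bucket my-terraform-state-bucket \
  --prefix path/to/terraform.tfstate

# Copy a known-good pre-upgrade version back over the current object
aws s3api copy-object \
  --bucket my-terraform-state-bucket \
  --key path/to/terraform.tfstate \
  --copy-source "my-terraform-state-bucket/path/to/terraform.tfstate?versionId=PRE_UPGRADE_VERSION_ID"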

@dpiddockcmp
I have already upgraded EKS from 1.16 to 1.17 with Terraform 0.12.29 without any issues, so I think it's currently better to stay with the older version of Terraform. Maybe you should add a warning about that to the README?

@barryib
Member

barryib commented Nov 11, 2020

@randrusiak Can you check whether you're still experiencing this issue with the latest version of this module and TF >= 0.13.4? We fixed some cycle issues for the random pets.

FWIW, destroying the random pets shouldn't destroy anything else (by default). They were added to force ASG recreation when the LT/LC changes, but only when asg_recreate_on_change is set to true.
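
If you do want the ASGs recreated when a launch template changes, a minimal sketch of that setting (assuming asg_recreate_on_change is a per-worker-group option, like the other asg_* settings in this module) would look like:

worker_groups_launch_template = [
  {
    name                   = "generic-spot-workers-01"
    asg_recreate_on_change = true # recreate the ASG whenever the launch template changes
    # ... other settings as before ...
  },
]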

@randrusiak
Author

@barryib I will try to test the latest version of the module by the end of the week and I'll let you know.

@randrusiak
Author

@barryib I've checked the upgrade process with the latest versions and the issue hasn't occurred.
I've also tried the same with the older versions I originally reported, and I couldn't reproduce the bug. So I'm a little confused, but we can assume the problem is solved.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 23, 2022