
[Bug]: Disabling EKS Auto Mode causes failure #40582

Open
miguelhar opened this issue Dec 16, 2024 · 6 comments
Labels
bug Addresses a defect in current functionality. service/eks Issues and PRs that pertain to the eks service.

Comments


miguelhar commented Dec 16, 2024

Terraform Core Version

1.9.8

AWS Provider Version

5.81.0

Affected Resource(s)

aws_eks_cluster

Expected Behavior

Ability to safely disable EKS Auto Mode on an existing cluster.

Actual Behavior

Fails with an error.

Relevant Error/Panic Output Snippet

Error: updating EKS Cluster (mhtest11) compute config: operation error EKS: UpdateClusterConfig, https response error StatusCode: 400, RequestID: 13ad3dbd-8207-46e0-872a-dd5e331fb01d, InvalidParameterException: The type for cluster update was not provided.

Terraform Configuration Files

resource "aws_eks_cluster" "this" {
  provider = aws.eks

  name                          = local.eks_cluster_name
  role_arn                      = aws_iam_role.eks_cluster.arn
  enabled_cluster_log_types     = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
  version                       = var.eks.k8s_version
  bootstrap_self_managed_addons = false

  encryption_config {
    provider {
      key_arn = local.kms_key_arn
    }
    resources = ["secrets"]
  }

  dynamic "compute_config" {
    for_each = var.eks.auto_mode_enabled ? var.eks.compute_config : {}
    content {
      enabled       = true #var.eks.auto_mode_enabled
      node_pools    = var.eks.compute_config.node_pools
      node_role_arn = aws_iam_role.eks_auto_node_role[0].arn
    }
  }

  kubernetes_network_config {
    ip_family         = "ipv4"
    service_ipv4_cidr = var.eks.service_ipv4_cidr
    dynamic "elastic_load_balancing" {
      for_each = var.eks.auto_mode_enabled ? var.eks.compute_config : {}
      content {
        enabled = var.eks.auto_mode_enabled
      }
    }
  }


  dynamic "storage_config" {
    for_each = var.eks.auto_mode_enabled ? var.eks.compute_config : {}
    content {
      block_storage {
        enabled = true #var.eks.auto_mode_enabled
      }
    }
  }

  upgrade_policy {
    support_type = "EXTENDED"
  }

  access_config {
    authentication_mode                         = var.eks.authentication_mode
    bootstrap_cluster_creator_admin_permissions = true
  }

  vpc_config {
    endpoint_private_access = true
    endpoint_public_access  = var.eks.public_access.enabled
    public_access_cidrs     = var.eks.public_access.cidrs
    security_group_ids      = [aws_security_group.eks_cluster.id]
    subnet_ids              = [for s in var.network_info.subnets.private : s.subnet_id]
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster,
    aws_security_group_rule.eks_cluster,
    aws_security_group_rule.node,
    aws_cloudwatch_log_group.eks_cluster
  ]

  lifecycle {
    ignore_changes = [
      encryption_config,
      kubernetes_network_config[0].ip_family,
      kubernetes_network_config[0].service_ipv4_cidr
    ]
  }
}

Steps to Reproduce

1. Create an EKS cluster with EKS Auto Mode enabled.
2. Disable Auto Mode.

The planned changes look like this:

  # module.eks.aws_eks_cluster.this will be updated in-place
  ~ resource "aws_eks_cluster" "this" {
        id                            = "mhtest11"
        name                          = "mhtest11"
        tags                          = {}
        # (12 unchanged attributes hidden)

      - compute_config {
          - enabled       = false -> null
          - node_pools    = [] -> null
            # (1 unchanged attribute hidden)
        }

      - storage_config {
          - block_storage {
              - enabled = false -> null
            }
        }

        # (5 unchanged blocks hidden)
    }

Running terraform apply then produces the error:

│ Error: updating EKS Cluster (mhtest11) compute config: operation error EKS: UpdateClusterConfig, https response error StatusCode: 400, RequestID: bfee63d1-66d4-4673-8f23-2317c956a51b, InvalidParameterException: The type for cluster update was not provided.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@miguelhar miguelhar added the bug Addresses a defect in current functionality. label Dec 16, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added service/eks Issues and PRs that pertain to the eks service. needs-triage Waiting for first response or review from a maintainer. labels Dec 16, 2024
@justinretzolk justinretzolk removed the needs-triage Waiting for first response or review from a maintainer. label Dec 18, 2024
@Idan-Lazar

Is there a solution for this?

@flostadler
Contributor

I did some debugging and I think this is caused by kubernetes_network_config being marked as computed and optional. Such attributes will preserve their known prior state if a configuration value changes to null or a zero-value of the type (caused by hashicorp/terraform-plugin-sdk#1101).

So when you remove the kubernetes_network_config, compute_config, and storage_config blocks, the provider still uses the prior state for kubernetes_network_config (with elastic_load_balancing.enabled set to true).
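
In terms of the reporter's resource above, the difference looks roughly like this (a sketch only, with the dynamic block flattened to a static one and just the kubernetes_network_config block shown; attribute names are taken from the configuration in the issue):

```
# Variant A: elastic_load_balancing block removed. Because the attribute is
# marked Optional+Computed, the SDK falls back to the known prior state, so
# the provider still sees elastic_load_balancing.enabled = true.
kubernetes_network_config {
  ip_family         = "ipv4"
  service_ipv4_cidr = var.eks.service_ipv4_cidr
}

# Variant B: block kept with an explicit false, so the provider sees the
# intended value instead of the stale prior state (per the comments below,
# compute_config and storage_config need the same treatment).
kubernetes_network_config {
  ip_family         = "ipv4"
  service_ipv4_cidr = var.eks.service_ipv4_cidr

  elastic_load_balancing {
    enabled = false
  }
}
```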

@brianpham

Does anyone have a workaround for this yet?

@flostadler
Contributor

flostadler commented Jan 29, 2025

So there actually seem to be multiple issues involved here when disabling auto mode:

  1. If you remove all the auto mode config blocks, terraform-plugin-sdk#1101 ("Change not triggered for optional / computed string field when set to empty string") strikes and kubernetes_network_config resolves to the previous value with enabled = true, which produces: Error: compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false.
  2. If you modify any of the compute_config options (e.g. removing node_role_arn) while setting enabled = false, the provider incorrectly plans a replacement.
  3. If you try to work around the previous two issues by removing the compute_config and storage_config blocks while setting kubernetes_network_config.elastic_load_balancing.enabled = false, you get an API error from AWS: InvalidParameterException: The type for cluster update was not provided. This seems to be caused by the request not including the compute_config and storage_config properties.

So the only way I can see right now for disabling auto mode is to do the following:

  1. Set enabled = false for all auto mode config options, and do not modify any other auto mode config (e.g. node_role_arn); see the sketch after this list.
  2. You cannot then remove the node_role_arn because of issues 1 & 3 above, but at least auto mode is disabled.
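
A minimal sketch of what step 1 looks like against the reporter's resource (names such as var.eks.compute_config and aws_iam_role.eks_auto_node_role come from the configuration posted above; unrelated cluster attributes are omitted, and this is a workaround sketch rather than a confirmed fix):

```
resource "aws_eks_cluster" "this" {
  # name, role_arn, version, vpc_config, etc. stay exactly as in the
  # configuration posted above; only the three auto mode blocks change.

  compute_config {
    enabled       = false
    node_pools    = var.eks.compute_config.node_pools      # keep as-is for now
    node_role_arn = aws_iam_role.eks_auto_node_role[0].arn # keep as-is for now
  }

  kubernetes_network_config {
    ip_family         = "ipv4"
    service_ipv4_cidr = var.eks.service_ipv4_cidr

    elastic_load_balancing {
      enabled = false
    }
  }

  storage_config {
    block_storage {
      enabled = false
    }
  }
}
```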

I started working on a fix for the "The type for cluster update was not provided" error.

flostadler added a commit to pulumi/pulumi-eks that referenced this issue Jan 29, 2025
The diff customizer of the upstream provider is not taking possibly
unknown values into account (see
https://github.com/hashicorp/terraform-provider-aws/blob/ae93494f39ba70fe442e891caf05f8df21bde1ac/internal/service/eks/cluster.go#L1776-L1791),
which causes failures like this one:
```
* compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false
```
This happens because `unknown` attributes just default to their empty
values in the diff customizer.

This change acts as a hot fix until we can properly fix upstream's behavior
here. The current diff for auto mode is quite broken and needs some more
involved fixes (see
hashicorp/terraform-provider-aws#40582).

Fixes #1597
@flostadler
Contributor

Created a PR that aims to fix the "The type for cluster update was not provided" error: #41155
