fix: Forces cluster outputs to wait until access entries are complete #3000
Conversation
Bloody pre-commit hooks, I hates them so much. Fine, hold on, waiting on a test of the change to finish up (EKS is so slow!)... Edit: Ok, so it was just the PR title lol...
could you also remove these commands from the example READMEs since they aren't needed anymore (thank goodness)
thank you!
## [20.8.5](v20.8.4...v20.8.5) (2024-04-08)

### Bug Fixes

* Forces cluster outputs to wait until access entries are complete ([#3000](#3000)) ([e2a39c0](e2a39c0))
This PR is included in version 20.8.5 🎉
It seems permissions are not completely propagated when Terraform says "Creation completed"; I had to set an explicit dependency on the EKS module to try creating kubernetes resources after the 30s wait time...
@llavaud I had seen some funny business on apply when using this eks module in conjunction with the eks-blueprints-addons module, and specifically the Module-level
access entry creation is asynchronous and therefore should be treated as eventually consistent. However, Terraform operates in very much a synchronous manner. Deploying applications via Terraform is not recommended, but the second-best alternative is for the providers to implement retries similar to the kubectl provider
@bryantbiggs Any idea if there is an issue for that on the
Most likely, and here is the argument https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs#apply_retry_count
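For reference, a minimal sketch of wiring that argument in (the provider version pin and retry count are illustrative assumptions, and the cluster connection arguments are elided; the fuller provider setup appears later in this thread):

```hcl
terraform {
  required_providers {
    # The retry-capable kubectl provider discussed above (assumed version constraint)
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.14"
    }
  }
}

provider "kubectl" {
  # Retry applies so eventually-consistent cluster/auth state doesn't fail the first attempt
  apply_retry_count = 10
  # ... host, cluster_ca_certificate, token, load_config_file, etc. ...
}
```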
Ahhh, that's what that argument is doing for me (in this config I inherited)! 😁
@lorengordon I just create some resources using the kubernetes provider
can the kubectl provider apply helm charts? hmm.... |
Here's an issue to follow on retry support in the helm provider... hashicorp/terraform-provider-helm#1058
Ok, I think the problem is that the cluster outputs are still becoming available before fargate and node group resources are complete, and providers that deploy resources into those node groups are starting waaay too early. I've got a branch I'm testing that just adds those modules to the same outputs we did in this PR. If that works cleanly a few times, I'll open a PR. Probably get to it Monday...
That's fine - Kubernetes handles that. Terraform, however, does not when you have things like
That's all well and good, and technically correct I suppose. Hard to argue with. However, the eks-blueprints project, and the examples in this project, both really need to discuss how to actually do that then. It feels like there are a lot of configs that are really misleading about how they work and whether they're even a good idea. Or we could just fix it so it actually works. Anyway, while my other hypothesis may be one problem, there is another race condition causing the node group to be created before the ENIConfig is ready, which results in this error in the EKS console:
And this error in
That's even with the vpc-cni configured for "before_compute"...
If you are using the ENIConfig for secondary CIDR, we just shipped https://aws.amazon.com/about-aws/whats-new/2024/04/amazon-vpc-cni-automatic-subnet-discovery/ which I believe removes the need to set/use the ENIConfig
I'm reading that article, and the blog post, and trying to understand how to make it actionable in the context of this module, but I think I'm really too green with EKS/k8s to understand what it's saying. I inherited this EKS config, and I'm just trying to make it work more reliably. I know terraform really quite well, and know how to manipulate the DAG to make these implicit dependencies more explicit. But EKS/k8s I do not know at all. :(
tl;dr - before, you might create ENIConfigs like:

```yaml
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: $az_1
spec:
  securityGroups:
    - $cluster_security_group_id
  subnet: $new_subnet_id_1
```

But now, all you need to do is tag your secondary subnets with:

```hcl
tags = {
  "kubernetes.io/role/cni" = 1
}
```

And the VPC CNI will discover these subnets and start utilizing them.
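As a hedged illustration of where that tag lives in Terraform (the resource name, CIDRs, and availability zones below are placeholders, not taken from this thread):

```hcl
# Sketch only: tag secondary-CIDR subnets so VPC CNI subnet discovery picks
# them up, instead of creating ENIConfig objects per availability zone.
resource "aws_subnet" "secondary" {
  for_each = {
    "us-east-1a" = "100.64.0.0/18"
    "us-east-1b" = "100.64.64.0/18"
  }

  vpc_id            = aws_vpc.this.id
  availability_zone = each.key
  cidr_block        = each.value

  tags = {
    "kubernetes.io/role/cni" = 1
  }
}
```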
Aha, that was the clue I needed. Ok, thanks @bryantbiggs, sorry I was being so dense! Now I see what is causing the dependency ordering issue. The config I inherited is using the kubectl provider to create that ENIConfig, and the change I was proposing is preventing it from starting until after the node group is created. But the node group config requires that resource before it starts in order to complete successfully. Something like this, then:
I'll see what I can do using the
Just to be clear, you do not need the ENIConfig if you use the recently released subnet discoverability feature as defined in this and its subsequent links
Yep understood. It may be a little academic at this point, but since I thought of it now I just want to see if I can influence the timing of the providers and their resources using
Nifty, yeah this seems to work just fine... For the

```hcl
# Use terraform_data inputs to create resources after compute
provider "kubernetes" {
  host                   = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
  cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}

# Use terraform_data inputs to create resources after compute
provider "helm" {
  kubernetes {
    host                   = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
    cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}

# Use module outputs directly to create resources before compute
provider "kubectl" {
  apply_retry_count      = 10
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  load_config_file       = false
  token                  = data.aws_eks_cluster_auth.this.token
}

# Force a dependency between the EKS cluster auth inputs, and the compute resources
resource "terraform_data" "eks_cluster_after_compute" {
  input = {
    cluster_endpoint                   = module.eks.cluster_endpoint
    cluster_certificate_authority_data = module.eks.cluster_certificate_authority_data
    fargate_profiles                   = module.eks.fargate_profiles
    eks_managed_node_groups            = module.eks.eks_managed_node_groups
    self_managed_node_groups           = module.eks.self_managed_node_groups
  }
}
```

And it does actually work to thread tags back through from the

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.8.5"
  # ...
  eks_managed_node_groups = {
    foo = {
      # ...
      tags = merge(
        { for label, config in kubectl_manifest.eni_config : "eni-config/${label}" => config.id },
        local.tags
      )
    }
  }
}
```

Though, for the moment, I chose not to thread the tags through that way because of the potential for recursive references, instead just relying on
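For context, here is a rough sketch of what a `kubectl_manifest.eni_config` resource like the one referenced in the tags above might look like; the `local.secondary_subnets` map and the security group output chosen are assumptions, not taken from this thread:

```hcl
# Sketch only: one ENIConfig per availability zone, created with the kubectl
# provider that reads module.eks outputs directly (i.e. "before compute").
resource "kubectl_manifest" "eni_config" {
  for_each = local.secondary_subnets # e.g. { "us-east-1a" = "subnet-0123abcd", ... }

  yaml_body = yamlencode({
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = each.key
    }
    spec = {
      securityGroups = [module.eks.cluster_primary_security_group_id]
      subnet         = each.value
    }
  })
}
```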
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Description
This changeset adds `depends_on` to a few cluster outputs. This will tell terraform that these cluster outputs should not be made "known" and available until all the EKS access entries have completed. The outputs selected in this PR are limited to those used for authenticating to the EKS cluster. If it turns out more are needed, it is easy enough to add them, and I am not aware of any further repercussions to this particular usage of `depends_on`.
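As a minimal sketch of the pattern (not the module's literal `outputs.tf`; the resource names here are assumed for illustration):

```hcl
# Sketch only: the output value is unchanged, but depends_on keeps it from
# becoming "known" until the access entry resources have finished creating.
output "cluster_endpoint" {
  description = "Endpoint for the EKS Kubernetes API server"
  value       = aws_eks_cluster.this.endpoint

  depends_on = [
    aws_eks_access_entry.this,
    aws_eks_access_policy_association.this,
  ]
}
```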
Motivation and Context
Without this change, it can be difficult to order the terraform lifecycle on create and destroy when also using the helm, kubectl, and/or kubernetes providers. It requires using `state rm` and `-target` to get things to complete successfully. With this change, you can just apply and destroy normally. See also the comment and issue thread here: #2923 (comment)
Breaking Changes
None
How Has This Been Tested?
- I have updated at least one of the `examples/*` to demonstrate and validate my change(s)
- I have tested and validated these changes using one or more of the provided `examples/*` projects
- I have executed `pre-commit run -a` on my pull request

I also ran `terraform apply -auto-approve && terraform destroy -auto-approve` on the "complete" example in the blueprints-addons project, three times successfully, without using any `state rm` or target steps.