Spins up a complete EKS cluster with all necessary components. These include:
- vpc (NOTE: the vpc submodule has moved into a separate repo: https://github.com/dasmeta/terraform-aws-vpc)
- eks cluster
- alb ingress controller
- fluentbit
- external secrets
- metrics to cloudwatch
- Upgrading from <2.19.0 to >=2.19.0 requires some manual actions, as we upgraded the underlying eks module from 18.x.x to 20.x.x.
Here you can find the needed actions/changes, docs, and ready-to-use scripts:
docs:
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-19.0.md
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-20.0.md
params:
The node group `create_launch_template=false` and `launch_template_name=""` parameter pair has been replaced with `use_custom_launch_template=false`.
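For illustration, a node group definition that previously used the old pair would now be written roughly like this (a sketch; the node group name and other attributes are placeholders):

```hcl
node_groups = {
  example = {
    # old (removed in >=2.19.0):
    # create_launch_template = false
    # launch_template_name   = ""

    # new:
    use_custom_launch_template = false
  }
}
```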
scripts:
```sh
# commands to move some states, run before applying the `terraform apply` for new version
terraform state mv "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.kubernetes_config_map_v1_data.aws_auth[0]" "module.<eks-module-name>.module.eks-cluster[0].module.aws_auth_config_map.kubernetes_config_map_v1_data.aws_auth[0]"
terraform state mv "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.aws_security_group_rule.node[\"ingress_cluster_9443\"]" "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.aws_security_group_rule.node[\"ingress_cluster_9443_webhook\"]"
terraform state mv "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.aws_security_group_rule.node[\"ingress_cluster_8443\"]" "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.aws_security_group_rule.node[\"ingress_cluster_8443_webhook\"]"

# command to run in case upgrading from <2.14.6 version, run before applying the `terraform apply` for new version
terraform state rm "module.<eks-module-name>.module.autoscaler[0].aws_iam_policy.policy"

# command to run when apply fails to create the existing resource "<eks-cluster-name>:arn:aws:iam::<aws-account-id>:role/aws-reserved/sso.amazonaws.com/eu-central-1/AWSReservedSSO_AdministratorAccess_<some-hash>"
terraform import "module.<eks-module-name>.module.eks-cluster[0].module.eks-cluster.aws_eks_access_entry.this[\"cluster_creator\"]" "<eks-cluster-name>:arn:aws:iam::<aws-account-id>:role/aws-reserved/sso.amazonaws.com/eu-central-1/AWSReservedSSO_AdministratorAccess_<some-hash>"

# command to apply when secret store fails to be linked, probably there will be need to remove the resource
terraform import "module.secret_store.kubectl_manifest.main" external-secrets.io/v1beta1//SecretStore//app-test//default
```
- Upgrading from <2.20.0 to >=2.20.0:
  - In case karpenter is enabled:
    the karpenter chart has been upgraded and CRD creation has been moved into a separate chart, so the following kubectl commands need to be run before applying the module update:
```sh
kubectl patch crd ec2nodeclasses.karpenter.k8s.aws -p '{"metadata":{"labels":{"app.kubernetes.io/managed-by":"Helm"},"annotations":{"meta.helm.sh/release-name":"karpenter-crd","meta.helm.sh/release-namespace":"karpenter"}}}'
kubectl patch crd nodeclaims.karpenter.sh -p '{"metadata":{"labels":{"app.kubernetes.io/managed-by":"Helm"},"annotations":{"meta.helm.sh/release-name":"karpenter-crd","meta.helm.sh/release-namespace":"karpenter"}}}'
kubectl patch crd nodepools.karpenter.sh -p '{"metadata":{"labels":{"app.kubernetes.io/managed-by":"Helm"},"annotations":{"meta.helm.sh/release-name":"karpenter-crd","meta.helm.sh/release-namespace":"karpenter"}}}'
```
  - The alb ingress/load-balancer controller variables have been moved under one variable, `alb_load_balancer_controller`, so you have to change the old way of passing this config (if you passed these variables manually). The moved variables are: `enable_alb_ingress_controller`, `enable_waf_for_alb`, `alb_log_bucket_name`, `alb_log_bucket_path`, `send_alb_logs_to_cloudwatch`. See the sketch below.
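As a rough illustration of the migration (the field names of the `alb_load_balancer_controller` object are hypothetical here; check the module's `variables.tf` for the real schema):

```hcl
# Before (old top-level variables):
# enable_alb_ingress_controller = true
# enable_waf_for_alb            = false
# alb_log_bucket_name           = "my-alb-logs"
# alb_log_bucket_path           = "alb"
# send_alb_logs_to_cloudwatch   = true

# After: the same settings grouped under one object.
# NOTE: field names below are illustrative placeholders.
alb_load_balancer_controller = {
  enabled                 = true
  enable_waf              = false
  log_bucket_name         = "my-alb-logs"
  log_bucket_path         = "alb"
  send_logs_to_cloudwatch = true
}
```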
data "aws_availability_zones" "available" {}
locals {
cluster_endpoint_public_access = true
cluster_enabled_log_types = ["audit"]
vpc = {
create = {
name = "dev"
availability_zones = data.aws_availability_zones.available.names
private_subnets = ["172.16.1.0/24", "172.16.2.0/24", "172.16.3.0/24"]
public_subnets = ["172.16.4.0/24", "172.16.5.0/24", "172.16.6.0/24"]
cidr = "172.16.0.0/16"
public_subnet_tags = {
"kubernetes.io/cluster/dev" = "shared"
"kubernetes.io/role/elb" = "1"
}
private_subnet_tags = {
"kubernetes.io/cluster/dev" = "shared"
"kubernetes.io/role/internal-elb" = "1"
}
}
}
cluster_name = "your-cluster-name-goes-here"
alb_log_bucket_name = "your-log-bucket-name-goes-here"
fluent_bit_name = "fluent-bit"
log_group_name = "fluent-bit-cloudwatch-env"
}
```hcl
# (Basic usage with example of using an already created VPC)
data "aws_availability_zones" "available" {}

locals {
  cluster_endpoint_public_access = true
  cluster_enabled_log_types      = ["audit"]

  vpc = {
    link = {
      id                 = "vpc-1234"
      private_subnet_ids = ["subnet-1", "subnet-2"]
    }
  }

  cluster_name        = "your-cluster-name-goes-here"
  alb_log_bucket_name = "your-log-bucket-name-goes-here"
  fluent_bit_name     = "fluent-bit"
  log_group_name      = "fluent-bit-cloudwatch-env"
}
```
```hcl
# Minimum
module "cluster_min" {
  source  = "dasmeta/eks/aws"
  version = "0.1.1"

  cluster_name = local.cluster_name
  users        = local.users

  vpc = {
    link = {
      id                 = "vpc-1234"
      private_subnet_ids = ["subnet-1", "subnet-2"]
    }
  }
}
```
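The components inside the module are installed via the helm/kubernetes providers, which are typically wired up from the module outputs (`cluster_host`, `cluster_certificate`, `cluster_token`). A minimal sketch, assuming the `cluster_min` module above and that `cluster_certificate` is the decoded CA certificate (wrap it in `base64decode()` if your module version returns it encoded):

```hcl
# Provider wiring sketch using the module outputs.
provider "kubernetes" {
  host                   = module.cluster_min.cluster_host
  cluster_ca_certificate = module.cluster_min.cluster_certificate
  token                  = module.cluster_min.cluster_token
}

provider "helm" {
  kubernetes {
    host                   = module.cluster_min.cluster_host
    cluster_ca_certificate = module.cluster_min.cluster_certificate
    token                  = module.cluster_min.cluster_token
  }
}
```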
```hcl
# Max @TODO: the max param passing setup needs to be checked/fixed
module "cluster_max" {
  source  = "dasmeta/eks/aws"
  version = "0.1.1"

  ### VPC
  vpc = {
    create = {
      name               = "dev"
      availability_zones = data.aws_availability_zones.available.names
      private_subnets    = ["172.16.1.0/24", "172.16.2.0/24", "172.16.3.0/24"]
      public_subnets     = ["172.16.4.0/24", "172.16.5.0/24", "172.16.6.0/24"]
      cidr               = "172.16.0.0/16"

      public_subnet_tags = {
        "kubernetes.io/cluster/dev" = "shared"
        "kubernetes.io/role/elb"    = "1"
      }

      private_subnet_tags = {
        "kubernetes.io/cluster/dev"       = "shared"
        "kubernetes.io/role/internal-elb" = "1"
      }
    }
  }

  cluster_enabled_log_types      = local.cluster_enabled_log_types
  cluster_endpoint_public_access = local.cluster_endpoint_public_access

  ### EKS
  cluster_name    = local.cluster_name
  manage_aws_auth = true

  # IAM users' username and group. By default the group is ["system:masters"].
  users = [
    {
      username = "devops1"
      group    = ["system:masters"]
    },
    {
      username = "devops2"
      group    = ["system:kube-scheduler"]
    },
    {
      username = "devops3"
    }
  ]

  # You can use node_groups to create nodes in specific subnets/zones (note: in this case the EC2 instances don't get a specific Name).
  # Otherwise you can use the worker_groups variable.
  node_groups = {
    example = {
      name        = "nodegroup"
      name-prefix = "nodegroup"
      additional_tags = {
        "Name"     = "node"
        "ExtraTag" = "ExtraTag"
      }

      instance_type          = "t3.xlarge"
      max_capacity           = 1
      disk_size              = 50
      create_launch_template = false
      subnet                 = ["subnet_id"]
    }
  }
  node_groups_default = {
    disk_size      = 50
    instance_types = ["t3.medium"]
  }

  worker_groups = {
    default = {
      name             = "nodes"
      instance_type    = "t3.xlarge"
      asg_max_size     = 3
      root_volume_size = 50
    }
  }
  workers_group_defaults = {
    launch_template_use_name_prefix = true
    launch_template_name            = "default"
    root_volume_type                = "gp2"
    root_volume_size                = 50
  }

  ### ALB-INGRESS-CONTROLLER
  alb_log_bucket_name = local.alb_log_bucket_name

  ### FLUENT-BIT
  fluent_bit_name = local.fluent_bit_name
  log_group_name  = local.log_group_name

  # Should be refactored to be installed from the cluster: for prod it has been done from metrics-server.tf
  ### METRICS-SERVER
  # enable_metrics_server = false
  metrics_server_name = "metrics-server"
}
```
- If the vpc has been created externally (not inside this module), then you may need to set the tag `karpenter.sh/discovery=<cluster-name>` on the private subnets (see the sketch after this list).
- When enabling karpenter on an existing old cluster, there is a possibility of a cycle-dependency error; to overcome this you need to first apply the main eks module change (`terraform apply --target "module.<eks-module-name>.module.eks-cluster"`) and then the rest of the cluster-autoscaler destroy and karpenter install changes.
- When destroying a cluster which has karpenter enabled, there is a possibility of a failure on karpenter resource removal; you need to run the destruction one more time to get it completed.
- In order to be able to use spot instances you may need to create the AWSServiceRoleForEC2Spot IAM role on the aws account (TODO: check and create this role in the account module automatically), here is the doc: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/service-linked-roles-spot-instance-requests.html , otherwise the karpenter-created nodeclaim kubernetes resource will show an AuthFailure.ServiceLinkedRoleCreationNotPermitted error (see the sketch after this list).
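The following is a minimal sketch (not part of the module) covering the subnet tagging and spot service-linked role notes above; subnet IDs and the cluster name are placeholders:

```hcl
# Tag externally created private subnets so karpenter can discover them.
resource "aws_ec2_tag" "karpenter_discovery" {
  for_each = toset(["subnet-1", "subnet-2"])

  resource_id = each.value
  key         = "karpenter.sh/discovery"
  value       = "your-cluster-name-goes-here"
}

# Create the AWSServiceRoleForEC2Spot service-linked role so spot instances can be requested.
resource "aws_iam_service_linked_role" "spot" {
  aws_service_name = "spot.amazonaws.com"
}
```

The module-level karpenter configuration itself looks like the following example: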
module "eks" {
source = "dasmeta/eks/aws"
version = "3.x.x"
.....
karpenter = {
enabled = true
configs = {
replicas = 1
}
resource_configs_defaults = { # this is optional param, look into karpenter submodule to get available defaults
limits = {
cpu = 11 # the default is 10 and we can add limit restrictions on memory also
}
}
resource_configs = {
nodePools = {
general = { weight = 1 } # by default it use linux amd64 cpu<6, memory<10000Mi, >2 generation and ["spot", "on-demand"] type nodes so that it tries to get spot at first and if no then on-demand
}
}
}
.....
}
## Requirements

| Name | Version |
|------|---------|
| terraform | ~> 1.3 |
| aws | >= 3.31, < 6.0.0 |
| helm | >= 2.4.1 |
| kubectl | ~> 1.14 |

## Providers

| Name | Version |
|------|---------|
| aws | >= 3.31, < 6.0.0 |
| helm | >= 2.4.1 |
| kubernetes | n/a |
## Modules

| Name | Source | Version |
|------|--------|---------|
| adot | ./modules/adot | n/a |
| alb-ingress-controller | ./modules/aws-load-balancer-controller | n/a |
| api-gw-controller | ./modules/api-gw | n/a |
| autoscaler | ./modules/autoscaler | n/a |
| cloudwatch-metrics | ./modules/cloudwatch-metrics | n/a |
| cw_alerts | dasmeta/monitoring/aws//modules/alerts | 1.3.5 |
| ebs-csi | ./modules/ebs-csi | n/a |
| efs-csi-driver | ./modules/efs-csi | n/a |
| eks-cluster | ./modules/eks | n/a |
| external-dns | ./modules/external-dns | n/a |
| external-secrets | ./modules/external-secrets | n/a |
| flagger | ./modules/flagger | n/a |
| fluent-bit | ./modules/fluent-bit | n/a |
| karpenter | ./modules/karpenter | n/a |
| metrics-server | ./modules/metrics-server | n/a |
| nginx-ingress-controller | ./modules/nginx-ingress-controller/ | n/a |
| node-problem-detector | ./modules/node-problem-detector | n/a |
| olm | ./modules/olm | n/a |
| portainer | ./modules/portainer | n/a |
| priority_class | ./modules/priority-class/ | n/a |
| sso-rbac | ./modules/sso-rbac | n/a |
| vpc | dasmeta/vpc/aws | 1.0.1 |
| weave-scope | ./modules/weave-scope | n/a |
## Resources

| Name | Type |
|------|------|
| helm_release.cert-manager | resource |
| helm_release.kube-state-metrics | resource |
| kubernetes_namespace.meta-system | resource |
| aws_caller_identity.current | data source |
| aws_region.current | data source |
## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| account_id | AWS Account Id to apply changes into | `string` | `null` | no |
| additional_priority_classes | Defines Priority Classes in Kubernetes, used to assign different levels of priority to pods. By default, this module creates three Priority Classes: 'high' (1000000), 'medium' (500000) and 'low' (250000). You can also provide a custom list of Priority Classes if needed. | `list(object({` | `[]` | no |
| adot_config | accept_namespace_regex defines the list of namespaces from which metrics will be exported, and additional_metrics defines additional metrics to export. | `object({ kube-system)") additional_metrics = optional(list(string), []) log_group_name = optional(string, "adot") log_retention = optional(number, 14) helm_values = optional(any, null) logging_enable = optional(bool, false) resources = optional(object({ limit = object({ cpu = optional(string, "200m") memory = optional(string, "200Mi") }) requests = object({ cpu = optional(string, "200m") memory = optional(string, "200Mi") }) }), { limit = { cpu = "200m" memory = "200Mi" } requests = { cpu = "200m" memory = "200Mi" } }) })` | `{` | no |
| adot_version | The version of the AWS Distro for OpenTelemetry addon to use. If not passed it will get a compatible version based on cluster_version | `string` | `null` | no |
| alarms | Alarms are enabled by default; you need to set the sns topic name to send alarms to. To customize alarm thresholds use custom_values | `object({` | n/a | yes |
| alb_load_balancer_controller | Aws alb ingress/load-balancer controller configs. | `object({` | `{}` | no |
| api_gateway_resources | Nested map containing API, Stage, and VPC Link resources | `list(object({` | `[]` | no |
| api_gw_deploy_region | Region in which the API gateway will be configured | `string` | `""` | no |
| autoscaler_image_patch | The patch number of the autoscaler image | `number` | `0` | no |
| autoscaler_limits | n/a | `object({` | `{` | no |
| autoscaler_requests | n/a | `object({` | `{` | no |
| autoscaling | Whether to enable the cluster autoscaler for EKS; in case karpenter is enabled this config will be ignored and the cluster autoscaler will be considered disabled | `bool` | `true` | no |
| bindings | Variable which describes group and role binding | `list(object({` | `[]` | no |
| cert_manager_chart_version | The cert-manager helm chart version. | `string` | `"1.16.2"` | no |
| cluster_addons | Cluster addon configurations to enable. | `any` | `{}` | no |
| cluster_enabled_log_types | A list of the desired control plane logs to enable. For more information, see the Amazon EKS Control Plane Logging documentation (https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html) | `list(string)` | `[]` | no |
| cluster_endpoint_public_access | n/a | `bool` | `true` | no |
| cluster_name | Name of the EKS cluster to create. | `string` | n/a | yes |
| cluster_version | Allows to set/change the kubernetes cluster version; the kubernetes version needs to be updated at least once a year. Please check here for available versions https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html | `string` | `"1.29"` | no |
| create | Whether to create the cluster and other resources or not | `bool` | `true` | no |
| create_cert_manager | If enabled it always gets deployed to the cert-manager namespace. | `bool` | `false` | no |
| ebs_csi_version | EBS CSI driver addon version; by default it will pick the right version for this driver based on cluster_version | `string` | `null` | no |
| efs_id | EFS filesystem id in AWS | `string` | `null` | no |
| efs_storage_classes | Additional storage class configurations: by default, 2 storage classes are created - efs-sc and efs-sc-root which has 0 uid. One can add other storage classes besides these 2. | `list(object({` | `[]` | no |
| enable_api_gw_controller | Whether to enable the API-GW controller or not | `bool` | `false` | no |
| enable_ebs_driver | Whether to enable the EBS-CSI driver or not | `bool` | `true` | no |
| enable_efs_driver | Whether to install the EFS driver in EKS or not | `bool` | `false` | no |
| enable_external_secrets | Whether to enable the external-secrets operator | `bool` | `true` | no |
| enable_kube_state_metrics | Enable kube-state-metrics | `bool` | `false` | no |
| enable_metrics_server | METRICS-SERVER | `bool` | `false` | no |
| enable_node_problem_detector | n/a | `bool` | `true` | no |
| enable_olm | To install the OLM controller (experimental). | `bool` | `false` | no |
| enable_portainer | Enable Portainer provisioning or not | `bool` | `false` | no |
| enable_sso_rbac | Enable SSO RBAC integration or not | `bool` | `false` | no |
| external_dns | Allows to install the external-dns helm chart and related roles, which allows to automatically create R53 records based on ingress/service domain/host configs | `object({` | `{` | no |
| external_secrets_namespace | The namespace of the external-secrets operator | `string` | `"kube-system"` | no |
| flagger | Allows to create/deploy the flagger operator to have custom rollout strategies like canary/blue-green, and also allows to create custom flagger metric templates | `object({` | `{` | no |
| fluent_bit_configs | Fluent Bit configs | `object({` | `{` | no |
| karpenter | Allows to create/deploy/configure the karpenter operator and its resources to have custom node auto-scaling | `object({` | `{` | no |
| kube_state_metrics_chart_version | The kube-state-metrics chart version | `string` | `"5.27.0"` | no |
| manage_aws_auth | n/a | `bool` | `true` | no |
| map_roles | Additional IAM roles to add to the aws-auth configmap. | `list(object({` | `[]` | no |
| metrics_exporter | Metrics Exporter, can use cloudwatch or adot | `string` | `"adot"` | no |
| metrics_server_name | n/a | `string` | `"metrics-server"` | no |
| nginx_ingress_controller_config | Nginx ingress controller configs | `object({` | `{` | no |
| node_groups | Map of EKS managed node group definitions to create | `any` | `{` | no |
| node_groups_default | Map of EKS managed node group default configurations | `any` | `{` | no |
| node_security_group_additional_rules | n/a | `any` | `{` | no |
| portainer_config | Portainer hostname and ingress config. | `object({` | `{}` | no |
| prometheus_metrics | Prometheus Metrics | `any` | `[]` | no |
| region | AWS Region name. | `string` | `null` | no |
| roles | Variable describes which role a user will have in K8s | `list(object({` | `[]` | no |
| scale_down_unneeded_time | Scale down unneeded time in minutes | `number` | `2` | no |
| tags | Extra tags to attach to the eks cluster. | `any` | `{}` | no |
| users | List of users to open eks cluster api access | `list(any)` | `[]` | no |
| vpc | VPC configuration for eks; we support both cases: creating a new vpc (create field) and using an already created one (link field) | `object({` | n/a | yes |
| weave_scope_config | Weave scope namespace configuration variables | `object({` | `{` | no |
| weave_scope_enabled | Whether to enable Weave Scope or not | `bool` | `false` | no |
| worker_groups | Worker groups. | `any` | `{}` | no |
| workers_group_defaults | Worker group defaults. | `any` | `{` | no |
## Outputs

| Name | Description |
|------|-------------|
| cluster_certificate | EKS cluster certificate used for authentication/access in helm/kubectl/kubernetes providers |
| cluster_host | EKS cluster host name used for authentication/access in helm/kubectl/kubernetes providers |
| cluster_iam_role_name | n/a |
| cluster_id | n/a |
| cluster_primary_security_group_id | n/a |
| cluster_security_group_id | n/a |
| cluster_token | EKS cluster token used for authentication/access in helm/kubectl/kubernetes providers |
| eks_auth_configmap | n/a |
| eks_module | n/a |
| eks_oidc_root_ca_thumbprint | Grab eks_oidc_root_ca_thumbprint from oidc_provider_arn. |
| map_user_data | n/a |
| oidc_provider_arn | ## CLUSTER |
| role_arns | n/a |
| role_arns_without_path | n/a |
| vpc_cidr_block | The cidr block of the vpc |
| vpc_default_security_group_id | The ID of the default security group created for the vpc |
| vpc_id | The newly created vpc id |
| vpc_nat_public_ips | The list of elastic public IPs for the vpc |
| vpc_private_subnets | The newly created vpc private subnet IDs list |
| vpc_public_subnets | The newly created vpc public subnet IDs list |