
Removal of autoscaling IAM policy related stuff #716

Merged
merged 10 commits on Feb 4, 2020
10 changes: 8 additions & 2 deletions CHANGELOG.md
@@ -9,9 +9,14 @@ project adheres to [Semantic Versioning](http://semver.org/).

## [[v8.?.?](https://github.com/terraform-aws-modules/terraform-aws-eks/compare/v8.2.0...HEAD)] - 2020-xx-xx]

- Write your awesome change here (by @you)
- [CI] Switch `Validate` github action to use env vars (by @max-rocket-internet)
- [CI] Bump pre-commit-terraform version (by @barryib)
- Added example `examples/irsa` for IAM Roles for Service Accounts (by @max-rocket-internet)
- **Breaking:** Removal of autoscaling IAM policy and tags (by @max-rocket-internet)

#### Important notes

Autoscaling policy and tags have been removed from this module. This reduces complexity and improves security, since the policy was attached to the worker node IAM role and was therefore available to every pod running on the nodes. To manage autoscaling permissions outside of this module, either follow the example in `examples/irsa` to attach an IAM role to the cluster-autoscaler `serviceAccount`, or create the policy outside this module and pass it in using the `workers_additional_policies` variable (see the sketch below).
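
For example, a policy created outside the module can be attached like this (a minimal sketch; `aws_iam_policy.cluster_autoscaler` is a hypothetical policy you define yourself, and it must not be built from this module's outputs or Terraform will report a dependency cycle):

```hcl
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  # ... cluster_name, subnets, vpc_id, worker_groups, etc. ...

  # Attach the externally managed policy to the default worker IAM role.
  workers_additional_policies = [aws_iam_policy.cluster_autoscaler.arn]
}
```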

# History

@@ -20,7 +25,8 @@ project adheres to [Semantic Versioning](http://semver.org/).
- Include ability to configure custom os-specific command for waiting until kube cluster is healthy (@sanjeevgiri)
- Disable creation of ingress rules if worker node security groups already exist (@andjelx)
- [CI] Update pre-commit and re-generate docs to work with terraform-docs >= 0.8.1 (@barryib)
- Added example `examples/irsa` for IAM Roles for Service Accounts (by @max-rocket-internet)

# History

## [[v8.1.0](https://github.com/terraform-aws-modules/terraform-aws-eks/compare/v8.0.0...v8.1.0)] - 2020-01-17]

10 changes: 3 additions & 7 deletions README.md
@@ -161,14 +161,13 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:-----:|
| attach\_worker\_autoscaling\_policy | Whether to attach the module managed cluster autoscaling iam policy to the default worker IAM role. This requires `manage_worker_autoscaling_policy = true` | `bool` | `true` | no |
| attach\_worker\_cni\_policy | Whether to attach the Amazon managed `AmazonEKS_CNI_Policy` IAM policy to the default worker IAM role. WARNING: If set `false` the permissions must be assigned to the `aws-node` DaemonSet pods via another method or nodes will not be able to join the cluster. | `bool` | `true` | no |
| cluster\_create\_timeout | Timeout value when creating the EKS cluster. | `string` | `"15m"` | no |
| cluster\_delete\_timeout | Timeout value when deleting the EKS cluster. | `string` | `"15m"` | no |
| cluster\_enabled\_log\_types | A list of the desired control plane logging to enable. For more information, see Amazon EKS Control Plane Logging documentation (https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html) | `list(string)` | `[]` | no |
| cluster\_endpoint\_private\_access | Indicates whether or not the Amazon EKS private API server endpoint is enabled. | `bool` | `false` | no |
| cluster\_endpoint\_public\_access | Indicates whether or not the Amazon EKS public API server endpoint is enabled. | `bool` | `true` | no |
| cluster\_endpoint\_public\_access\_cidrs | List of CIDR blocks which can access the Amazon EKS public API server endpoint. | `list(string)` | <pre>[<br> "0.0.0.0/0"<br>]<br></pre> | no |
| cluster\_endpoint\_public\_access\_cidrs | List of CIDR blocks which can access the Amazon EKS public API server endpoint. | `list(string)` | <pre>[<br> "0.0.0.0/0"<br>]</pre> | no |
| cluster\_iam\_role\_name | IAM role name for the cluster. Only applicable if manage\_cluster\_iam\_resources is set to false. | `string` | `""` | no |
| cluster\_log\_kms\_key\_id | If a KMS Key ARN is set, this key will be used to encrypt the corresponding log group. Please be sure that the KMS Key has an appropriate key policy (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/encrypt-log-data-kms.html) | `string` | `""` | no |
| cluster\_log\_retention\_in\_days | Number of days to retain log events. Default retention - 90 days. | `number` | `90` | no |
@@ -187,11 +186,10 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| kubeconfig\_name | Override the default name used for the kubeconfig file. | `string` | `""` | no |
| manage\_aws\_auth | Whether to apply the aws-auth configmap file. | `bool` | `true` | no |
| manage\_cluster\_iam\_resources | Whether to let the module manage cluster IAM resources. If set to false, cluster\_iam\_role\_name must be specified. | `bool` | `true` | no |
| manage\_worker\_autoscaling\_policy | Whether to let the module manage the cluster autoscaling iam policy. | `bool` | `true` | no |
| manage\_worker\_iam\_resources | Whether to let the module manage worker IAM resources. If set to false, iam\_instance\_profile\_name must be specified for workers. | `bool` | `true` | no |
| map\_accounts | Additional AWS account numbers to add to the aws-auth configmap. See examples/basic/variables.tf for example format. | `list(string)` | `[]` | no |
| map\_roles | Additional IAM roles to add to the aws-auth configmap. See examples/basic/variables.tf for example format. | <pre>list(object({<br> rolearn = string<br> username = string<br> groups = list(string)<br> }))<br></pre> | `[]` | no |
| map\_users | Additional IAM users to add to the aws-auth configmap. See examples/basic/variables.tf for example format. | <pre>list(object({<br> userarn = string<br> username = string<br> groups = list(string)<br> }))<br></pre> | `[]` | no |
| map\_roles | Additional IAM roles to add to the aws-auth configmap. See examples/basic/variables.tf for example format. | <pre>list(object({<br> rolearn = string<br> username = string<br> groups = list(string)<br> }))</pre> | `[]` | no |
| map\_users | Additional IAM users to add to the aws-auth configmap. See examples/basic/variables.tf for example format. | <pre>list(object({<br> userarn = string<br> username = string<br> groups = list(string)<br> }))</pre> | `[]` | no |
| node\_groups | Map of maps of node groups to create. See `node_groups` module's documentation for more details | `any` | `{}` | no |
| node\_groups\_defaults | Map of values to be applied to all node groups. See `node_groups` module's documentation for more details | `any` | `{}` | no |
| permissions\_boundary | If provided, all IAM roles will be created with this permissions boundary attached. | `string` | n/a | yes |
@@ -233,8 +231,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| kubeconfig\_filename | The filename of the generated kubectl config. |
| node\_groups | Outputs from EKS node groups. Map of maps, keyed by var.node\_groups keys |
| oidc\_provider\_arn | The ARN of the OIDC Provider if `enable_irsa = true`. |
| worker\_autoscaling\_policy\_arn | ARN of the worker autoscaling IAM policy if `manage_worker_autoscaling_policy = true` |
| worker\_autoscaling\_policy\_name | Name of the worker autoscaling IAM policy if `manage_worker_autoscaling_policy = true` |
| worker\_iam\_instance\_profile\_arns | default IAM instance profile ARN for EKS worker groups |
| worker\_iam\_instance\_profile\_names | default IAM instance profile name for EKS worker groups |
| worker\_iam\_role\_arn | default IAM role ARN for EKS worker groups |
76 changes: 66 additions & 10 deletions docs/autoscaling.md
@@ -1,12 +1,72 @@
# Autoscaling

Autoscaling of worker nodes can be easily enabled by setting the `autoscaling_enabled` variable to `true` for a worker group in the `worker_groups` map.
This will add the required tags to the autoscaling group for the [cluster-autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler).
One should also set `protect_from_scale_in` to `true` for such worker groups, to ensure that cluster-autoscaler is solely responsible for scaling events.
To enable worker node autoscaling you will need to do a few things:

You will also need to install the cluster-autoscaler into your cluster. The easiest way to do this is with [helm](https://helm.sh/).
- Add the [required tags](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws#auto-discovery-setup) to the worker group (see the sketch after this list)
- Install the cluster-autoscaler
- Give the cluster-autoscaler access via an IAM policy
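
For the tagging step, the auto-discovery tags can be added to a worker group roughly like this (a sketch mirroring `examples/irsa`; the worker group values are placeholders, and `local.cluster_name` is assumed to be defined):

```hcl
module "my_cluster" {
  source = "terraform-aws-modules/eks/aws"

  # ... cluster_name, subnets, vpc_id, etc. ...

  worker_groups = [
    {
      name          = "worker-group-1"
      instance_type = "m4.large"
      asg_max_size  = 5

      # These tags let the cluster-autoscaler discover the ASG.
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    }
  ]
}
```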

The [helm chart](https://github.com/helm/charts/tree/master/stable/cluster-autoscaler) for the cluster-autoscaler requires some specific settings to work in an EKS cluster. These settings are supplied via a YAML values file when installing the helm chart. Here is an example values file:
It's probably easiest to follow the example in [examples/irsa](../examples/irsa); this installs the cluster-autoscaler using [Helm](https://helm.sh/) and uses IRSA to attach a policy.
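
In outline, the IRSA wiring looks something like the following (a sketch, not the full example: it assumes the `iam-assumable-role-with-oidc` submodule from terraform-aws-modules/iam, reuses the `aws_iam_policy.worker_autoscaling` resource defined below, and the service account subject must match the one the Helm chart creates):

```hcl
module "iam_assumable_role_autoscaler" {
  source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"

  create_role = true
  role_name   = "cluster-autoscaler"

  # Trust this cluster's OIDC provider, scoped to the cluster-autoscaler service account.
  provider_url                  = replace(module.my_cluster.cluster_oidc_issuer_url, "https://", "")
  role_policy_arns              = [aws_iam_policy.worker_autoscaling.arn]
  oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:cluster-autoscaler"]
}
```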

If you don't want to use IRSA, you will need to attach the IAM policy to the worker node IAM role, or add AWS credentials to the cluster-autoscaler environment variables. Here is some example Terraform code for the policy:

```hcl
resource "aws_iam_role_policy_attachment" "workers_autoscaling" {
  policy_arn = aws_iam_policy.worker_autoscaling.arn
  role       = module.my_cluster.worker_iam_role_name[0]
}

resource "aws_iam_policy" "worker_autoscaling" {
  name_prefix = "eks-worker-autoscaling-${module.my_cluster.cluster_id}"
  description = "EKS worker node autoscaling policy for cluster ${module.my_cluster.cluster_id}"
  policy      = data.aws_iam_policy_document.worker_autoscaling.json
  path        = var.iam_path
}

data "aws_iam_policy_document" "worker_autoscaling" {
  statement {
    sid    = "eksWorkerAutoscalingAll"
    effect = "Allow"

    actions = [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "autoscaling:DescribeLaunchConfigurations",
      "autoscaling:DescribeTags",
      "ec2:DescribeLaunchTemplateVersions",
    ]

    resources = ["*"]
  }

  statement {
    sid    = "eksWorkerAutoscalingOwn"
    effect = "Allow"

    actions = [
      "autoscaling:SetDesiredCapacity",
      "autoscaling:TerminateInstanceInAutoScalingGroup",
      "autoscaling:UpdateAutoScalingGroup",
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/kubernetes.io/cluster/${module.my_cluster.cluster_id}"
      values   = ["owned"]
    }

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/enabled"
      values   = ["true"]
    }
  }
}
```
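
This policy could equally be passed to the module via the `workers_additional_policies` input instead of using a separate `aws_iam_role_policy_attachment`; in that case the policy must not be derived from module outputs such as `cluster_id`, or Terraform will report a dependency cycle.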

And here are example values for the [helm chart](https://github.com/helm/charts/tree/master/stable/cluster-autoscaler):

```yaml
rbac:
```

@@ -26,10 +86,6 @@ To install the chart, simply run helm with the `--values` option:

```
helm install stable/cluster-autoscaler --values=path/to/your/values-file.yaml
```

`NOTE`
## Notes

There is a variable `asg_desired_capacity` in the `local.tf` file. It can be used to set the desired worker capacity of the autoscaling group, but it is ignored by Terraform to reduce the [complexities](https://github.com/terraform-aws-modules/terraform-aws-eks/issues/510#issuecomment-531700442) involved; scaling the cluster nodes up and down is left to the cluster-autoscaler.
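
The mechanism, roughly, is a `lifecycle` block on the autoscaling group (a sketch of the pattern, not the module's exact resource; the surrounding arguments are elided):

```hcl
resource "aws_autoscaling_group" "workers" {
  # ... name, min_size, max_size, launch configuration, subnets, etc. ...

  # Only honoured when the group is first created.
  desired_capacity = var.asg_desired_capacity

  lifecycle {
    # Later changes to desired_capacity are not reconciled, so the
    # cluster-autoscaler can scale the group without `terraform apply`
    # reverting it. This cannot be made conditional because Terraform
    # does not allow variables inside lifecycle blocks.
    ignore_changes = [desired_capacity]
  }
}
```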

## See More

[Using AutoScalingGroup MixedInstancesPolicy](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md#using-autoscalinggroup-mixedinstancespolicy)
2 changes: 1 addition & 1 deletion docs/faq.md
@@ -56,7 +56,7 @@ The safest and easiest option is to set `asg_min_size` and `asg_max_size` to 0 o

The module is configured to ignore this value. Unfortunately Terraform does not support variables within the `lifecycle` block.

The setting is ignored to allow the cluster autoscaler to work correctly and so that terraform applys do not accidentally remove running workers.
The setting is ignored to allow the cluster autoscaler to work correctly and so that terraform apply does not accidentally remove running workers.

You can change the desired count via the CLI or console if you're not using the cluster autoscaler.

4 changes: 0 additions & 4 deletions docs/spot-instances.md
@@ -32,7 +32,6 @@ Example worker group configuration that uses an ASG with launch configuration fo
      name                = "on-demand-1"
      instance_type       = "m4.xlarge"
      asg_max_size        = 1
      autoscaling_enabled = true
      kubelet_extra_args  = "--node-labels=kubernetes.io/lifecycle=normal"
      suspended_processes = ["AZRebalance"]
    },
@@ -41,7 +40,6 @@ Example worker group configuration that uses an ASG with launch configuration fo
      spot_price          = "0.199"
      instance_type       = "c4.xlarge"
      asg_max_size        = 20
      autoscaling_enabled = true
      kubelet_extra_args  = "--node-labels=kubernetes.io/lifecycle=spot"
      suspended_processes = ["AZRebalance"]
    },
@@ -50,7 +48,6 @@ Example worker group configuration that uses an ASG with launch configuration fo
      spot_price          = "0.20"
      instance_type       = "m4.xlarge"
      asg_max_size        = 20
      autoscaling_enabled = true
      kubelet_extra_args  = "--node-labels=kubernetes.io/lifecycle=spot"
      suspended_processes = ["AZRebalance"]
    }
@@ -67,7 +64,6 @@ Launch Template support is a recent addition to both AWS and this module. It mig
      name                = "on-demand-1"
      instance_type       = "m4.xlarge"
      asg_max_size        = 10
      autoscaling_enabled = true
      kubelet_extra_args  = "--node-labels=spot=false"
      suspended_processes = ["AZRebalance"]
    }
5 changes: 5 additions & 0 deletions examples/irsa/main.tf
@@ -75,6 +75,11 @@ module "eks" {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    }
1 change: 0 additions & 1 deletion local.tf
@@ -49,7 +49,6 @@ locals {
    public_ip                     = false       # Associate a public ip address with a worker
    kubelet_extra_args            = ""          # This string is passed directly to kubelet if set. Useful for adding labels or taints.
    subnets                       = var.subnets # A list of subnets to place the worker nodes in. i.e. ["subnet-123", "subnet-456", "subnet-789"]
    autoscaling_enabled           = false       # Sets whether policy and matching tags will be added to allow autoscaling.
    additional_security_group_ids = []          # A list of additional security group ids to include in worker launch config
    protect_from_scale_in         = false       # Prevent AWS from scaling in, so that cluster-autoscaler is solely responsible.
    iam_instance_profile_name     = ""          # A custom IAM instance profile name. Used when manage_worker_iam_resources is set to false. Incompatible with iam_role_id.
9 changes: 4 additions & 5 deletions node_groups.tf
@@ -9,11 +9,10 @@ data "null_data_source" "node_groups" {
    # Ensure these resources are created before "unlocking" the data source.
    # `depends_on` causes a refresh on every run so is useless here.
    # [Re]creating or removing these resources will trigger recreation of Node Group resources
    aws_auth         = coalescelist(kubernetes_config_map.aws_auth[*].id, [""])[0]
    role_NodePolicy  = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[*].id, [""])[0]
    role_CNI_Policy  = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[*].id, [""])[0]
    role_Container   = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[*].id, [""])[0]
    role_autoscaling = coalescelist(aws_iam_role_policy_attachment.workers_autoscaling[*].id, [""])[0]
    aws_auth        = coalescelist(kubernetes_config_map.aws_auth[*].id, [""])[0]
    role_NodePolicy = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[*].id, [""])[0]
    role_CNI_Policy = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[*].id, [""])[0]
    role_Container  = coalescelist(aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[*].id, [""])[0]
  }
}

10 changes: 0 additions & 10 deletions outputs.tf
@@ -153,16 +153,6 @@ output "worker_iam_role_arn" {
  )[0]
}

output "worker_autoscaling_policy_name" {
  description = "Name of the worker autoscaling IAM policy if `manage_worker_autoscaling_policy = true`"
  value       = concat(aws_iam_policy.worker_autoscaling[*].name, [""])[0]
}

output "worker_autoscaling_policy_arn" {
  description = "ARN of the worker autoscaling IAM policy if `manage_worker_autoscaling_policy = true`"
  value       = concat(aws_iam_policy.worker_autoscaling[*].arn, [""])[0]
}

output "node_groups" {
  description = "Outputs from EKS node groups. Map of maps, keyed by var.node_groups keys"
  value       = module.node_groups.node_groups
12 changes: 0 additions & 12 deletions variables.tf
@@ -264,18 +264,6 @@ variable "workers_role_name" {
  default     = ""
}

variable "manage_worker_autoscaling_policy" {
  description = "Whether to let the module manage the cluster autoscaling iam policy."
  type        = bool
  default     = true
}

variable "attach_worker_autoscaling_policy" {
  description = "Whether to attach the module managed cluster autoscaling iam policy to the default worker IAM role. This requires `manage_worker_autoscaling_policy = true`"
  type        = bool
  default     = true
}

variable "attach_worker_cni_policy" {
  description = "Whether to attach the Amazon managed `AmazonEKS_CNI_Policy` IAM policy to the default worker IAM role. WARNING: If set `false` the permissions must be assigned to the `aws-node` DaemonSet pods via another method or nodes will not be able to join the cluster."
  type        = bool