This project aims to create an Amazon Elastic Kubernetes Service (EKS) cluster in AWS using Infrastructure as Code with Terraform. It contains two modules: one for the main network resources and one for the EKS cluster and its dependencies, such as IAM roles, security groups, and NACLs. The setup follows the AWS Well-Architected Framework to ensure that the infrastructure is secure, reliable, efficient, and cost-effective.
The AWS Well-Architected Framework is a set of best practices and guidelines for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud. It provides a consistent approach for customers and partners to evaluate architectures, and provides guidance to help implement designs that will scale with your application needs over time.
To use this project, you will need to have the following prerequisites:
- Terraform v1.6.2 or higher
- AWS CLI v2
- Trivy v0.46 or higher
- Optionally install Trivy VS Code Plugin
- S3 bucket for Terraform state file
- DynamoDB table for Terraform state file locking
Read the following section about Terraform State File Locking with S3 and DynamoDB to learn how to create them.
Before you can deploy the services, you will need to authenticate with AWS. The easiest way is to configure your AWS credentials using the AWS CLI with aws configure. For more information, see the AWS CLI documentation.
Further methods of authentication can be found at Authentication and Configuration. For example, you can provide environment variables or a shared credentials file path when running automated pipelines.
Furthermore, you can uncomment the aws provider section in the backend.tf file and provide values for the variables either directly or as environment variables. Notice that, in this case, you'll need to generate a session token if you are using MFA.
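As a rough illustration, the commented-out provider section might look like the sketch below. The variable names are assumptions for illustration only; check backend.tf for the names it actually declares.

# Hypothetical sketch of the provider block to uncomment in backend.tf.
# The variable names below are assumptions, not the file's actual contents.
provider "aws" {
  region     = var.aws_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  token      = var.aws_session_token # session token required when using MFA
}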
To prevent conflicts when working in a team, we are using the Terraform S3 backend to store the state file remotely in an Amazon S3 bucket with versioning enabled. Read about Terraform state file at the state file documentation.
We are also using a DynamoDB table for locking to ensure that only one person can make changes to a resource at a time, preventing conflicts and ensuring that changes are applied in the correct order.
To create them, run the create-backend.sh script in the scripts folder. You will need to change the values of the variables in the script to match your environment, or declare them as environment variables.
Read more about state file locking with S3 at the S3 backend documentation.
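For reference, a typical S3 backend block with DynamoDB locking looks roughly like the sketch below; the bucket, key, region, and table names are placeholders that must match what the script created.

# Sketch of an S3 backend with DynamoDB state locking; all values are placeholders.
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "eks/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "my-terraform-lock-table"
    encrypt        = true
  }
}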
This project uses external modules from my collection of modules in the terraform-aws repository by declaring the repository as a Terraform Module Source and specifying the module subdirectory path and version inside of the repository, as follows:
module "vpc" {
source = "github.com/guirgouveia/terraform-modules//vpc?ref=v1.0.0"
}
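In this source address, the double slash separates the repository from the module subdirectory, and the ?ref query string pins the module to a specific tag so upgrades are explicit.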
The cluster spans two Availability Zones, with one private subnet in each. The EKS cluster is deployed in these private subnets and its endpoint is only exposed internally.
Since the cluster is not exposed to the internet, a NAT Gateway allows it to reach the internet for updates and other dependencies.
This project includes default values for all variables, so no variable input is necessary to follow this exact example.
However, you can change the values of the variables to customize resource names, instance types, etc. using a terraform.tfvars file, environment variables, or the CLI. Read more about variables here.
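For example, a terraform.tfvars file overriding a few defaults might look like the sketch below; the variable names are illustrative assumptions and may differ from the ones this project declares.

# Hypothetical terraform.tfvars; variable names are assumptions, check the
# project's variables.tf for the real ones.
cluster_name       = "demo-eks-cluster"
node_instance_type = "t3.medium"
region             = "us-east-1"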
To get started, having all the prerequisites set, run the following commands:
terraform init
terraform plan
terraform apply
To destroy the resources, run the following command:
terraform destroy
We're utilizing public and private subnets to ensure that our EKS cluster is not exposed to the internet. The EKS cluster is deployed in the private subnets and the endpoint is only exposed internally, while the public subnets are used for the NAT Gateway and the Internet Gateway.
In our AWS setup, Network Access Control Lists (NACLs) are configured to bolster security across all subnets in every Availability Zone (AZ), including private subnets, while Security Groups control the traffic to the EKS cluster.
See the VPC module documentation and the Network section of the EKS module for more information.
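To illustrate the split between the two layers, stateless NACL rules and stateful security group rules are declared roughly as in the sketch below; the resource references, CIDRs, and ports are placeholders, not the modules' actual rules.

# Illustrative only: a subnet-level NACL rule (stateless) next to a
# cluster security group rule (stateful). Names and CIDRs are placeholders.
resource "aws_network_acl_rule" "private_inbound_https" {
  network_acl_id = aws_network_acl.private.id
  rule_number    = 100
  egress         = false
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "10.0.0.0/16"
  from_port      = 443
  to_port        = 443
}

resource "aws_security_group_rule" "cluster_inbound_https" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["10.0.0.0/16"]
  security_group_id = aws_security_group.eks_cluster.id
}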
The EKS cluster is deployed in the private subnets, so its endpoint is only exposed internally; outbound access for updates and other dependencies goes through the NAT Gateway rather than a public endpoint.
See the EKS module documentation for more information.
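A private-only cluster endpoint typically comes down to the vpc_config flags shown in the sketch below; the names and references are placeholders rather than the module's actual resources.

# Sketch of a private-only EKS API endpoint. Subnet, role, and name
# references are placeholders for the module's actual resources.
resource "aws_eks_cluster" "this" {
  name     = "private-eks"
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    subnet_ids              = aws_subnet.private[*].id
    endpoint_private_access = true  # reachable from inside the VPC
    endpoint_public_access  = false # no internet-facing endpoint
  }
}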
This project implements some DevSecOps best practices, such as static code analysis with Trivy (the successor to Aqua Security's tfsec) to find vulnerabilities and misconfigurations in the Terraform code.
Aqua Security has an interesting blog post about the nuances between DevOps and DevSecOps, and how to implement DevSecOps in your organization.
For more information about Trivy, check the Trivy documentation I created for this project, which summarizes the tool and shows some use cases and options.
Read the automatically generated documentation for input and output descriptions and usage.
This documentation is generated by the CI pipeline with terraform-docs on every PR merge to the main branch.
- Keep the Terraform code DRY with Terragrunt.
- Use an external Git repository for the Terraform modules.
- Generate documentation with terraform-docs.
- Create IaC tests with native Terraform test framework or Terratest.
- Configure RBAC for the EKS cluster.
- Configure Fargate for the EKS cluster.
- Configure AWS Load Balancer Controller for the EKS cluster.
- Deploy and expose a sample application to the EKS cluster using Terraform's official Kubernetes Provider.
If you wish to further explore the AWS Well-Architected Framework, you can try the following:
- Deploy a monitoring solution to the EKS cluster using Prometheus and Grafana.
- Use Istio to manage the service mesh.
- Alternatively, deploy a custom CNI plugin to the EKS cluster that already includes service mesh, such as Cilium.
- Use FluxCD to deploy and manage the Kubernetes applications.
- Use Flagger to automate the canary deployments.
- Configure Managed Node Groups for the EKS cluster.
- Configure EKS Anywhere for the EKS cluster.
- Configure EKS Distro for the EKS cluster.