Deploying a cluster with Talos and Terraform backed by ArgoCD and SOPS.
For provisioning the following tools are used:
- Talos - provisions all nodes within the cluster with a uniform system and configuration, managed GitOps-style
- Terraform - configures an already existing Cloudflare domain and its DNS settings
- cert-manager - SSL certificates - with Cloudflare DNS challenge
- flannel - CNI (container network interface)
- ArgoCD - GitOps tool for deploying manifests from the cluster directory
- rook.io - Ceph storage for k8s
- nfs - used for cold storage on QNAP
- metallb - bare metal load balancer
- traefik - ingress controller
- Nodes running Talos. These nodes are bare metal.
- A Cloudflare account with a domain, this will be managed by Terraform.
- QNAP used as NFS and S3 storage.
For fast setup I use a devcontainer to have the same environment across different devices. See more inside .devcontainer and at Devcontainers.
1. Install the most recent versions of the following command-line tools on your workstation. If you are using Homebrew on macOS or Linux, skip to steps 3 and 4.
2. This guide heavily relies on go-task as a framework for setting things up. It is advised to learn and understand the commands it is running under the hood.
3. Install go-task via Brew
   brew install go-task/tap/go-task
4. Install workstation dependencies via Brew
   task init
It is advisable to install pre-commit and the pre-commit hooks that come with this repository. sops-pre-commit will check that you are not accidentally committing unencrypted secrets.
- Enable Pre-Commit
  task precommit:init
- Update Pre-Commit, though it will occasionally make mistakes, so verify its results.
  task precommit:update
The Git repository contains the following directories under cluster. They are ordered below by how Argo CD will apply them.
📁 cluster
├──📁 projects - main folder for ArgoCD to sync deployed apps
├──📁 apps - folder for app manifests
├──📁 core - folder for the core apps of the cluster
│   └──📁 argocd
│       └──📁 projects - folder containing manifests to initialize app-of-apps for ArgoCD
└──📁 system - apps counted as extensions of the cluster (certs, ingress, gpu, etc.)
I assume you have already generated an age key pair; otherwise, you need to generate one.
Export the SOPS_AGE_KEY_FILE variable in your bashrc, zshrc or config.fish and source it, e.g.
export SOPS_AGE_KEY_FILE=~/.config/sops/age/keys.txt
source ~/.bashrc
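If no key pair exists yet, the following is a minimal sketch of how one could be created with age-keygen. The function name ensure_age_key is my own, and the fallback path simply mirrors the export above; adjust both to your setup.

```shell
# Sketch: create an age key pair only when the key file is missing.
# ensure_age_key is a hypothetical helper, not part of this repository.
ensure_age_key() {
  local key_file="${SOPS_AGE_KEY_FILE:-$HOME/.config/sops/age/keys.txt}"
  if [ -f "$key_file" ]; then
    echo "age key already present at $key_file"
    return 0
  fi
  mkdir -p "$(dirname "$key_file")"
  # age-keygen writes the private key to the file given with -o
  age-keygen -o "$key_file"
  # keep the private key readable by the owner only
  chmod 600 "$key_file"
}
```

Running ensure_age_key a second time leaves an existing key untouched, so it should be safe to call from a setup script.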
In order to use Terraform and cert-manager with the Cloudflare DNS challenge you will need to create an API Token.
- Head over to Cloudflare to create an API Token.
- Under the API Tokens section, create a scoped API Token.
- Use the API Token in provision/terraform/cloudflare and cluster/system/cert-manager.
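Before wiring the token into Terraform and cert-manager, it can help to check that Cloudflare accepts it. The sketch below calls Cloudflare's token verification endpoint; the function name and the CLOUDFLARE_API_TOKEN variable are assumptions of mine, not names used by this repository.

```shell
# Sketch: ask the Cloudflare API whether a token is valid.
# verify_cloudflare_token and CLOUDFLARE_API_TOKEN are hypothetical names.
verify_cloudflare_token() {
  local token="${1:-$CLOUDFLARE_API_TOKEN}"
  # The pipeline succeeds only when the JSON response reports success
  curl -fsS "https://api.cloudflare.com/client/v4/user/tokens/verify" \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/json" \
    | grep -q '"success": *true'
}
```

For example, verify_cloudflare_token "$CLOUDFLARE_API_TOKEN" && echo "token ok" prints the message only for a token Cloudflare recognizes.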
- Get an ISO image of the installer from the latest release
- Configure nodes inside provision/talos/talconfig.yaml
- Run task talos:init to generate Talos configs for each node
- Follow the Getting Started guide for details on Talos installation
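As a rough illustration of what the Getting Started guide leads to, a generated config is typically pushed to a node booted from the ISO with talosctl apply-config. This is only a sketch: the function name is mine, and both arguments are placeholders for the node IP and the config file that task talos:init produced for that node.

```shell
# Sketch: push a generated machine config to a node booted from the ISO.
# apply_talos_config is a hypothetical helper; both arguments are placeholders.
apply_talos_config() {
  local node_ip="$1" config_file="$2"
  # --insecure is used on first contact, before the node has its certificates
  talosctl apply-config --insecure --nodes "$node_ip" --file "$config_file"
}
```

Usage would look like apply_talos_config <node-ip> <path-to-node-config>, repeated once per node.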
📍 Review the Terraform scripts under ./provision/terraform/cloudflare/ and make sure you understand what they are doing (no, really, review them).
If your domain already has existing DNS records, be sure to export those DNS settings before you continue.
Ideally, update the Terraform scripts to manage all DNS records, if you choose to.
- Pull in the Terraform deps by running
  task terraform:init:cloudflare
- Review the changes Terraform will make to your Cloudflare domain by running
  task terraform:plan:cloudflare
- Finally, have Terraform apply the changes by running
  task terraform:apply:cloudflare
If Terraform ran successfully, you can log into Cloudflare and validate that the DNS records are present.
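The records can also be checked from the command line. A minimal sketch using dig follows; the function name is my own, and querying 1.1.1.1 (Cloudflare's public resolver) is just one reasonable choice.

```shell
# Sketch: succeed when a DNS record resolves to at least one answer.
# dns_record_exists is a hypothetical helper, not part of this repository.
dns_record_exists() {
  local name="$1" type="${2:-A}"
  # @1.1.1.1 sends the query to Cloudflare's public resolver
  [ -n "$(dig +short "$name" "$type" @1.1.1.1)" ]
}
```

For example, dns_record_exists "hajimari.CLOUDFLARE_DOMAIN" && echo "record found", with CLOUDFLARE_DOMAIN replaced by your own domain. New records can take a short while to become visible.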
📍 Here we will be installing ArgoCD after some quick bootstrap steps.
- Verify ArgoCD can be installed
  argocd version
  # argocd: vX.X.X
  # ...
- Pre-create the argocd namespace
  kubectl create namespace argocd --dry-run=client -o yaml | kubectl apply -f -
- Add the Age key in order for ArgoCD to decrypt SOPS secrets
  cat "$SOPS_AGE_KEY_FILE" | kubectl -n argocd create secret generic sops-age \
    --from-file=age.agekey=/dev/stdin
- Verify all files ending with *.sops.yaml or *.sec.yaml are encrypted with SOPS
- Push your changes to git
  git add -A
  git commit -m "encrypting secrets"
  git push
- Install Argo CD
  kubectl apply -k ./cluster/core/argocd/base
- Verify Argo CD components are running in the cluster
  kubectl get pods -n argocd
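Two of the verification steps above can be scripted. The sketch below is one possible approach, not the repository's own tooling: the function names are mine, the encryption check relies on the heuristic that SOPS-encrypted YAML carries a top-level sops: metadata key, and the 300s default timeout is an arbitrary choice.

```shell
# Sketch: list secret files that do not look SOPS-encrypted.
# Heuristic: SOPS-encrypted YAML contains a top-level "sops:" key.
find_unencrypted_secrets() {
  local root="${1:-.}"
  # grep -L prints only the files that have no matching line
  find "$root" -type f \( -name '*.sops.yaml' -o -name '*.sec.yaml' \) \
    -exec grep -L '^sops:' {} +
}

# Sketch: block until every pod in the argocd namespace reports Ready.
wait_for_argocd() {
  kubectl wait --for=condition=Ready pods --all -n argocd --timeout="${1:-300s}"
}
```

Running find_unencrypted_secrets ./cluster before git push should print nothing when every secret file is encrypted. wait_for_argocd can replace repeatedly checking kubectl get pods by hand.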
If all goes well and you have port forwarded 80 and 443 in your router to the METALLB_TRAEFIK_ADDR IP, head over to your browser in a few moments and you should be able to access https://hajimari.CLOUDFLARE_DOMAIN
🎉 Congratulations, you have a Kubernetes cluster managed by Argo CD, and your Git repository is driving the state of your cluster.
This section will be about upgrading k8s and other components on your cluster using Talos.