feat: Bottlerocket Update Operator Addon #349

Merged
merged 17 commits into from
Feb 8, 2024
Changes from 13 commits
6 changes: 6 additions & 0 deletions README.md
@@ -90,6 +90,8 @@ module "eks_blueprints_addons" {
| <a name="module_aws_node_termination_handler"></a> [aws\_node\_termination\_handler](#module\_aws\_node\_termination\_handler) | aws-ia/eks-blueprints-addon/aws | 1.1.1 |
| <a name="module_aws_node_termination_handler_sqs"></a> [aws\_node\_termination\_handler\_sqs](#module\_aws\_node\_termination\_handler\_sqs) | terraform-aws-modules/sqs/aws | 4.0.1 |
| <a name="module_aws_privateca_issuer"></a> [aws\_privateca\_issuer](#module\_aws\_privateca\_issuer) | aws-ia/eks-blueprints-addon/aws | 1.1.1 |
| <a name="module_bottlerocket_update_operator"></a> [bottlerocket\_update\_operator](#module\_bottlerocket\_update\_operator) | aws-ia/eks-blueprints-addon/aws | ~> 1.1.1 |
| <a name="module_bottlerocket_update_operator_crds"></a> [bottlerocket\_update\_operator\_crds](#module\_bottlerocket\_update\_operator\_crds) | aws-ia/eks-blueprints-addon/aws | ~> 1.1.1 |
| <a name="module_cert_manager"></a> [cert\_manager](#module\_cert\_manager) | aws-ia/eks-blueprints-addon/aws | 1.1.1 |
| <a name="module_cluster_autoscaler"></a> [cluster\_autoscaler](#module\_cluster\_autoscaler) | aws-ia/eks-blueprints-addon/aws | 1.1.1 |
| <a name="module_cluster_proportional_autoscaler"></a> [cluster\_proportional\_autoscaler](#module\_cluster\_proportional\_autoscaler) | aws-ia/eks-blueprints-addon/aws | 1.1.1 |
@@ -168,6 +170,8 @@ module "eks_blueprints_addons" {
| <a name="input_aws_node_termination_handler_asg_arns"></a> [aws\_node\_termination\_handler\_asg\_arns](#input\_aws\_node\_termination\_handler\_asg\_arns) | List of Auto Scaling group ARNs that AWS Node Termination Handler will monitor for EC2 events | `list(string)` | `[]` | no |
| <a name="input_aws_node_termination_handler_sqs"></a> [aws\_node\_termination\_handler\_sqs](#input\_aws\_node\_termination\_handler\_sqs) | AWS Node Termination Handler SQS queue configuration values | `any` | `{}` | no |
| <a name="input_aws_privateca_issuer"></a> [aws\_privateca\_issuer](#input\_aws\_privateca\_issuer) | AWS PCA Issuer add-on configurations | `any` | `{}` | no |
| <a name="input_bottlerocket_update_operator"></a> [bottlerocket\_update\_operator](#input\_bottlerocket\_update\_operator) | Bottlerocket Update Operator add-on configuration values | `any` | `{}` | no |
| <a name="input_bottlerocket_update_operator_crds"></a> [bottlerocket\_update\_operator\_crds](#input\_bottlerocket\_update\_operator\_crds) | Bottlerocket Update Operator CRDs configuration values | `any` | `{}` | no |
| <a name="input_cert_manager"></a> [cert\_manager](#input\_cert\_manager) | cert-manager add-on configuration values | `any` | `{}` | no |
| <a name="input_cert_manager_route53_hosted_zone_arns"></a> [cert\_manager\_route53\_hosted\_zone\_arns](#input\_cert\_manager\_route53\_hosted\_zone\_arns) | List of Route53 Hosted Zone ARNs that are used by cert-manager to create DNS records | `list(string)` | <pre>[<br> "arn:aws:route53:::hostedzone/*"<br>]</pre> | no |
| <a name="input_cluster_autoscaler"></a> [cluster\_autoscaler](#input\_cluster\_autoscaler) | Cluster Autoscaler add-on configuration values | `any` | `{}` | no |
@@ -192,6 +196,7 @@ module "eks_blueprints_addons" {
| <a name="input_enable_aws_load_balancer_controller"></a> [enable\_aws\_load\_balancer\_controller](#input\_enable\_aws\_load\_balancer\_controller) | Enable AWS Load Balancer Controller add-on | `bool` | `false` | no |
| <a name="input_enable_aws_node_termination_handler"></a> [enable\_aws\_node\_termination\_handler](#input\_enable\_aws\_node\_termination\_handler) | Enable AWS Node Termination Handler add-on | `bool` | `false` | no |
| <a name="input_enable_aws_privateca_issuer"></a> [enable\_aws\_privateca\_issuer](#input\_enable\_aws\_privateca\_issuer) | Enable AWS PCA Issuer | `bool` | `false` | no |
| <a name="input_enable_bottlerocket_update_operator"></a> [enable\_bottlerocket\_update\_operator](#input\_enable\_bottlerocket\_update\_operator) | Enable Bottlerocket Update Operator add-on | `bool` | `false` | no |
| <a name="input_enable_cert_manager"></a> [enable\_cert\_manager](#input\_enable\_cert\_manager) | Enable cert-manager add-on | `bool` | `false` | no |
| <a name="input_enable_cluster_autoscaler"></a> [enable\_cluster\_autoscaler](#input\_enable\_cluster\_autoscaler) | Enable Cluster autoscaler add-on | `bool` | `false` | no |
| <a name="input_enable_cluster_proportional_autoscaler"></a> [enable\_cluster\_proportional\_autoscaler](#input\_enable\_cluster\_proportional\_autoscaler) | Enable Cluster Proportional Autoscaler | `bool` | `false` | no |
@@ -248,6 +253,7 @@ module "eks_blueprints_addons" {
| <a name="output_aws_load_balancer_controller"></a> [aws\_load\_balancer\_controller](#output\_aws\_load\_balancer\_controller) | Map of attributes of the Helm release and IRSA created |
| <a name="output_aws_node_termination_handler"></a> [aws\_node\_termination\_handler](#output\_aws\_node\_termination\_handler) | Map of attributes of the Helm release and IRSA created |
| <a name="output_aws_privateca_issuer"></a> [aws\_privateca\_issuer](#output\_aws\_privateca\_issuer) | Map of attributes of the Helm release and IRSA created |
| <a name="output_bottlerocket_update_operator"></a> [bottlerocket\_update\_operator](#output\_bottlerocket\_update\_operator) | Map of attributes of the Helm release and IRSA created |
| <a name="output_cert_manager"></a> [cert\_manager](#output\_cert\_manager) | Map of attributes of the Helm release and IRSA created |
| <a name="output_cluster_autoscaler"></a> [cluster\_autoscaler](#output\_cluster\_autoscaler) | Map of attributes of the Helm release and IRSA created |
| <a name="output_cluster_proportional_autoscaler"></a> [cluster\_proportional\_autoscaler](#output\_cluster\_proportional\_autoscaler) | Map of attributes of the Helm release and IRSA created |
202 changes: 202 additions & 0 deletions docs/addons/bottlerocket.md
@@ -0,0 +1,202 @@
# Bottlerocket and Bottlerocket Update Operator

[Bottlerocket](https://aws.amazon.com/bottlerocket/) is a Linux-based open-source operating system that focuses on security and maintainability, providing a reliable, consistent, and safe platform for container-based workloads.

The [Bottlerocket Update Operator (BRUPOP)](https://github.com/bottlerocket-os/bottlerocket-update-operator/tree/develop) is a Kubernetes operator that coordinates Bottlerocket updates on hosts in a cluster. It consists of three components: a controller deployment that runs on one node and orchestrates updates across the cluster; an agent daemon set that runs on every Bottlerocket node and periodically queries for and performs updates, which are rolled out in waves to reduce the impact of issues; and an API server that performs additional authorization.

[Cert-manager](https://cert-manager.io/) is required for the API server to use a CA certificate when communicating over SSL with the agents.

- [Helm charts](https://github.com/bottlerocket-os/bottlerocket-update-operator/tree/develop/deploy/charts)

## Requirements

BRUPOP performs updates only on Nodes running Bottlerocket OS. The following snippets show how to set up Bottlerocket OS Nodes using Managed Node Groups with the [Terraform Amazon EKS module](https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest) and [Karpenter Node Classes](https://karpenter.sh/docs/concepts/nodeclasses/).

Notice the label `bottlerocket.aws/updater-interface-version=2.0.0` set in the `[settings.kubernetes.node-labels]` section. This label is required for the BRUPOP Agent to query and perform updates. Nodes not labeled will not be checked by the agent.

### Managed Node Groups

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.21"
  ...
  eks_managed_node_groups = {
    bottlerocket = {
      platform       = "bottlerocket"
      ami_type       = "BOTTLEROCKET_x86_64"
      instance_types = ["m5.large", "m5a.large"]

      iam_role_attach_cni_policy = true

      min_size     = 1
      max_size     = 5
      desired_size = 3

      enable_bootstrap_user_data = true
      bootstrap_extra_args       = <<-EOT
        [settings.host-containers.admin]
        enabled = false
        [settings.host-containers.control]
        enabled = true
        [settings.kernel]
        lockdown = "integrity"
        [settings.kubernetes.node-labels]
        "bottlerocket.aws/updater-interface-version" = "2.0.0"
        [settings.kubernetes.node-taints]
        "CriticalAddonsOnly" = "true:NoSchedule"
      EOT
    }
  }
}
```

### Karpenter

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: bottlerocket-example
spec:
  ...
  amiFamily: Bottlerocket
  userData: |
    [settings.kubernetes]
    "kube-api-qps" = 30
    "shutdown-grace-period" = "30s"
    "shutdown-grace-period-for-critical-pods" = "30s"
    [settings.kubernetes.eviction-hard]
    "memory.available" = "20%"
    [settings.kubernetes.node-labels]
    "bottlerocket.aws/updater-interface-version" = "2.0.0"
```

## Usage

[BRUPOP](https://github.com/aws-ia/terraform-aws-eks-blueprints-addons/) can be deployed with the default configuration by enabling the add-on as shown below. Note that `wait = true` is set for Cert-Manager: BRUPOP requires the Cert-Manager CRDs to be present in the cluster before it can be deployed.

```hcl
module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.13"

  cluster_name      = module.eks.cluster_name
  cluster_endpoint  = module.eks.cluster_endpoint
  cluster_version   = module.eks.cluster_version
  oidc_provider_arn = module.eks.oidc_provider_arn

  enable_cert_manager = true
  cert_manager = {
    wait = true
  }
  enable_bottlerocket_update_operator = true
}
```

You can also customize the Helm charts that deploy `bottlerocket_update_operator` and `bottlerocket_update_operator_crds` via the following configuration:

```hcl
  enable_bottlerocket_update_operator = true

  bottlerocket_update_operator = {
    name          = "brupop"
    description   = "A Helm chart for BRUPOP"
    chart_version = "1.3.0"
    namespace     = "brupop"
    set = [{
      name = "scheduler_cron_expression"
      # Seven-field cron expression (seconds first, year last), not classic five-field Unix cron.
      # "0 * * * * * *" checks for updates every minute.
      # Example: "0 0 23 * * Sat *" checks every Saturday at 23:00 (11 PM).
      value = "0 * * * * * *"
    }]
  }

  bottlerocket_update_operator_crds = {
    name          = "brupop-crds"
    description   = "A Helm chart for BRUPOP CRDs"
    chart_version = "1.0.0"
  }
```
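
The `scheduler_cron_expression` values above have seven fields rather than the five of classic Unix cron. Judging from the example `0 0 23 * * Sat *` (every Saturday at 23:00), the field order appears to be seconds, minutes, hours, day of month, month, day of week, year. Treat that order as an assumption and confirm it against the BRUPOP documentation. As a minimal sketch, the hypothetical helper `describe_cron` below (not part of BRUPOP or this module) labels each field so an expression can be sanity-checked before it goes into the Helm values:

```bash
# Hypothetical helper: label the fields of a seven-field cron expression.
# Assumed field order: seconds minutes hours day-of-month month day-of-week year.
describe_cron() {
  # Use a subshell so `set -f` and the positional parameters stay local.
  (
    set -f      # keep "*" from being expanded to file names
    set -- $1   # split the expression on whitespace
    if [ "$#" -ne 7 ]; then
      echo "expected 7 fields, got $#" >&2
      exit 1
    fi
    printf 'seconds=%s\nminutes=%s\nhours=%s\nday-of-month=%s\nmonth=%s\nday-of-week=%s\nyear=%s\n' "$@"
  )
}

# Label the fields of the Saturday-night example from the configuration above.
describe_cron "0 0 23 * * Sat *"
```

The helper only checks the field count and does no cron validation. An expression with five fields (classic Unix cron) is rejected with a non-zero exit status, which is the mistake this sketch is meant to catch.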

To see a complete working example, see the [`bottlerocket`](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/patterns/bottlerocket) Blueprints Pattern.

## Validate

1. Run `update-kubeconfig` command:

```bash
aws eks --region <REGION> update-kubeconfig --name <CLUSTER_NAME>
```

2. Test by listing the BRUPOP resources provisioned:

```bash
$ kubectl -n brupop-bottlerocket-aws get all

NAME READY STATUS RESTARTS AGE
pod/brupop-agent-5nv6m 1/1 Running 1 (33h ago) 33h
pod/brupop-agent-h4vw9 1/1 Running 1 (33h ago) 33h
pod/brupop-agent-sr9ms 1/1 Running 2 (33h ago) 33h
pod/brupop-apiserver-6ccb74f599-4c9lv 1/1 Running 0 33h
pod/brupop-apiserver-6ccb74f599-h6hg8 1/1 Running 0 33h
pod/brupop-apiserver-6ccb74f599-svw8n 1/1 Running 0 33h
pod/brupop-controller-deployment-58d46595cc-7vxnt 1/1 Running 0 33h

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/brupop-apiserver ClusterIP 172.20.153.72 <none> 443/TCP 33h
service/brupop-controller-server ClusterIP 172.20.7.127 <none> 80/TCP 33h

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/brupop-agent 3 3 3 3 3 <none> 33h

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/brupop-apiserver 3/3 3 3 33h
deployment.apps/brupop-controller-deployment 1/1 1 1 33h

NAME DESIRED CURRENT READY AGE
replicaset.apps/brupop-apiserver-6ccb74f599 3 3 3 33h
replicaset.apps/brupop-controller-deployment-58d46595cc 1 1 1 33h

$ kubectl describe apiservices.apiregistration.k8s.io v2.brupop.bottlerocket.aws
Name: v2.brupop.bottlerocket.aws
Namespace:
Labels: kube-aggregator.kubernetes.io/automanaged=true
Annotations: <none>
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2024-01-30T16:27:15Z
Resource Version: 8798
UID: 034abe22-7e5f-4040-9b64-8ca9d55a4af6
Spec:
Group: brupop.bottlerocket.aws
Group Priority Minimum: 1000
Version: v2
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2024-01-30T16:27:15Z
Message: Local APIServices are always available
Reason: Local
Status: True
Type: Available
Events: <none>
```

3. If the label was not set during deployment, add the required label `bottlerocket.aws/updater-interface-version=2.0.0`, as shown below, to every Node whose updates you want BRUPOP to handle.

```bash
$ kubectl label node ip-10-0-34-87.us-west-2.compute.internal bottlerocket.aws/updater-interface-version=2.0.0
node/ip-10-0-34-87.us-west-2.compute.internal labeled

$ kubectl get nodes -L bottlerocket.aws/updater-interface-version
NAME STATUS ROLES AGE VERSION UPDATER-INTERFACE-VERSION
ip-10-0-34-87.us-west-2.compute.internal Ready <none> 34h v1.28.1-eks-d91a302 2.0.0
```

4. Because the default cron schedule for BRUPOP checks for updates every minute, within a few minutes you should see that the Node's version was updated automatically with no downtime.

```bash
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-34-87.us-west-2.compute.internal Ready <none> 34h v1.28.4-eks-d91a302
```
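
When many Nodes need the label from step 3, labeling them one by one gets tedious. The sketch below assumes that Bottlerocket Nodes can be recognized by the string `Bottlerocket` in `.status.nodeInfo.osImage`. The helper name `brupop_label_commands` is hypothetical. It only prints the `kubectl label` commands, so they can be reviewed before anything is run:

```bash
# Hypothetical helper: read "<node-name><TAB><os-image>" lines on stdin and
# print one `kubectl label` command per Node whose OS image mentions Bottlerocket.
brupop_label_commands() {
  while IFS="$(printf '\t')" read -r name os_image; do
    case "$os_image" in
      *Bottlerocket*)
        printf 'kubectl label node %s bottlerocket.aws/updater-interface-version=2.0.0 --overwrite\n' "$name"
        ;;
    esac
  done
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.osImage}{"\n"}{end}' \
#     | brupop_label_commands
```

After reviewing the output, pipe it to `sh` to apply the labels. Nodes provisioned with the label in their user data, as in the Requirements section, do not need this step.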