Skip to content

Commit

Permalink
Allow bootstrap re-apply for Fedora CoreOS GCP
Browse files Browse the repository at this point in the history
* Problem: Fedora CoreOS images are manually uploaded to GCP. When a
cluster is created with a stale image, Zincati immediately checks
for the latest stable image, fetches, and reboots. In practice,
this can unfortunately occur exactly during the initial cluster
bootstrap phase.

* Recommended: Upload the latest Fedora CoreOS image regularly
* Mitigation: Allow a failed bootstrap.service run (which won't touch
the done ConditionalPathExists) to be re-run by running `terraforma apply`
again. Add a known issue to CHANGES
* Update docs to show the current Fedora CoreOS stable version to
reduce likelihood users see this issue

 Longer term ideas:

* Ideal: Fedora CoreOS publishes a stable channel. Instances will always
boot with the latest image in a channel. The problem disappears since
it works the same way AWS does
* Timer: Consider some timer-based approach to have zincati delay any
system reboots for the first ~30 min of a machine's life. Possibly just
configured on the controller node coreos/zincati#251
* External coordination: For Container Linux, locksmith filled a similar
role and was disabled to allow CLUO to coordinate reboots. By running
atop Kubernetes, it was not possible for the reboot to occur before
cluster bootstrap
* Rely on coreos/zincati#115 to delay the
reboot since bootstrap involves an SSH session
* Use path-based activation of zincati on controllers and set that
path at the end of the bootstrap process
  • Loading branch information
dghubble committed Mar 29, 2020
1 parent 144bb94 commit 5ab00de
Show file tree
Hide file tree
Showing 3 changed files with 8 additions and 6 deletions.
4 changes: 4 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,10 @@ Notable changes between versions.

* Update default `os_stream` from testing to stable

#### Google Cloud

* Known: Use of stale uploaded Fedora CoreOS image will require terraform re-apply

#### DigitalOcean

* Rename `image` variable to `os_image` for consistency ([#677](https://github.com/poseidon/typhoon/pull/677)) (action required)
Expand Down
9 changes: 3 additions & 6 deletions docs/fedora-coreos/google-cloud.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,5 @@
# Google Cloud

!!! danger
Typhoon for Fedora CoreOS is an alpha. Please report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues) and Typhoon issues to Typhoon.

In this tutorial, we'll create a Kubernetes v1.18.0 cluster on Google Compute Engine with Fedora CoreOS.

We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
Expand Down Expand Up @@ -76,13 +73,13 @@ Fedora CoreOS publishes images for Google Cloud, but does not yet upload them. G

```
gsutil list
gsutil cp fedora-coreos-31.20200113.3.1-gcp.x86_64.tar.gz gs://BUCKET
gsutil cp fedora-coreos-31.20200310.3.0-gcp.x86_64.tar.gz gs://BUCKET
```

Create a Compute Engine image from the file.

```
gcloud compute images create fedora-coreos-31-20200113-3-1 --source-uri gs://BUCKET/fedora-coreos-31.20200113.3.1-gcp.x86_64.tar.gz
gcloud compute images create fedora-coreos-31-20200310-3-0 --source-uri gs://BUCKET/fedora-coreos-31.20200310.3.0-gcp.x86_64.tar.gz
```

## Cluster
Expand All @@ -100,7 +97,7 @@ module "yavin" {
dns_zone_name = "example-zone"
# custom image name from above
os_image = "fedora-coreos-31-20200113-3-1"
os_image = "fedora-coreos-31-20200310-3-0"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
Expand Down
1 change: 1 addition & 0 deletions google-cloud/fedora-coreos/kubernetes/fcc/controller.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
ExecStartPre=-/usr/bin/podman rm bootstrap
ExecStart=/usr/bin/podman run --name bootstrap \
--network host \
--volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \
Expand Down

0 comments on commit 5ab00de

Please sign in to comment.