Add doc for "Controlling Disruption", document SafeToEvict #2924

Merged
merged 7 commits into from
Feb 1, 2023
Changes from 6 commits
1 change: 1 addition & 0 deletions build/includes/website.mk
@@ -75,6 +75,7 @@ site-test:

# generate site images, if they don't exist
site-images: $(site_path)/static/diagrams/gameserver-states.dot.png
site-images: $(site_path)/static/diagrams/eviction-decision.dot.png
site-images: $(site_path)/static/diagrams/gameserver-lifecycle.puml.png
site-images: $(site_path)/static/diagrams/gameserver-reserved.puml.png
site-images: $(site_path)/static/diagrams/canary-testing.puml.png
113 changes: 113 additions & 0 deletions site/content/en/docs/Advanced/controlling-disruption.md
@@ -0,0 +1,113 @@
---
title: "Controlling Disruption"
date: 2023-01-24T20:15:26Z
weight: 20
description: >
Game servers running on Agones may be disrupted by Kubernetes; learn how to control disruption of your game servers.
---

## Disruption in Kubernetes

[A `Pod` in Kubernetes may be disrupted](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#voluntary-and-involuntary-disruptions) for involuntary reasons, e.g. hardware failure, or voluntary reasons, such as when nodes are drained for upgrades.

By default, Agones assumes your game server should never be disrupted voluntarily and configures the `Pod` appropriately - but this isn't always the ideal setting. Here we discuss how Agones allows you to control the two most significant sources of voluntary `Pod` evictions, node upgrades and Cluster Autoscaler, using the `eviction` API on the `GameServer` object.

{{< alpha title="`eviction` API" gate="SafeToEvict" >}}
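As a minimal sketch of enabling the gate, assuming Agones was installed with its Helm chart, the `SafeToEvict` feature gate can be passed through the chart's `agones.featureGates` value. The values file below is illustrative only; adapt it to however you manage your install.

```yaml
# Hypothetical values.yaml fragment for a Helm-based Agones install.
# Feature gates are passed as a single string; this enables only SafeToEvict.
agones:
  featureGates: "SafeToEvict=true"
```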

## Benefits of Allowing Voluntary Disruption

It's not always easy to write your game server in a way that allows for disruption, but it can have major benefits:

* Compaction of your cluster using [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) can lead to considerable cost savings for your infrastructure.
* Allowing automated node upgrades can save you management toil and lower the time it takes to patch security vulnerabilities.

## Considerations

When discussing game server pod disruption, it's important to keep two factors in mind:

* **`TERM` signal:** Is your game server tolerant of graceful termination? If you wish to support voluntary disruption, your game server must handle the `TERM` signal (even if it runs to completion after receiving `TERM`).
* **Termination Grace Period:** After receiving `TERM`, how long does your game server need to run? If you run to completion after receiving `TERM`, this is equivalent to the session length - if not, you can think of this as the cleanup time. In general, we bucket the grace period into "less than 10 minutes", "10 minutes to an hour", and "greater than an hour". (See [below](#whats-special-about-ten-minutes-and-one-hour) if you are curious about grace period considerations.)

## `eviction` API

The `eviction` API is specified as part of the `GameServerSpec`, like:

```yaml
apiVersion: "agones.dev/v1"
kind: GameServer
metadata:
  name: "simple-game-server"
spec:
  eviction:
    safe: Always
  template:
    [...]
```

You can set `eviction.safe` based on your game server's tolerance for disruption and its session length, following this diagram:

![Eviction Decision Diagram](../../../diagrams/eviction-decision.dot.png)

In words:

* Does the game server support `TERM` and terminate within ten minutes?
  * Yes to both: Set `safe: Always`, and set [terminationGracePeriodSeconds](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution) to the session length or cleanup time.
  * No to either: Does the game server support `TERM` and terminate within an hour?
    * Yes to both: Set `safe: OnUpgrade`, and configure [terminationGracePeriodSeconds](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution) to the session length or cleanup time.
    * No to either: Set `safe: Never`. If your game server does not terminate within an hour, see [below](#considerations-for-long-sessions).
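
As an illustration of the `OnUpgrade` branch, here is a sketch of a game server whose sessions are assumed to last up to 30 minutes. The container name, image, and the 1800 second value are hypothetical; `terminationGracePeriodSeconds` is the standard `Pod` spec field, set inside the game server's Pod template.

```yaml
apiVersion: "agones.dev/v1"
kind: GameServer
metadata:
  name: "simple-game-server"
spec:
  eviction:
    # Terminates within an hour but not within ten minutes,
    # so allow eviction for node upgrades only.
    safe: OnUpgrade
  template:
    spec:
      # Assumed session length of 30 minutes, in seconds.
      terminationGracePeriodSeconds: 1800
      containers:
        - name: game-server                       # hypothetical container name
          image: example.com/my-game-server:1.0   # hypothetical image
```

For the `Always` branch the shape is the same, with `safe: Always` and a grace period of under ten minutes.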

{{< alert title="Note" color="info" >}}
To maintain backward compatibility with Agones prior to the introduction of the `SafeToEvict` feature gate, if your game server previously configured the `cluster-autoscaler.kubernetes.io/safe-to-evict: true` annotation, we assume `eviction.safe: Always` is intended.
{{</ alert >}}

{{< alert title="Note" color="info" >}}
GKE Autopilot supports only `Never` and `Always`, not `OnUpgrade`.
{{< /alert >}}

## What's special about ten minutes and one hour?

* **Ten minutes:** Cluster Autoscaler respects [ten minutes of graceful termination](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#does-ca-respect-gracefultermination-in-scale-down) on scale-down. On some cloud products, you can configure `--max-graceful-termination-sec` to change this, but it is not advised: Cluster Autoscaler is currently only capable of scaling down one node at a time, and larger graceful termination windows slow this down further (see [autoscaler#5079](https://github.com/kubernetes/autoscaler/issues/5079)). If the ten minute limit does not apply to you, you should generally either choose `safe: Always` (for sessions less than an hour) or see [below](#considerations-for-long-sessions).

* **One hour:** On many cloud products, `PodDisruptionBudget` can only block node upgrade evictions for a certain period of time - on GKE this is 1h. After that, the PDB is ignored, or the node upgrade fails with an error. Controlling `Pod` disruption for longer than one hour requires cluster configuration changes outside of Agones - see [below](#considerations-for-long-sessions).
Member

Do we know what the PDB timeout is on other cloud providers?

On GKE 1h is also the max time given for graceful termination before a pod is forcefully removed during node drain.

Collaborator Author

From what I can glean from other providers' documentation, I think they err on the side of returning an error rather than proceeding. I am still trying to research this.


## Considerations for long sessions

Outside of Cluster Autoscaler, the main source of disruption for long sessions is node upgrade. On some cloud products, such as GKE Standard, node upgrades are entirely within your control. On others, such as GKE Autopilot, node upgrade is automatic. Typical node upgrades use an eviction-based, rolling recreate strategy, and may not honor `PodDisruptionBudget` for longer than an hour. Here we document strategies you can use for your cloud product to support long sessions.

### On GKE

On GKE, there are currently two possible approaches to manage disruption for session lengths longer than an hour:

* (GKE Standard/Autopilot) [Blue/green deployment](https://martinfowler.com/bliki/BlueGreenDeployment.html) at the cluster level: If you are using an automated deployment process, you can:
  * create a new, `green` cluster within a release channel, e.g. every week,
  * use [maintenance exclusions](https://cloud.google.com/kubernetes-engine/docs/concepts/maintenance-windows-and-exclusions#exclusions) to prevent node upgrades for 30d,
  * scale the `Fleet` on the old, `blue` cluster down to 0,
  * use [multi-cluster allocation]({{< relref "multi-cluster-allocation.md" >}}) on Agones, which will then direct new allocations to the new `green` cluster (since `blue` has 0 desired), and then
  * delete the old, `blue` cluster when the `Fleet` successfully scales down.

* (GKE Standard only) Use [node pool blue/green upgrades](https://cloud.google.com/kubernetes-engine/docs/concepts/node-pool-upgrade-strategies#blue-green-upgrade-strategy)
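
For the cluster-level blue/green approach, the scale-down step on the `blue` cluster can be sketched as a `Fleet` with zero desired replicas. The Fleet name is hypothetical, the game server template is elided as in the example above, and `safe: Never` is shown on the assumption that sessions run longer than an hour.

```yaml
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: "simple-game-server"   # hypothetical Fleet name
spec:
  # No new game servers are desired on the old `blue` cluster.
  replicas: 0
  template:
    spec:
      eviction:
        safe: Never
      template:
        [...]
```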

### Other cloud products

The blue/green cluster strategy described for GKE is likely applicable to your cloud product.

We welcome contributions to this section for other products!

## Implementation / Under the hood

Each option uses a slightly different permutation of:
* the `cluster-autoscaler.kubernetes.io/safe-to-evict` annotation to block [Cluster Autoscaler based eviction](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node), and
* the `agones.dev/safe-to-evict` label, which is matched by the selector of the `agones-gameserver-safe-to-evict-false` `PodDisruptionBudget`. This blocks [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node) and (for a limited time) [disruption from node upgrades](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets).
  * Note that PDBs also influence pod preemption, but preemption is not guaranteed to honor them.
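
To make the label matching concrete, a `PodDisruptionBudget` of roughly the following shape would select Pods labeled `agones.dev/safe-to-evict: "false"`. The name and label come from this page; the `maxUnavailable: 0` budget is an assumption about how such a PDB blocks eviction, not a copy of the manifest Agones installs.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: agones-gameserver-safe-to-evict-false
spec:
  # Assumption: permit no voluntary evictions of matching Pods.
  maxUnavailable: 0
  selector:
    matchLabels:
      agones.dev/safe-to-evict: "false"
```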

As a quick reference:

| `eviction.safe` setting | `safe-to-evict` pod annotation | `agones.dev/safe-to-evict` label |
|-------------------------|---------------------------------|-----------------------------------|
| `Never` (default)       | `false`                         | `false` (matches PDB)             |
| `OnUpgrade`             | `false`                         | `true` (does not match PDB)       |
| `Always`                | `true`                          | `true` (does not match PDB)       |
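
Reading one row of the table as Pod metadata, a game server with `eviction.safe: OnUpgrade` would be expected to produce a backing `Pod` carrying something like the fragment below. This is a sketch derived from the table, with all other metadata omitted.

```yaml
# Fragment of the backing Pod's metadata for `eviction.safe: OnUpgrade`.
metadata:
  annotations:
    # Blocks Cluster Autoscaler from evicting the Pod.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
  labels:
    # Does not match the PDB, so node upgrades may evict the Pod.
    agones.dev/safe-to-evict: "true"
```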

## Further Reading

* [`eviction` design](https://github.com/googleforgames/agones/issues/2794)
7 changes: 7 additions & 0 deletions site/content/en/docs/Advanced/scheduling-and-autoscaling.md
@@ -101,6 +101,13 @@ When using the “Packed” strategy, Agones will ensure that the Cluster Autosc
gameplay by adding the annotation [`"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"`](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node)
to the backing Pod.

{{< alert title="SafeToEvict Feature Gate" color="info" >}}
The [Alpha]({{< ref "/docs/Guides/feature-stages.md#alpha" >}}) `SafeToEvict` feature allows
[controlling disruption]({{< relref "controlling-disruption.md" >}}) in a more holistic way.
Please consider enabling `SafeToEvict` and using the new `eviction` API - we welcome your
early feedback!
{{< /alert >}}

However, if a gameserver can tolerate [being evicted](https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/#how-api-initiated-eviction-works)
(generally in combination with setting an appropriate graceful termination period on the gameserver pod) and you
want the Cluster Autoscaler to compact your cluster by evicting game servers when it would allow the Cluster
2 changes: 1 addition & 1 deletion site/layouts/shortcodes/alpha.html
@@ -1,6 +1,6 @@
{{- $gate := split (.Get "gate") "," }}
{{- $len_gate := len $gate }}
-{{- $title := .Get "title" }}
+{{- $title := .Get "title" | markdownify }}
<div class="alert alert-warning" role="alert">
<h4 class="alert-heading">Warning</h4>
<p>The {{ $title }} {{- if gt $len_gate 1 }} features are{{ else }} feature is{{ end }} currently <strong><a href="{{ ref . "/docs/Guides/feature-stages.md#alpha" }}">Alpha</a></strong>,
2 changes: 1 addition & 1 deletion site/layouts/shortcodes/beta.html
@@ -1,5 +1,5 @@
{{- $gate := .Get "gate" }}
-{{- $title := .Get "title" }}
+{{- $title := .Get "title" | markdownify }}
<div class="alert alert-warning" role="alert">
<h4 class="alert-heading">Warning</h4>
<p>The {{ $title }} feature is currently <strong><a href="{{ ref . "/docs/Guides/feature-stages.md#beta" }}">Beta</a></strong>,
29 changes: 29 additions & 0 deletions site/static/diagrams/eviction-decision.dot
@@ -0,0 +1,29 @@
digraph {
graph [fontname = "helvetica", ordering="out"];
node [fontname = "helvetica"];
edge [fontname = "helvetica", pad="0.2", penwidth="2"];

CanTerm [ label = "Supports TERM signal" ]
TenMinuteTermination [ label = "Terminates in < 10m after TERM?" ]
OneHourTermination [ label="Terminates in < 1h after TERM?" ]

SetAlways [label = <Set <font face="courier">safe: Always</font>>]
SetOnUpgrade [label = <Set <font face="courier">safe: OnUpgrade</font>>]
SetNever [label = <Set <font face="courier">safe: Never</font>>]

ConfigureTGPS [label = "Configure terminationGracePeriodSeconds\nto session or cleanup time"]
Special [label = "See Long Sessions below"]

CanTerm -> TenMinuteTermination [ label = "yes" ]
CanTerm -> SetNever [ label = "no" ]

TenMinuteTermination -> SetAlways [ label = "yes" ]
TenMinuteTermination -> OneHourTermination [ label="no" ]

OneHourTermination -> SetOnUpgrade [ label = "yes" ]
OneHourTermination -> SetNever [ label = "no" ]

SetAlways -> ConfigureTGPS [ label = "and" ]
SetOnUpgrade -> ConfigureTGPS [ label = "and" ]
SetNever -> Special [ label = "and" ]
}
Binary file added site/static/diagrams/eviction-decision.dot.png