Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add doc for "Controlling Disruption", document SafeToEvict #2924

Merged
merged 7 commits into from
Feb 1, 2023

Conversation

zmerlynn
Copy link
Collaborator

This adds an "Advanced" doc for "Controlling Disruption", documenting the SafeToEvict Alpha.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 8bd3d516-e352-4373-9674-2615ee3a71d3

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 694bc201-3bcc-4c87-8b65-13be3c526568

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 1980b9b5-c016-4e68-89c5-7371492cf5e6

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-a811225-amd64

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: f534a55a-ffdc-4839-9b01-57f3380c65b1

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-e6647a2-amd64

@@ -97,6 +97,12 @@ This affects the Cluster autoscaler, Allocation Scheduling, Pod Scheduling and F

#### Cluster Autoscaler

{{< alert title="SafeToEvict Feature Gate" color="info" >}}
The [Alpha]({{< ref "/docs/Guides/feature-stages.md#alpha" >}}) `SafeToEvict` feature allows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to best phrase this, but we want to convey that it provides a better way to control disruption than the annotation (which is only for the autoscaler) as it allows the game to establish a contract about it's behavior.

Maybe you cover this more in the longer doc - reviewing this small change first.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to phrase it a little better. WDYT about actually just providing an eviction sample here? i.e. split the samples up with

Using the eviction API
The hard way

or something?

site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved
site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved
site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved
site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved
site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved

* **Ten minutes:** Cluster Autoscaler respects [ten minutes of graceful termination](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#does-ca-respect-gracefultermination-in-scale-down) on scale-down. On some cloud products, you can configure `--max-graceful-termination-sec` to change this, but it is not advised: Cluster Autoscaler is currently only capable of scaling down one node at a time, and larger graceful termination windows slow this down farther (see [autoscaler#5079](https://github.com/kubernetes/autoscaler/issues/5079)). If the ten minute limit does not apply to you, generally you should choose between `safe: Always` (for sessions less than an hour), or see [below](#considerations-for-long-sessions).

* **One hour:** On many cloud products, `PodDisruptionBudget` can only block node upgrade evictions for a certain period of time - on GKE this is 1h. After that, the PDB is ignored, or the node upgrade fails with an error. Controlling `Pod` disruption for longer than one hour requires cluster configuration changes outside of Agones - see [below](#considerations-for-long-sessions).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know what the PBD timeout is on other cloud providers?

On GKE 1h is also the max time given for graceful termination before a pod is forcefully removed during node drain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell gleaning other providers documentations, I think they err on the side of erroring rather than proceeding. I am still trying to research this.

site/content/en/docs/Advanced/controlling-disruption.md Outdated Show resolved Hide resolved
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: f68a50d2-7eea-4438-bb21-207afb2402d0

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-6c078ce-amd64

@zmerlynn
Copy link
Collaborator Author

PTAL - restructured this a bit.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 3f5e9d61-f4a8-4261-a3cd-7afd20b1555d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 24261da8-0f7e-4b33-9e39-8a2e528b443e

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-63eda4f-amd64

@zmerlynn
Copy link
Collaborator Author

PTAL, updated to include a flowchart diagram and reworked some language.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: a82b94ce-524f-4447-ba25-798291806986

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 46b58fcf-26b6-4831-9658-421bd9bb0bc1

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-7947a2c-amd64

@google-oss-prow google-oss-prow bot added the lgtm label Feb 1, 2023
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: roberthbailey, zmerlynn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot added approved and removed lgtm labels Feb 1, 2023
@google-oss-prow
Copy link

New changes are detected. LGTM label has been removed.

@roberthbailey roberthbailey enabled auto-merge (squash) February 1, 2023 01:29
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: fc515088-df4f-4ae3-9422-d52cf18958e4

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2924/head:pr_2924 && git checkout pr_2924
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.30.0-decb99b-amd64

@roberthbailey roberthbailey merged commit aa41e4e into googleforgames:main Feb 1, 2023
@Kalaiselvi84 Kalaiselvi84 added this to the 1.30.0 milestone Feb 28, 2023
@Kalaiselvi84 Kalaiselvi84 added the kind/feature New features for Agones label Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved kind/feature New features for Agones size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants