Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for DeletionTimestamp of fleet and gameserverset before scaling #2526

Merged
merged 4 commits into from
Apr 7, 2022

Conversation

estebangarcia
Copy link
Contributor

What type of PR is this?

/kind bug

What this PR does / Why we need it:

It checks the DeletionTimestamp of GameServerSet and sets numServersToAdd to 0 and checks DeletionTimestamp of fleets managed by autoscaler and doesn't scale if they are marked for deletion.

This prevents agones from recreating gameservers from fleets marked for deletion by foreground cascading. As reported in #2524

Which issue(s) this PR fixes:
Closes #2524

Special notes for your reviewer:
Tested on AWS EKS 1.21.5

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: b1f530c3-2be9-49df-9d7d-413a0e40f7ba

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2526/head:pr_2526 && git checkout pr_2526
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.22.0-97246f3

Copy link
Member

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This straight makes sense! Thanks so much for the fix!

The only thing that would block this from merge is to add some unit tests to make sure we never mess up this functionality in the future.

(Do we need e2e tests? 🤔 )

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 8d874f50-be96-41a5-85c0-e5fabca05ed3

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2526/head:pr_2526 && git checkout pr_2526
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.22.0-5a02f8c

@estebangarcia
Copy link
Contributor Author

Hey, @markmandel thanks for taking the time to review. I just added some unit tests to cover this.

I'm not sure if the e2e is necessary as it is a very small change, but I'll let you decide on that :)

@markmandel
Copy link
Member

This looks good! @roberthbailey how do we feel about including this while we're in feature freeze?

It is a bug fix, and seems to be low risk 🤔 WDYT? Or play it safe and bring it into the next release?

Copy link
Member

@roberthbailey roberthbailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there also be a change in pkg/fleets/controller.go? If you delete a fleet with foreground cascading deletion and then proceed to update the fleet (e.g. change the container image) before the cascading deletion completes will the fleet controller create a new game server set which will then get orphaned?

pkg/fleetautoscalers/controller.go Show resolved Hide resolved
@@ -307,6 +307,12 @@ func (c *Controller) syncGameServerSet(ctx context.Context, key string) error {

numServersToAdd, toDelete, isPartial := computeReconciliationAction(gsSet.Spec.Scheduling, list, c.counter.Counts(),
int(gsSet.Spec.Replicas), maxGameServerCreationsPerBatch, maxGameServerDeletionsPerBatch, maxPodPendingCount)

// GameserverSet is marked for deletion then don't add gameservers.
if !gsSet.DeletionTimestamp.IsZero() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar question here - should we check for the finalizer being set?

@markmandel
Copy link
Member

Should there also be a change in pkg/fleets/controller.go?

Oooh! That is an excellent question. I would argue yes - even if just to ensure no extra resource creation in general when a Fleet is being deleted (either foreground or background).

@estebangarcia
Copy link
Contributor Author

Should there also be a change in pkg/fleets/controller.go?

I agree with this. I've just committed a change that considers this as well.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 55dec21f-58a6-4b12-ae6f-87f8d9be69fc

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@estebangarcia
Copy link
Contributor Author

@markmandel the e2e tests on Cloud Build seemed to have failed I see a lot of x509: certificate signed by unknown authority errors but I don't know if that's expected or not.

@roberthbailey roberthbailey added the feature-freeze-do-not-merge Only eligible to be merged once we are out of feature freeze (next full release) label Apr 4, 2022
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 7b89e579-1841-4ec4-964f-03ca01514c44

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Member

Looks like the deity of flakiness has decided to target this PR. Rerunning.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 273c82cc-396b-4f86-89bf-2295d4238ce5

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2526/head:pr_2526 && git checkout pr_2526
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.22.0-4286a22

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: b11be78b-571b-4270-b292-89355272654b

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@SaitejaTamma SaitejaTamma removed the feature-freeze-do-not-merge Only eligible to be merged once we are out of feature freeze (next full release) label Apr 5, 2022
@estebangarcia
Copy link
Contributor Author

I merged main to my fork, build is failing again. Going to need you to retry it when you can @markmandel

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: estebangarcia, markmandel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@markmandel
Copy link
Member

We are past feature freeze - and I fixed the 404 in our site testing, so this is good to merge 👍🏻

@google-oss-prow google-oss-prow bot removed the lgtm label Apr 7, 2022
@google-oss-prow
Copy link

New changes are detected. LGTM label has been removed.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: d126582e-f88c-4fac-9b91-f15b2653b946

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2526/head:pr_2526 && git checkout pr_2526
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.23.0-3ce3604

@markmandel markmandel merged commit e47df78 into googleforgames:main Apr 7, 2022
Thiryn pushed a commit to Thiryn/agones that referenced this pull request Apr 25, 2022
@SaitejaTamma SaitejaTamma added this to the 1.23.0 milestone May 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Foreground deletion of a fleet managed by fleet autoscaler results in infinite pod recreation loop
5 participants