fix: rollout strategy issues #3762
Build Failed 😱 Build Id: 04b542f9-9a13-4947-bbd7-a3b8364d9b4c To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
I have some suggestions for how to feature gate this change.
Is there an e2e-type scenario (akin to what we see in #3438) you can think of that would highlight the issues you were running into? That would also mean we don't run into them again in the future.
4951912
to
7ad2e60
Compare
Build Failed 😱 Build Id: 0ce99505-3c74-45ec-aaa4-a8033f1d7f7d To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
6a82465
to
78c20b2
Compare
Build Failed 😱 Build Id: 1193d3f4-b99a-4bc9-9bf4-a4896d0f6ac3 To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Succeeded 👏 Build Id: 621017df-61f9-4a38-a663-85d710f0cbc4 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Thanks for making these changes - I'm probably going to opt to let this one slide past the next release on Tuesday, and give me (someone? 😄 ) some time to sit down with this, and make sure everything is good (which really means running some rolling updates and watching them happen 😄 ) |
We have production data of this patch if that would help. |
TBH - totally trust that you do. I just want to know that I've spent some time with this as an extra verification step, and give it time to bake between releases as well.
Build Succeeded 👏 Build Id: 8981dcfc-9e41-418c-b309-09dffc754cc9 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
a2f39cf
to
3a8aee7
Compare
Build Succeeded 👏 Build Id: 1d95bf2c-5e7e-4b13-ab31-569d414ea084 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Just a heads up - will take this for a spin this week 👍🏻 |
@@ -31,6 +31,7 @@ The current set of `alpha` and `beta` feature gates:
| [Support for Extended Duration Pods on GKE Autopilot (*1.28+ only*)](https://github.com/googleforgames/agones/issues/3386) | `GKEAutopilotExtendedDurationPods` | Disabled | `Alpha` | 1.37.0 |
| [GameServer player capacity filtering on GameServerAllocations](https://github.com/googleforgames/agones/issues/1239) | `PlayerAllocationFilter` | Disabled | `Alpha` | 1.14.0 |
| [Player Tracking]({{< ref "/docs/Guides/player-tracking.md" >}}) | `PlayerTracking` | Disabled | `Alpha` | 1.6.0 |
| [Rolling Update Fixes](https://github.com/googleforgames/agones/issues/3688) | `RollingUpdateFix` | Disabled | `Alpha` | 1.41.0 |
Small thing: can you make a copy of this table, one with the new feature and one without, and use the feature
shortcodes to keep this feature flag hidden until the next release, please?
See: https://agones.dev/site/docs/contribute/#within-a-page for details
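For readers unfamiliar with how a gate such as `RollingUpdateFix` is consumed, here is a minimal, self-contained sketch. It assumes the gates arrive as a URL-query-style string (for example `RollingUpdateFix=true&PlayerTracking=false`). `parseFeatureGates` is a hypothetical helper written for this illustration, not the Agones implementation, and any gate that is not listed is treated as disabled.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// parseFeatureGates is a hypothetical helper (not part of Agones) that turns
// a query-style string such as "RollingUpdateFix=true&PlayerTracking=false"
// into a map of gate name to enabled state.
func parseFeatureGates(s string) (map[string]bool, error) {
	values, err := url.ParseQuery(s)
	if err != nil {
		return nil, err
	}
	gates := make(map[string]bool, len(values))
	for name, v := range values {
		// Only the first value for each gate is considered.
		enabled, err := strconv.ParseBool(v[0])
		if err != nil {
			return nil, fmt.Errorf("feature gate %q: %w", name, err)
		}
		gates[name] = enabled
	}
	return gates, nil
}

func main() {
	gates, err := parseFeatureGates("RollingUpdateFix=true&PlayerTracking=false")
	if err != nil {
		panic(err)
	}
	// A gate missing from the map reads as false, i.e. disabled by default.
	if gates["RollingUpdateFix"] {
		fmt.Println("using the fixed rolling update logic")
	} else {
		fmt.Println("using the legacy rolling update logic")
	}
}
```

Keeping the legacy path behind the disabled default means the new behaviour only runs for clusters that opt in, which matches the `Disabled` / `Alpha` row in the table.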
Build Succeeded 👏 Build Id: 5fca4a12-50e0-4401-9193-f5a92d9137a2 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Build Succeeded 👏 Build Id: 819f3e3b-9f05-4011-b38c-22c9dbb01880 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [agones](https://agones.dev) ([source](https://github.com/googleforgames/agones)) | minor | `1.40.0` -> `1.41.0` | --- > [!WARNING] > Some dependencies could not be looked up. Check the Dependency Dashboard for more information. --- ### Release Notes <details> <summary>googleforgames/agones (agones)</summary> ### [`v1.41.0`](https://github.com/googleforgames/agones/blob/HEAD/CHANGELOG.md#v1410-2024-06-04) [Compare Source](https://github.com/googleforgames/agones/compare/v1.40.0...v1.41.0) [Full Changelog](https://github.com/googleforgames/agones/compare/v1.40.0...v1.41.0) **Implemented enhancements:** - Configure Allocator Status Code by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3782](https://github.com/googleforgames/agones/pull/3782) - Graduate Counters and Lists to Beta by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3801](https://github.com/googleforgames/agones/pull/3801) - Passthrough autopilot - Adds an AutopilotPassthroughPort Feature Gate and new pod label by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3809](https://github.com/googleforgames/agones/pull/3809) - CountsAndLists: Move to Beta Protobuf by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3806](https://github.com/googleforgames/agones/pull/3806) - feat: support multiple port ranges by [@​nrwiersma](https://github.com/nrwiersma) in [https://github.com/googleforgames/agones/pull/3747](https://github.com/googleforgames/agones/pull/3747) - Changes `sdk-server` to Patch instead of Update by [@​igooch](https://github.com/igooch) in [https://github.com/googleforgames/agones/pull/3803](https://github.com/googleforgames/agones/pull/3803) - Generate grpc for nodejs from alpha to beta by 
[@​lacroixthomas](https://github.com/lacroixthomas) in [https://github.com/googleforgames/agones/pull/3825](https://github.com/googleforgames/agones/pull/3825) - Update CountsAndLists from Alpha to Beta by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3824](https://github.com/googleforgames/agones/pull/3824) - feat(gameserver): New DirectToGameServer PortPolicy allows direct traffic to a GameServer by [@​daniellee](https://github.com/daniellee) in [https://github.com/googleforgames/agones/pull/3807](https://github.com/googleforgames/agones/pull/3807) - Passthrough autopilot - Adds mutating webhook by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3833](https://github.com/googleforgames/agones/pull/3833) - Passthrough autopilot - added ports array case and updated unit tests by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3842](https://github.com/googleforgames/agones/pull/3842) - Nodejs counters and lists by [@​steven-supersolid](https://github.com/steven-supersolid) in [https://github.com/googleforgames/agones/pull/3726](https://github.com/googleforgames/agones/pull/3726) - Promote AutopilotPassthroughPort feature gate to Alpha by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3849](https://github.com/googleforgames/agones/pull/3849) **Fixed bugs:** - Helm Param Update: Default to agones.controller if agones.extensions is Missing by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3773](https://github.com/googleforgames/agones/pull/3773) - fix: rollout strategy issues by [@​nrwiersma](https://github.com/nrwiersma) in [https://github.com/googleforgames/agones/pull/3762](https://github.com/googleforgames/agones/pull/3762) - Set Minimum Buffer Size to 1 by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in 
[https://github.com/googleforgames/agones/pull/3749](https://github.com/googleforgames/agones/pull/3749) - Pin ltsc2019 to older SHA by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3829](https://github.com/googleforgames/agones/pull/3829) - TestGameServerAllocationDuringMultipleAllocationClients: Readdress flake by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3831](https://github.com/googleforgames/agones/pull/3831) - Refactor finalizer name to include valid domain name and path by [@​indexjoseph](https://github.com/indexjoseph) in [https://github.com/googleforgames/agones/pull/3840](https://github.com/googleforgames/agones/pull/3840) - agones-{extensions,allocator}: Be more defensive about draining by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3839](https://github.com/googleforgames/agones/pull/3839) - agones-{extensions,allocator}: Pause after cancelling context by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3843](https://github.com/googleforgames/agones/pull/3843) - Change the line to modify in Quickstart: Edit a Game Server by [@​peterzhongyi](https://github.com/peterzhongyi) in [https://github.com/googleforgames/agones/pull/3844](https://github.com/googleforgames/agones/pull/3844) **Other:** - Prep for Release v1.41.0 by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3800](https://github.com/googleforgames/agones/pull/3800) - Update site documentation to reflect firewall prefix and default to Autopilot cluster creation for Agones by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3769](https://github.com/googleforgames/agones/pull/3769) - Add a System Diagram and overview page by [@​zmerlynn](https://github.com/zmerlynn) in 
[https://github.com/googleforgames/agones/pull/3792](https://github.com/googleforgames/agones/pull/3792) - Update Side Menu: Preserve and Restore Scroll Position by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3805](https://github.com/googleforgames/agones/pull/3805) - fix: typo by [@​skmpf](https://github.com/skmpf) in [https://github.com/googleforgames/agones/pull/3808](https://github.com/googleforgames/agones/pull/3808) - Helm Config: Add httpUnallocatedStatusCode in Allocator Service by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3802](https://github.com/googleforgames/agones/pull/3802) - Update Docs: CountersAndLists to Beta by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3810](https://github.com/googleforgames/agones/pull/3810) - Disable Dev feature FeatureAutopilotPassthroughPort by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3815](https://github.com/googleforgames/agones/pull/3815) - Disable FeatureAutopilotPassthroughPort in features.go by [@​vicentefb](https://github.com/vicentefb) in [https://github.com/googleforgames/agones/pull/3816](https://github.com/googleforgames/agones/pull/3816) - SDK proto compatibility guarantees and deprecation policies documentation by [@​igooch](https://github.com/igooch) in [https://github.com/googleforgames/agones/pull/3774](https://github.com/googleforgames/agones/pull/3774) - Fix dangling "as of" by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3827](https://github.com/googleforgames/agones/pull/3827) - Steps to Promote SDK Features from Alpha to Beta by [@​Kalaiselvi84](https://github.com/Kalaiselvi84) in [https://github.com/googleforgames/agones/pull/3814](https://github.com/googleforgames/agones/pull/3814) - Adds comment for help troubleshooting issues with 
terraform tfstate by [@​igooch](https://github.com/igooch) in [https://github.com/googleforgames/agones/pull/3822](https://github.com/googleforgames/agones/pull/3822) - docs: improve counter and list example comments by [@​yonbh](https://github.com/yonbh) in [https://github.com/googleforgames/agones/pull/3818](https://github.com/googleforgames/agones/pull/3818) - Skip /tmp/ on yamllint by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3838](https://github.com/googleforgames/agones/pull/3838) - TestAllocatorAfterDeleteReplica: More logging by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3837](https://github.com/googleforgames/agones/pull/3837) - Instructions for upgrading golang version by [@​gongmax](https://github.com/gongmax) in [https://github.com/googleforgames/agones/pull/3819](https://github.com/googleforgames/agones/pull/3819) - Remove unused function FindGameServerContainer by [@​zmerlynn](https://github.com/zmerlynn) in [https://github.com/googleforgames/agones/pull/3841](https://github.com/googleforgames/agones/pull/3841) - Adds Unreal to the List of URL Links to Not Check by [@​igooch](https://github.com/igooch) in [https://github.com/googleforgames/agones/pull/3847](https://github.com/googleforgames/agones/pull/3847) - docs: clarify virtualization setup for Windows versions by [@​andresromerodev](https://github.com/andresromerodev) in [https://github.com/googleforgames/agones/pull/3850](https://github.com/googleforgames/agones/pull/3850) **New Contributors:** - [@​skmpf](https://github.com/skmpf) made their first contribution in [https://github.com/googleforgames/agones/pull/3808](https://github.com/googleforgames/agones/pull/3808) - [@​yonbh](https://github.com/yonbh) made their first contribution in [https://github.com/googleforgames/agones/pull/3818](https://github.com/googleforgames/agones/pull/3818) - [@​peterzhongyi](https://github.com/peterzhongyi) made 
their first contribution in [https://github.com/googleforgames/agones/pull/3844](https://github.com/googleforgames/agones/pull/3844) - [@​andresromerodev](https://github.com/andresromerodev) made their first contribution in [https://github.com/googleforgames/agones/pull/3850](https://github.com/googleforgames/agones/pull/3850) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4zOTAuMCIsInVwZGF0ZWRJblZlciI6IjM3LjM5MC4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9oZWxtIiwidHlwZS9taW5vciJdfQ==-->
What type of PR is this?
What this PR does / Why we need it:
This PR fixes issues with the fleet rollout strategy that have been observed in production. In clusters that are extremely allocation-heavy, rollout does not behave as expected or as documented. These patches have been confirmed to work in our production environment, using a custom build.
Each issue is in a separate commit, to allow them to be evaluated individually.
Which issue(s) this PR fixes:
Closes #3688
Special notes for your reviewer:
In addition to the 2 bug fixes (first 2 commits), we observe that once an inactive GSS reaches the point where Allocated == Replicas, it can still spawn GSes if the new GS comes up fast enough and there is enough churn in the controller. Once this point is reached, there is no real reason for the GSS to keep its Spec Replicas; it would be better to set its Spec Replicas to 0 and let it lose its allocations until it is cleaned up. This patch makes up the last commit.
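The rule described in the last commit can be sketched roughly as follows. This is an illustration only, not the code in this PR: `gameServerSet` and `desiredInactiveReplicas` are hypothetical names, and the sketch assumes "Replicas" refers to the set's spec replicas.

```go
package main

import "fmt"

// gameServerSet is a hypothetical, trimmed-down stand-in for an Agones
// GameServerSet; only the fields needed for this illustration are kept.
type gameServerSet struct {
	Name              string
	SpecReplicas      int32 // desired number of GameServers
	AllocatedReplicas int32 // GameServers currently in the Allocated state
}

// desiredInactiveReplicas returns the spec replicas an inactive (old) set
// should be given. Once every desired replica is Allocated, the set has no
// reason to create further GameServers, so it is scaled to 0 and left to
// drain as its allocations end.
func desiredInactiveReplicas(gss gameServerSet) int32 {
	// Assumption: ">=" is used rather than "==" so a set whose allocations
	// already exceed its spec is treated the same way.
	if gss.AllocatedReplicas >= gss.SpecReplicas {
		return 0
	}
	return gss.SpecReplicas
}

func main() {
	sets := []gameServerSet{
		{Name: "old-fully-allocated", SpecReplicas: 3, AllocatedReplicas: 3},
		{Name: "old-still-draining", SpecReplicas: 5, AllocatedReplicas: 2},
	}
	for _, gss := range sets {
		fmt.Printf("%s: spec replicas %d -> %d\n",
			gss.Name, gss.SpecReplicas, desiredInactiveReplicas(gss))
	}
}
```

Setting the spec to 0 does not touch Allocated GameServers in this sketch; it only removes the target that would otherwise let the controller backfill the old set when an allocation ends.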