Optimise GameServer Sub-Controller Queues #3781
Total nit on a log line, but otherwise LGTM.
Build Failed 😱 Build Id: 3ee3fd37-5b52-4519-93dd-24db3d90c475 To get permission to view the Cloud Build view, join the agones-discuss Google Group.
Aaah, didn't have enough time to become leader!
Build Succeeded 👏 Build Id: 19c72705-1899-45c0-9eb1-96d46abfe65f Development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
Looking at pprof profiles of the Agones Controller, we see a lot of memory held in the Migration Controller and Missing Controller queues. This implements improved checking at the event layer, before items are placed in the workerqueue. That drops the memory footprint and also allows faster processing of the edge cases that the Missing and Migration Controllers handle. Also reviewed the Health Controller and made a small improvement there as well. In the memory pprof profile and load tests, we see a decrease in memory usage, and less of a slow "step up" of memory over time. Closes googleforgames#3748
13d2be1 to 5df2052
Build Succeeded 👏 Build Id: 39adee09-897b-4873-8465-f1d098eb616d Development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
Referenced by a [Renovate Bot](https://github.com/renovatebot/renovate) PR that updates [agones](https://agones.dev) ([source](https://github.com/googleforgames/agones)) from `1.39.0` to `1.40.0` (minor). The [`v1.40.0`](https://github.com/googleforgames/agones/blob/HEAD/CHANGELOG.md#v1400-2024-04-23) release notes list this change under **Implemented enhancements**: Optimise GameServer Sub-Controller Queues by [@markmandel](https://github.com/markmandel) in [https://github.com/googleforgames/agones/pull/3781](https://github.com/googleforgames/agones/pull/3781). [Full Changelog](https://github.com/googleforgames/agones/compare/v1.39.0...v1.40.0)
What type of PR is this?
/kind bug
What this PR does / Why we need it:
Looking at pprof profiles of the Agones Controller, we see a lot of memory held in the Migration Controller and Missing Controller queues.
This implements improved checking at the event layer, before items are placed in the workerqueue. That drops the memory footprint and also allows faster processing of the edge cases that the Missing and Migration Controllers handle.
Also reviewed the Health Controller and made a small improvement there as well.
In the memory pprof profile and load tests, we see a decrease in memory usage, and less of a slow "step up" of memory over time.
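For readers unfamiliar with the pattern, here is a minimal sketch of checking at the event layer before enqueueing. It is not the code in this PR: `GameServer`, `Queue`, `filteredHandler`, and the `isCandidate` rule are all hypothetical stand-ins, chosen only to show that events failing a cheap predicate never occupy queue memory.

```go
package main

import "fmt"

// GameServer is a hypothetical, pared-down stand-in for the real Agones type.
type GameServer struct {
	Name       string
	State      string
	PodMissing bool // assumption: something the handler can read cheaply from the event
}

// Queue is a hypothetical stand-in for a work queue; it only records keys.
type Queue struct{ keys []string }

// Enqueue appends a key to the queue.
func (q *Queue) Enqueue(key string) { q.keys = append(q.keys, key) }

// Len reports how many keys are waiting to be processed.
func (q *Queue) Len() int { return len(q.keys) }

// filteredHandler checks each event against a predicate before enqueueing.
type filteredHandler struct {
	queue    *Queue
	relevant func(GameServer) bool
}

// OnUpdate enqueues gs only when the predicate accepts it.
func (h filteredHandler) OnUpdate(gs GameServer) {
	if h.relevant(gs) {
		h.queue.Enqueue(gs.Name)
	}
}

// isCandidate is an illustrative rule, not the actual Agones check.
func isCandidate(gs GameServer) bool {
	return gs.PodMissing && gs.State != "Shutdown"
}

func main() {
	q := &Queue{}
	h := filteredHandler{queue: q, relevant: isCandidate}
	events := []GameServer{
		{Name: "gs-1", State: "Ready"},
		{Name: "gs-2", State: "Ready", PodMissing: true},
		{Name: "gs-3", State: "Shutdown", PodMissing: true},
	}
	for _, gs := range events {
		h.OnUpdate(gs)
	}
	fmt.Println(q.Len(), q.keys) // only gs-2 reaches the queue
}
```

Without the predicate, every update event would be queued and then discarded by the worker, so the queue would grow with the total event volume rather than with the number of edge cases.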
Which issue(s) this PR fixes:
Closes #3748
Special notes for your reviewer:
Here's some pretty pictures!
Here are the two tests run side by side. I only ran them for 30m, but you can see a drop in memory, and less of the "step up" in memory usage whose beginnings slowly appear with the original code.
Also, updated pprof: it's basically all JSON de/serialisation, which is pretty much what we would want to see.
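For anyone wanting to capture a comparable heap profile from their own Go process, a minimal sketch using the standard library's `runtime/pprof` follows. The `heapProfile` helper is a hypothetical name, and this is not necessarily how the profiles above were collected.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile is a hypothetical helper that returns the current heap profile
// in the format read by `go tool pprof`.
func heapProfile() ([]byte, error) {
	runtime.GC() // run a collection first so the statistics are up to date
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(data))
}
```

Writing the returned bytes to a file and opening it with `go tool pprof` shows which call paths hold memory, which is the kind of view that pointed at the controller queues here.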