-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor ReadyGameServerCache to AllocationCache #2148
Refactor ReadyGameServerCache to AllocationCache #2148
Conversation
Build Succeeded 👏 Build Id: ac12682f-3d98-44d7-8828-88aeeacec829 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
} | ||
|
||
// syncReadyGSServerCache syncs the gameserver cache and updates the local cache for any changes. | ||
func (c *ReadyGameServerCache) syncReadyGSServerCache() error { | ||
// syncCache syncs the gameserver cache and updates the local cache for any changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this function need any locks to prevent concurrent calls from conflicting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't for two reasons:
- the underlying
gameServerCacheEntry
has appropriate locking in place for individual records.2 - The workerqueue is only running a single routine, so this method only ever gets called a single time per invocation.
(Also, all this code has been running in production for ages with no reported issues)
@@ -57,9 +58,9 @@ import ( | |||
) | |||
|
|||
var ( | |||
// ErrNoGameServerReady is returned when there are no Ready GameServers | |||
// ErrNoGameServer is returned when there are no Ready GameServers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"when there are no Allocatable GameServers"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! 👍🏻
} | ||
|
||
// add last allocated, so it always gets updated, even if it is already Allocated | ||
gs.ObjectMeta.Annotations["agones.dev/last-allocated"] = time.Now().String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agones.dev/last-allocated
should be a constant instead of a magic string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit nit picky, but do we care about timezones at all for this timestamp? This will apply the time in the current timezone of the controller process, which could have some exciting behavior during daylight savings adjustments (where time goes backwards).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format of string is: "2006-01-02 15:04:05.999999999 -0700 MST", which includes the timezone, so I think we should be good there for readability.
For the sake of the purposes we need it for allocation, we actually only care that the value changes, so that the object gets updated within K8s, without a field that has changed, if we update the GameServer
with no differences from what is stored in etcd, it becomes a noop, which means it can't be watched for by the SDK or other services.
k := k | ||
v := v | ||
t.Run(k, func(t *testing.T) { | ||
// deliberately not resetting the Feature state, to catch any possible unknown regressions with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the side effect of this to run (or worse, sometimes run) the old tests with the features enabled? I think if we want to ensure the old behavior works with the features enabled, it's simpler to just add a new test case exactly like the old one but with the feature turned on. Then we are deliberately testing the code path with and without the feature enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes (admittedly this is a movement of an old test, as is much of this file) - since we don't know what features will be enabled/disabled across the whole set of features, we rely on go's random ordering of tests to find issues with combinations of feature flags over time.
Testing for every combination of feature flags everywhere for past and future flags is a combinatorial explosion, and probably not worth the effort (unless you have a clever way of doing this that I can't think of - in which case I'm all ears!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to test every combination everywhere, but it places where we have two explicit code paths and it's easy to add coverage (by adding a new table driven test entry) it seems like not too much work to ensure that all test runs do both old and new.
}, | ||
} | ||
|
||
for k, v := range fixtures { | ||
t.Run(k, func(t *testing.T) { | ||
runtime.FeatureTestMutex.Lock() | ||
defer runtime.FeatureTestMutex.Unlock() | ||
// deliberately not resetting the Feature state, to catch any possible unknown regressions with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make sense to me, since the tests are setting the feature flags to false, which should be exercising the old behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yep, I see how this would look weird - apologies.
I have more tests coming for this test as the this feature continues to grow, which use different feature flags, so I just copied it across. Happy to scrap the individual setting of feature flags these features to false for overall, and then adjust it as needed down the line.
That being said, reviewing all the tests I have, and plan to have for this - I never don't set a feature flag, so I'll make it required for this unit test. Let me know what you think.
assert.Equal(t, "bar", gs.ObjectMeta.Labels["foo"]) | ||
assert.NotNil(t, gs.ObjectMeta.Annotations["agones.dev/last-allocated"]) | ||
|
||
gs, err = allocator.applyAllocationToGameServer(ctx, allocationv1.MetaPatch{Annotations: map[string]string{"foo": "bar"}}, &agonesv1.GameServer{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it test to run this call again with the same parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know, I have no idea 😄 removed!
@@ -60,6 +60,11 @@ const ( | |||
func TestControllerAllocator(t *testing.T) { | |||
t.Parallel() | |||
|
|||
// TODO:(markmandel) remove once GameServerSelector is in place, since right now enabling the feature flag causes flaky tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is a bit scary. Does it make the tests flaky because they sometime run with the feature flag on and sometimes with it off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is understandably scary on review 😃 - let me adjust it so it is less scary.
TL;DR: once all the code is in place the test will pass with any feature flag enabled/disabled - but not yet.
A bit of a chunky PR, but it does several things that were tied together. 1. Since the cache will need to track both Ready and Allocated GameServers, renamed everything to be "Allocation" related, rather than just be about Ready GameServers 2. Implement functionality to be able to cache either Ready (for the stable installation), or Allocated + Ready Gameservers for when the Feature Flag of FeatureStateAllocationFilter is enabled. 3. There were some functionality and tests in some strange places, so took the opportunity to clean that up, since I had to refactor a bunch of stuff anyway. 4. Made sure tests would continue to pass in this interim state. Work on googleforgames#1239
87c4a04
to
103ada5
Compare
Review updates in place, PTAL! |
Build Succeeded 👏 Build Id: 69c8e0d6-6085-46eb-894f-3acb4ee82eb8 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: markmandel, roberthbailey The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
New changes are detected. LGTM label has been removed. |
Build Succeeded 👏 Build Id: 9d2debcd-5039-4916-8d03-cae48aac3b8b The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Minor fix from a PR comment that I had in the backlog but hadn't submitted yet. Fix for googleforgames#2148 (comment)
Update the proto for allocation and related converters to allow advanced allocation to also occur through the Allocation endpoint, and not just the `GameServerAllocation` CRD. This has been tested for backward compatability with the previous proto version (ran allocation e2e from `main` branch against an install of this PR). The only outstanding item for googleforgames#2148 is updates to the documentation! Work on googleforgames#2148
Minor fix from a PR comment that I had in the backlog but hadn't submitted yet. Fix for #2148 (comment)
Update the proto for allocation and related converters to allow advanced allocation to also occur through the Allocation endpoint, and not just the `GameServerAllocation` CRD. This has been tested for backward compatability with the previous proto version (ran allocation e2e from `main` branch against an install of this PR). The only outstanding item for googleforgames#2148 is updates to the documentation! Work on googleforgames#2148
* Update proto and allocator for advanced allocation Update the proto for allocation and related converters to allow advanced allocation to also occur through the Allocation endpoint, and not just the `GameServerAllocation` CRD. This has been tested for backward compatability with the previous proto version (ran allocation e2e from `main` branch against an install of this PR). The only outstanding item for #2148 is updates to the documentation! Work on #2148 * Review updates. Co-authored-by: Robert Bailey <robertbailey@google.com>
What type of PR is this?
/kind feature
What this PR does / Why we need it:
A bit of a chunky PR, but it does several things that were tied together.
Which issue(s) this PR fixes:
Work on #1239
Special notes for your reviewer:
The design on #1239 has it adding an annotation to the GameServer when allocating. I didn't put this behind the feature branch, because I didn't think it changed anything or was a risk. But please let me know if you think it should be behind the feature branch.