Update client-go to support Kubernetes 1.18.0 #1998

Merged
merged 5 commits into from
Feb 25, 2021

Conversation

markmandel
Collaborator

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind breaking

/kind bug
/kind cleanup
/kind documentation
/kind feature
/kind hotfix

What this PR does / Why we need it:

This PR covers:

  1. Update vendored client-go to support Kubernetes 1.18.0
  2. Updated CRD clients to support Kubernetes 1.18.0
  3. e2e tests have now been refactored to match the new client-go API surface.
  4. Switch from using a stop channel to a context.Context throughout the codebase.
  5. Use this new Context to accommodate the new client-go API surface and generated clients.

These changes are due to the breaking change that client-go now requires a context.Context as well as an Option struct on all API requests.

The vast majority of this commit is implementing the new API surface, but areas to highlight for functionality changes:

  • /pkg/util/ folder, especially the workerqueue implementation
  • /cmd/sdk-server as the new context.Context simplified the timeout functionality.

Which issue(s) this PR fixes:

Work on #1971

Special notes for your reviewer:

I AM SO SORRY FOR THIS PR 😭. It's massive, and unwieldy, but I couldn't see any way to do this in pieces, as the changes in client-go touch everything.

Please check the commit comments, I tried to break it apart by commits, so that it would be easier to review.

@markmandel markmandel added kind/feature New features for Agones kind/cleanup Refactoring code, fixing up documentation, etc kind/breaking Breaking change labels Feb 18, 2021
@google-cla google-cla bot added the cla: yes label Feb 18, 2021
@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from 3869494 to ca7f411 Compare February 18, 2021 19:25
@agones-bot
Collaborator

Build Failed 😱

Build Id: 66932ce7-89d2-48af-905c-d01364034762

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

--- FAIL: TestRestAllocator (2.73s)
    allocator_test.go:115: updating secrets failed: Operation cannot be fulfilled on secrets "allocator-tls": the object has been modified; please apply your changes to the latest version and try again
time="2021-02-18 19:36:08.339" level=info msg="GameServer created, waiting for Ready" gs=game-serverv5mrx
--- FAIL: TestAllocator (3.28s)

Passed locally, looks like a flake that failed on the secret update (that's a new one):

if _, err := kubeCore.Secrets(agonesSystemNamespace).Update(ctx, s, metav1.UpdateOptions{}); err != nil {
	t.Fatalf("updating secrets failed: %s", err)
}

@agones-bot
Collaborator

Build Failed 😱

Build Id: 101d9a53-0474-4c55-975d-e08b7ae72f9d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

--- FAIL: TestAllocator (2.25s)
    allocator_test.go:66: updating secrets failed: Operation cannot be fulfilled on secrets "allocator-tls": the object has been modified; please apply your changes to the latest version and try again
time="2021-02-18 20:34:26.697" level=info msg="waiting for fleet condition" fleet=simple-fleet-l62jd
--- FAIL: TestRestAllocator (2.77s)
    allocator_test.go:115: updating secrets failed: Operation cannot be fulfilled on secrets "allocator-tls": the object has been modified; please apply your changes to the latest version and try again

Oh same thing - seems like the new client-go is causing this. I can fix it though.

@markmandel
Collaborator Author

Oh I know what it is, I made the tests parallel.

@markmandel markmandel changed the title Update client-go to support Kubernets 1.18.0 Update client-go to support Kubernetes 1.18.0 Feb 18, 2021
@agones-bot
Collaborator

Build Failed 😱

Build Id: cddc8827-8c6a-40e4-8dc8-fdfdaf7327fd

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from dad0158 to 4307f90 Compare February 18, 2021 21:36
@agones-bot
Collaborator

Build Failed 😱

Build Id: 555807a0-9693-40c9-8090-69dba1070cc7

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

--- FAIL: TestGameServerReadyAllocateReady (157.67s)
    gameserver_test.go:507: 
        	Error Trace:	gameserver_test.go:507
        	Error:      	Received unexpected error:
        	            	timed out waiting for the condition
        	            	waiting for GameServer to be Ready 1613684842/game-servercfkcj
        	            	agones.dev/agones/test/e2e/framework.(*Framework).WaitForGameServerStateWithLogger
        	            		/go/src/agones.dev/agones/test/e2e/framework/framework.go:260
        	            	agones.dev/agones/test/e2e/framework.(*Framework).WaitForGameServerState
        	            		/go/src/agones.dev/agones/test/e2e/framework/framework.go:267
        	            	agones.dev/agones/test/e2e.TestGameServerReadyAllocateReady
        	            		/go/src/agones.dev/agones/test/e2e/gameserver_test.go:506
        	            	testing.tRunner
        	            		/usr/local/go/src/testing/testing.go:1050
        	            	runtime.goexit
        	            		/usr/local/go/src/runtime/asm_amd64.s:1373
        	Test:       	TestGameServerReadyAllocateReady

🤔

@agones-bot
Collaborator

Build Failed 😱

Build Id: 59b0c456-9534-4b5a-bde4-4a9d04114c6d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

--- FAIL: TestGameServerReadyAllocateReady (153.95s)
    gameserver_test.go:507: 
        	Error Trace:	gameserver_test.go:507
        	Error:      	Received unexpected error:
        	            	timed out waiting for the condition
        	            	waiting for GameServer to be Ready 1613687213/game-serverbqgjz
        	            	agones.dev/agones/test/e2e/framework.(*Framework).WaitForGameServerStateWithLogger
        	            		/go/src/agones.dev/agones/test/e2e/framework/framework.go:260
        	            	agones.dev/agones/test/e2e/framework.(*Framework).WaitForGameServerState
        	            		/go/src/agones.dev/agones/test/e2e/framework/framework.go:267
        	            	agones.dev/agones/test/e2e.TestGameServerReadyAllocateReady
        	            		/go/src/agones.dev/agones/test/e2e/gameserver_test.go:506
        	            	testing.tRunner
        	            		/usr/local/go/src/testing/testing.go:1050
        	            	runtime.goexit
        	            		/usr/local/go/src/runtime/asm_amd64.s:1373
        	Test:       	TestGameServerReadyAllocateReady

Hrmn. Seems consistent in e2e tests.

@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from 4307f90 to 25da474 Compare February 18, 2021 23:05
@agones-bot
Collaborator

Build Failed 😱

Build Id: 4d0bb1c0-b516-45ad-a7fe-b0d5f12cab83

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: f1712374-6324-4924-86f4-73e849e0312e

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

Running into every flake and issue ever apparently today.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 8b993311-89ab-4cbe-a378-e3646554bf44

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

Looks like SDK conformance tests are failing reasonably(?) consistently. Will attempt to replicate locally.

@markmandel
Collaborator Author

Ah. Small problem. If you pass --timeout to the local SDK, it cancels immediately. 🤦‍♂️

@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from 25da474 to 5138e05 Compare February 19, 2021 02:48
@agones-bot
Collaborator

Build Failed 😱

Build Id: 01d9f28b-d297-44f9-8aa5-7bf7b09abca9

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from 5138e05 to b8b5d29 Compare February 19, 2021 03:04
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: cd48ce89-7fc5-48f9-9e54-caa7b95e00d0

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/1998/head:pr_1998 && git checkout pr_1998
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.13.0-b8b5d29

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 4a92bf04-2551-4251-ba21-19deb83db330

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/1998/head:pr_1998 && git checkout pr_1998
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.13.0-49cf193

@markmandel
Collaborator Author

I just realised I didn't add a note - but this is good to review now. I worked out the failures.

Contributor

@pooneh-m pooneh-m left a comment

Well done on making this humongous change. 8)

if err := allocator.Start(stop); err != nil {
kubeInformerFactory.Start(ctx.Done())
agonesInformerFactory.Start(ctx.Done())
if err := allocator.Start(ctx); err != nil {
Contributor

ctx.Done?

Collaborator Author

Because the allocator has to go all the way down to client-go to allocate a GameServer, we have to pass the ctx all the way through.

And it also means the API for all systems is the same all the way through.

It's also possible that allocation.Start(...) should be changed to allocation.Run(...) to match all the other modules in Agones, hence the weirdness in the API surface, so I totally get the comment.

time.Sleep(time.Duration(ctlConf.Timeout) * time.Second)
close(timedStop)
}()
ctx, cancel = context.WithTimeout(ctx, time.Duration(ctlConf.Timeout)*time.Second)
Contributor

Nice.

Collaborator Author

I liked this one too 😄

@google-oss-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: markmandel, pooneh-m

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [markmandel,pooneh-m]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

client-go and related libraries moved to version v0.18.15

This does not include a regeneration of Agones CRD clients.
This includes the breaking change that client-go now
requires a context.Context as well as an Option struct
on all API requests.
e2e tests have now been refactored to match the new client-go
API surface.
This commit does two things:
1. Switch from using a `stop` channel to a context.Context throughout
the codebase.
2. Use this new Context to accommodate the new client-go API surface and
generated clients.

The vast majority of this commit is implementing the
new API surface, but areas to highlight for functionality changes:
- /pkg/util/ folder, especially the workerqueue implementation
- /cmd/sdk-server as the new context.Context simplified the timeout
  functionality.

Work on googleforgames#1971
- Allocation tests need to be serial, so they can
  refresh the client/server certificate.
- Just in case, implement retry on creating the secret.
- Fixes for flaky TestGameServerReadyAllocateReady
- Fix bug in sdkserver local timeout.
@markmandel markmandel force-pushed the upgrade/client-go-1.18.x branch from 49cf193 to 6f2aa50 Compare February 25, 2021 04:24
@google-oss-robot

New changes are detected. LGTM label has been removed.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 3d6ae0c5-299a-4eec-8cc6-9b9b8075955d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

--- FAIL: TestGameServerAllocationMetaDataPatch (93.83s)
    gameserverallocation_test.go:288: 
        	Error Trace:	gameserverallocation_test.go:288
        	Error:      	timed out waiting for the condition
        	Test:       	TestGameServerAllocationMetaDataPatch

That's a new flake.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 086e95ba-0fb6-4c65-8c3b-ed6df40e2400

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

That's a new flake too

--- FAIL: TestGameServerAllocationPreferredSelection (181.52s)
    gameserverallocation_test.go:386: 
        	Error Trace:	gameserverallocation_test.go:386
        	Error:      	Not equal: 
        	            	expected: "preferred-g6pt7"
        	            	actual  : "required-wxfkv"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-preferred-g6pt7
        	            	+required-wxfkv
        	Test:       	TestGameServerAllocationPreferredSelection

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 5fd5a676-9a4a-4043-ae45-58ab6a597798

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/1998/head:pr_1998 && git checkout pr_1998
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.13.0-6f2aa50

@markmandel markmandel merged commit 59a74c4 into googleforgames:main Feb 25, 2021
@markmandel markmandel deleted the upgrade/client-go-1.18.x branch February 25, 2021 18:01
@markmandel markmandel added this to the 1.13.0 milestone Feb 25, 2021
@markmandel markmandel restored the upgrade/client-go-1.18.x branch February 26, 2021 00:17
@markmandel markmandel deleted the upgrade/client-go-1.18.x branch February 26, 2021 00:17
Labels
approved cla: yes kind/breaking Breaking change kind/cleanup Refactoring code, fixing up documentation, etc kind/feature New features for Agones size/XXL
4 participants