Upgrade to Kubernetes 1.17 #1824

Closed
34 tasks done
roberthbailey opened this issue Sep 17, 2020 · 11 comments
Labels
kind/breaking Breaking change kind/feature New features for Agones
Milestone

Comments

@roberthbailey
Member

roberthbailey commented Sep 17, 2020

We aren't ready for this yet, but now that we are wrapping up the upgrade to 1.16 we have some steps we want to proactively capture for 1.17 so it's time to start writing them down.

List of items to do for upgrading to 1.17 (this was started from the list in the 1.16 issue and likely needs further updates):

  • Update e2e cluster to run against 1.17
    • Recreate cluster with new scripts (after updating the GKE terraform module)
    • Update kubectl in e2e-image/Dockerfile
  • Update prow cluster to use 1.17 (even though we aren't using it yet, we should keep it in sync)
    • Recreate cluster with new scripts
  • Update the dev tooling to create 1.17 clusters
    • GKE
    • Minikube
    • Kind
    • Update kubectl
  • Update terraform submodules
    • GKE
    • Azure
    • EKS
  • Update documentation for creating clusters to 1.17
    • Usage requirements
    • GKE
    • Minikube
    • EKS
    • AKS
  • Update links to k8s documentation
    • examples/fleet.yaml
    • examples/fleetautoscaler.yaml
    • examples/gameserver.yaml
    • site/content/en/docs/Reference/fleet.md
    • site/content/en/docs/Reference/fleetautoscaler.md
    • site/content/en/docs/Reference/gameserver.md
    • site/content/en/docs/Advanced/limiting-resources.md
    • site/assets/templates/crd-doc-config.json
  • Update client-go
  • Regenerate CRD Kubernetes client libraries
  • Move API references from beta to v1 (see the Kubernetes v1.16 Release Notes)
    • CRDs
    • Admission Webhooks
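
For the last checklist item, a minimal Go sketch of the `apiVersion` change involved. The `betaToV1` map and `migrateAPIVersion` helper are hypothetical names used for illustration and are not part of Agones. It assumes only that both CRDs and admission webhooks have v1 APIs as of Kubernetes 1.16. Changing the `apiVersion` string is just the first step: the v1 CRD API also has stricter schema requirements, so this is not a complete migration.

```go
package main

import "fmt"

// betaToV1 maps the beta API versions named in the checklist to their v1
// equivalents. The map itself is a hypothetical helper for illustration.
var betaToV1 = map[string]string{
	"apiextensions.k8s.io/v1beta1":         "apiextensions.k8s.io/v1",
	"admissionregistration.k8s.io/v1beta1": "admissionregistration.k8s.io/v1",
}

// migrateAPIVersion returns the v1 apiVersion for a known beta apiVersion.
// The boolean reports whether a replacement was found; unknown values are
// returned unchanged.
func migrateAPIVersion(apiVersion string) (string, bool) {
	if v1, ok := betaToV1[apiVersion]; ok {
		return v1, true
	}
	return apiVersion, false
}

func main() {
	for _, v := range []string{
		"apiextensions.k8s.io/v1beta1",
		"admissionregistration.k8s.io/v1beta1",
		"apps/v1",
	} {
		next, changed := migrateAPIVersion(v)
		fmt.Printf("%s -> %s (changed: %t)\n", v, next, changed)
	}
}
```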
@roberthbailey roberthbailey added the kind/feature New features for Agones label Sep 17, 2020
@markmandel
Member

We should target this at our 1.11.0 release - 1.17 is available across all three cloud providers.

@markmandel markmandel pinned this issue Oct 26, 2020
@markmandel
Member

Wondering if we want to add a long-running version of #1867 to the list, to ensure there are no memory leaks in client-go when we upgrade?

This is the second time (#1871) we've been bitten by a memory leak in client-go.

@roberthbailey
Member Author

#1799 is captured as a checkbox in the list above but also in a separate issue.

markmandel added a commit to markmandel/agones that referenced this issue Nov 20, 2020
Needed to update minikube to Kubernetes 1.17.x and I figured I would
also go through the minikube dev experience and update it.

This includes:

* Switch to default to the Docker driver, since everyone should have
  Docker installed.
* Removing the Windows hacks, because they were awful and I feel bad I
  even wrote them in the first place.
* Migrate tooling to use new minikube functionality
* Update minikube commands to conform to the current release.
* Updated the documentation

Work on googleforgames#1824
@markmandel
Member

Current status: Moving the CRDs to v1.

markmandel added a commit that referenced this issue Nov 30, 2020
Needed to update minikube to Kubernetes 1.17.x and I figured I would
also go through the minikube dev experience and update it.

This includes:

* Switch to default to the Docker driver, since everyone should have
  Docker installed.
* Removing the Windows hacks, because they were awful and I feel bad I
  even wrote them in the first place.
* Migrate tooling to use new minikube functionality
* Update minikube commands to conform to the current release.
* Updated the documentation

Work on #1824
@markmandel
Member

Going to tackle upgrading client-go. It likely needs to land after we get some more of the CRD v1 work in, but I expect we can manage the merge, so I figured I can get the work running in parallel and handle the merging.

markmandel added a commit to markmandel/agones that referenced this issue Dec 2, 2020
This PR updates client-go to 0.17.14 to support Kubernetes 1.17.x.

This also includes a run of `make gen-crd-client`, which produced no
change in the generated code.

Work on googleforgames#1824
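
The pairing of client-go 0.17.14 with Kubernetes 1.17.x in the commit above follows the client-go tagging convention, where `v0.x.y` corresponds to Kubernetes `v1.x.y`. A small Go sketch of that mapping, with `clientGoVersionFor` as a hypothetical helper name; it assumes the convention holds for the version passed in and does not check that such a release exists.

```go
package main

import (
	"fmt"
	"strings"
)

// clientGoVersionFor converts a Kubernetes version such as "1.17.14" into the
// client-go module version with the same minor and patch numbers, "v0.17.14".
// It only rewrites the string; it does not verify that the release exists.
func clientGoVersionFor(k8sVersion string) (string, error) {
	v := strings.TrimPrefix(k8sVersion, "v")
	if !strings.HasPrefix(v, "1.") || len(v) == len("1.") {
		return "", fmt.Errorf("unsupported Kubernetes version %q", k8sVersion)
	}
	return "v0." + strings.TrimPrefix(v, "1."), nil
}

func main() {
	version, err := clientGoVersionFor("1.17.14")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("k8s.io/client-go", version)
}
```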
markmandel added a commit to markmandel/agones that referenced this issue Dec 7, 2020
markmandel added a commit to markmandel/agones that referenced this issue Dec 7, 2020
markmandel added a commit to markmandel/agones that referenced this issue Dec 8, 2020
Implements two new shortcodes:
* {{% k8s-version %}} - which outputs the currently supported version
* {{% k8s-api href="#podtemplatespec-v1-core" %}} - which outputs the
api reference url to the supported k8s version.

These shortcodes use the `HUGO_ENV` environment variable to determine
whether to show the current K8s version or the next one, as it is only
set to "production" when the site is generated for the release version
of the agones.dev website.

Also added updates to the release checklist to manage this as well.

Ideally, this will remove lots of busy work of feature shortcoding a lot
of content as we churn through Kubernetes versions.

Long term, we may want to expand this to include separate tools for
current and next full semver versions, i.e. not just
1.16 ➡ 1.17, but something like 1.16.3 ➡ 1.17.14.

We cheat a little on this release, as patch release 13 exists for both
1.16 and 1.17 (1.16.13 and 1.17.13).

Work on googleforgames#1824
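
The selection logic described in this commit message can be sketched in Go. The real shortcodes are Hugo templates, so this only illustrates the described behaviour and is not the committed implementation. The `siteVersions` type, its fields, and the base URL of the Kubernetes API reference are assumptions. It also assumes "production" selects the current version and anything else selects the next one.

```go
package main

import "fmt"

// siteVersions holds the two Kubernetes minor versions the site can render.
// The type and its field names are hypothetical.
type siteVersions struct {
	Current string // version supported by the released Agones, e.g. "1.16"
	Next    string // version supported by the development branch, e.g. "1.17"
}

// k8sVersion mimics {{% k8s-version %}}: the current version is used only
// when HUGO_ENV is "production"; any other value selects the next version.
func (s siteVersions) k8sVersion(hugoEnv string) string {
	if hugoEnv == "production" {
		return s.Current
	}
	return s.Next
}

// k8sAPI mimics {{% k8s-api href="..." %}} by joining an assumed base URL,
// the selected version, and the anchor passed in href.
func (s siteVersions) k8sAPI(hugoEnv, href string) string {
	const base = "https://kubernetes.io/docs/reference/generated/kubernetes-api"
	return fmt.Sprintf("%s/v%s/%s", base, s.k8sVersion(hugoEnv), href)
}

func main() {
	site := siteVersions{Current: "1.16", Next: "1.17"}
	for _, env := range []string{"production", "development"} {
		fmt.Println(env, site.k8sVersion(env), site.k8sAPI(env, "#podtemplatespec-v1-core"))
	}
}
```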
roberthbailey pushed a commit that referenced this issue Dec 8, 2020
roberthbailey added a commit that referenced this issue Dec 8, 2020
This PR updates client-go to 0.17.14 to support Kubernetes 1.17.x.

This also includes a run of `make gen-crd-client`, which produced no
change in the generated code.

Work on #1824

Co-authored-by: Robert Bailey <robertbailey@google.com>
markmandel added a commit to markmandel/agones that referenced this issue Dec 8, 2020
markmandel added a commit to markmandel/agones that referenced this issue Dec 8, 2020
roberthbailey pushed a commit that referenced this issue Dec 8, 2020
markmandel added a commit that referenced this issue Dec 8, 2020
* Move supported site K8s version to shortcodes

Implements two new shortcodes:
* {{% k8s-version %}} - which outputs the currently supported version
* {{% k8s-api href="#podtemplatespec-v1-core" %}} - which outputs the
api reference url to the supported k8s version.

These shortcodes use the `HUGO_ENV` environment variable to determine
whether to show the current K8s version or the next one, as it is only
set to "production" when the site is generated for the release version
of the agones.dev website.

Also added updates to the release checklist to manage this as well.

Ideally, this will remove lots of busy work of feature shortcoding a lot
of content as we churn through Kubernetes versions.

Long term, we may want to expand this to include separate tools for
current and next full semver versions, i.e. not just
1.16 ➡ 1.17, but something like 1.16.3 ➡ 1.17.14.

We cheat a little on this release, as patch release 13 exists for both
1.16 and 1.17 (1.16.13 and 1.17.13).

Work on #1824

* Review updates.
markmandel added a commit to markmandel/agones that referenced this issue Dec 9, 2020
This PR removes the K8s API reference from the yaml files and redirects
users to look at the website reference, which has links to those API
docs.

This is to reduce toil as we increment supported Kubernetes version
numbers, as it is now far easier to edit the supported K8s version
throughout the entire website docs through config variables.

Work on googleforgames#1824
markmandel added a commit that referenced this issue Dec 9, 2020
This PR removes the K8s API reference from the yaml files and redirects
users to look at the website reference, which has links to those API
docs.

This is to reduce toil as we increment supported Kubernetes version
numbers, as it is now far easier to edit the supported K8s version
throughout the entire website docs through config variables.

Work on #1824

Co-authored-by: Robert Bailey <robertbailey@google.com>
markmandel added a commit to markmandel/agones that referenced this issue Dec 10, 2020
Updating json config to 1.17 and regenerate the CRD API documentation as
well.

Work on googleforgames#1824
markmandel added a commit to markmandel/agones that referenced this issue Dec 10, 2020
Last bit of upgrading to Kubernetes 1.17

Work on googleforgames#1824
markmandel added a commit that referenced this issue Dec 10, 2020
Updating json config to 1.17 and regenerate the CRD API documentation as
well.

Work on #1824
roberthbailey added a commit that referenced this issue Dec 10, 2020
Last bit of upgrading to Kubernetes 1.17

Work on #1824

Co-authored-by: Robert Bailey <robertbailey@google.com>
@markmandel
Member

Ran an overnight allocation load test.

TESTRUNSCOUNT=30 ./runAllocation.sh 40 100

I can see a minor increase in memory over time, but the heap graph doesn't show any egregious memory usage, so I expect this is normal memory usage that will plateau over time.
image

No CPU leaks.

Heap Graph:
image

Usually when we have a memory leak, it's extremely obvious because it takes up a majority of the memory space.

@markmandel
Member

All items have now been checked! Happy to close this issue if everyone else is!

@markmandel
Member

markmandel commented Dec 10, 2020

Running some additional allocations over the past few hours - and yeah, looks like it plateaus around the 200-300 MB mark (on alloc, light blue), so what we're seeing is just a gradual increase to that point. 😅

image
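
The "gradual increase, then plateau" reading of the graph can also be checked numerically. Below is a rough Go sketch, assuming memory samples have been exported as megabyte values; the `plateaued` helper, the window size, the tolerance, and the sample numbers are all made up for illustration and are not measurements from this test.

```go
package main

import "fmt"

// plateaued reports whether the last `window` samples all stay within
// `tolerance` (a fraction, e.g. 0.1 for 10%) of their own mean. A steadily
// growing series fails this check; a series that has levelled off passes.
func plateaued(samplesMB []float64, window int, tolerance float64) bool {
	if window <= 0 || len(samplesMB) < window {
		return false
	}
	tail := samplesMB[len(samplesMB)-window:]
	var sum float64
	for _, s := range tail {
		sum += s
	}
	mean := sum / float64(window)
	for _, s := range tail {
		diff := s - mean
		if diff < 0 {
			diff = -diff
		}
		if diff > tolerance*mean {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative values only, not data from the load test in this thread.
	levelling := []float64{100, 180, 240, 255, 260, 250, 258}
	growing := []float64{100, 150, 200, 250, 300, 350}
	fmt.Println("levelling off:", plateaued(levelling, 4, 0.1))
	fmt.Println("still growing:", plateaued(growing, 4, 0.1))
}
```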

@markmandel
Member

Poking at this some more because I wasn't 100% happy - and it seems like it's okay, as it keeps coming down to that baseline.

(This is the past 23 hours, which includes 2 load tests)

image

Nothing in the heap alloc graph seems to stand out:

image

I do see:

32MB 14.32%                | k8s.io/client-go/tools/cache.MetaNamespaceKeyFunc /go/src/agones.dev/agones/vendor/k8s.io/client-go/tools/cache/store.go:85

That was smaller before, but (a) that could be coincidence and (b) there's nothing in that function that actually stores anything; it's all transient string creation.

Maybe I'm just being overly worried. 🤷
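
To illustrate the point about transient string creation: `MetaNamespaceKeyFunc` produces keys of the form `<namespace>/<name>`, or just `<name>` when the object has no namespace. The Go sketch below is a simplified stand-in and not the client-go source. The real function takes an object and reads its metadata, while the hypothetical `namespaceKey` here takes the two strings directly.

```go
package main

import "fmt"

// namespaceKey is a simplified stand-in for the key format produced by
// client-go's cache.MetaNamespaceKeyFunc. It builds and returns a new
// string on each call and keeps no reference to it afterwards.
func namespaceKey(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

func main() {
	fmt.Println(namespaceKey("default", "simple-gameserver"))
	fmt.Println(namespaceKey("", "cluster-scoped-object"))
}
```

Since each call only allocates a short-lived string, a high allocation count attributed to a function like this is consistent with heavy cache and queue traffic under load, and does not by itself point to retained memory.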

@markmandel
Member

I'm going to say let's close this issue, since RC is tomorrow. If it looks like something has gone wrong during RC week, we can open a new issue. Sound good?

@markmandel
Member

Closing, since no objections!

@markmandel markmandel added this to the 1.11.0 milestone Dec 15, 2020
@markmandel markmandel added the kind/breaking Breaking change label Dec 15, 2020
@markmandel markmandel unpinned this issue Dec 15, 2020