GCE: Implement kube-env caching #6531

BigDarkClown · 2024-02-15T18:32:40Z

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR caches kube-env for instance templates. The change greatly reduces CPU usage for idle clusters, where ~80% of the main loop time is spent on repeatedly unmarshalling kube-env.

The improvement is drastic for idle clusters even with only 3 MIGs. In my testing the loop time decreased from 1.25s to 0.25s.
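
For readers who want a picture of the idea, here is a minimal sketch of caching a parsed kube-env per MIG and reusing it while the instance template name is unchanged. It is not the code from this PR: `KubeEnv`, `KubeEnvCache`, `parseKubeEnv`, and the `KEY: value` input format are hypothetical stand-ins chosen to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// KubeEnv is a hypothetical parsed form of a template's kube-env metadata.
type KubeEnv struct {
	TemplateName string
	Vars         map[string]string
}

// parseKubeEnv stands in for the expensive unmarshalling step. It only
// understands "KEY: value" lines; it is not the real kube-env parser.
func parseKubeEnv(templateName, raw string) KubeEnv {
	vars := make(map[string]string)
	for _, line := range strings.Split(raw, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return KubeEnv{TemplateName: templateName, Vars: vars}
}

// KubeEnvCache keeps one parsed kube-env per MIG (hypothetical type).
type KubeEnvCache struct {
	mu     sync.Mutex
	byMig  map[string]KubeEnv
	Parses int // number of cache misses, kept only for illustration
}

// NewKubeEnvCache returns an empty cache.
func NewKubeEnvCache() *KubeEnvCache {
	return &KubeEnvCache{byMig: make(map[string]KubeEnv)}
}

// Get returns the cached kube-env for migName if it was parsed from the same
// template; otherwise it parses raw and stores the result.
func (c *KubeEnvCache) Get(migName, templateName, raw string) KubeEnv {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cached, ok := c.byMig[migName]; ok && cached.TemplateName == templateName {
		return cached
	}
	c.Parses++
	parsed := parseKubeEnv(templateName, raw)
	c.byMig[migName] = parsed
	return parsed
}

func main() {
	cache := NewKubeEnvCache()
	raw := "KEY_A: 1\nKEY_B: 2"
	// Three loop iterations against the same template parse only once.
	for i := 0; i < 3; i++ {
		cache.Get("mig-1", "template-a", raw)
	}
	// A new template name for the same MIG forces a re-parse.
	cache.Get("mig-1", "template-b", raw)
	fmt.Println("parses:", cache.Parses) // prints "parses: 2"
}
```

Storing the template name alongside the parsed value is what lets a lookup detect that a MIG has moved to a different template and must not reuse the old entry.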

atwamahmoud

In general, LGTM.
However, I might be missing some things, so it'd be safer if someone else took a quick look.

k8s-ci-robot · 2024-02-16T12:57:08Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: atwamahmoud, BigDarkClown

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~cluster-autoscaler/cloudprovider/gce/OWNERS~~ [BigDarkClown]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

atwamahmoud · 2024-02-19T16:29:01Z

/lgtm

It would be better to omit the Get from function names, but that's out of scope for this PR, since we'd have to rename not just the added functions but also the older ones to remain consistent.

Edit: Ahh, I forgot I'm not in the OWNERS file; passing to @jayantjain93.

k8s-ci-robot · 2024-02-19T16:29:05Z

@atwamahmoud: changing LGTM is restricted to collaborators

In response to this:

/lgtm

It would be better to omit the Get from function names but it's out of scope for this PR since we'll have to rename not just added functions but older ones to remain consistent

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

x13n · 2024-02-20T08:41:54Z

cluster-autoscaler/cloudprovider/gce/cache.go

+}
+
+// InvalidateAllMigKubeEnvs clears the kube-env cache
+func (gc *GceCache) InvalidateAllMigKubeEnvs() {


Will we ever need either Invalidate? Since instance templates are immutable, we probably won't ever call this. What we might want instead, though, is some expiration mechanism, to avoid memory leaks in clusters with high MIG churn. This is a nice-to-have though; I don't expect a single CA instance to observe enough instance templates over its entire process lifetime to visibly inflate memory usage.

x13n · 2024-02-20T09:18:44Z

I left a comment, but that is something to follow up on later, not in this PR, so let's merge as is.

/lgtm

Btw, @atwamahmoud - you don't need to be an OWNER to /lgtm, you just need to be a Kubernetes org member.

k8s-ci-robot requested review from jayantjain93 and MaciekPytel February 15, 2024 18:32

k8s-ci-robot added area/cluster-autoscaler area/provider/gce approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 15, 2024

BigDarkClown force-pushed the kube-env branch from 7ea55df to de085c9 Compare February 15, 2024 18:53

BigDarkClown changed the title ~~WIP: Kube env~~ GCE: Implement kube-env caching Feb 15, 2024

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 15, 2024

BigDarkClown force-pushed the kube-env branch from de085c9 to 1990b4e Compare February 15, 2024 19:01

Add kube-env to MigInfoProvider

df02299

BigDarkClown force-pushed the kube-env branch from 1990b4e to e81c27b Compare February 15, 2024 19:04

Use KubeEnv in gce/templates.go

241936f

atwamahmoud approved these changes Feb 16, 2024


BigDarkClown changed the title ~~GCE: Implement kube-env caching~~ WIP: GCE: Implement kube-env caching Feb 16, 2024

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 16, 2024

BigDarkClown added 2 commits February 16, 2024 13:09

Add templateName to kube-env to ensure that correct value is cached

42aa9a1

Add unit-tests

760b2b5

BigDarkClown force-pushed the kube-env branch from e81c27b to 760b2b5 Compare February 16, 2024 14:20

BigDarkClown changed the title ~~WIP: GCE: Implement kube-env caching~~ GCE: Implement kube-env caching Feb 16, 2024

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 16, 2024

x13n reviewed Feb 20, 2024


k8s-ci-robot assigned x13n Feb 20, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 20, 2024

k8s-ci-robot merged commit 2c2ec59 into kubernetes:master Feb 20, 2024
6 checks passed

BigDarkClown mentioned this pull request Feb 27, 2024

Regional instance #6570

Merged
