optimize yurthub kubelet node cache operator #1461

Closed
wants to merge 1 commit

Conversation

JameKeal
Member

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:
/kind bug
/kind documentation
/kind enhancement
/kind good-first-issue
/kind feature
/kind question
/kind design
/sig ai
/sig iot
/sig network
/sig storage

/kind enhancement

What this PR does / why we need it:

In the current mode, yurthub reports errors when I delete a pod, and the Service can end up resolving to pods in other nodepools:

I0427 07:28:00.977257 1 storage.go:562] key(kubelet/nodes.v1.core/host-fbdb5f) storage is pending, just skip it
W0427 07:28:00.977286 1 handler.go:172] skip serviceTopologyFilterHandler, failed to get current node host-fbdb5f, err: specified key is under accessing

I added some logs to the serviceTopology filter; they show more details:

I0515 08:54:53.202876       1 filter.go:236] -----------, StreamResponseFilter: coredns watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&resourceVersion=106522&timeout=8m27s&timeoutSeconds=507&watch=true[servicetopology]
I0515 08:54:53.203097       1 filter.go:236] -----------, StreamResponseFilter: nginx-ingress-controller watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&labelSelector=OWNER%21%3DTILLER&resourceVersion=106522&timeout=7m45s&timeoutSeconds=465&watch=true[servicetopology]
I0515 08:54:53.371902       1 filter.go:236] -----------, StreamResponseFilter: coredns watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&resourceVersion=106522&timeout=8m27s&timeoutSeconds=507&watch=true[servicetopology]
I0515 08:54:53.374220       1 filter.go:236] -----------, StreamResponseFilter: nginx-ingress-controller watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&labelSelector=OWNER%21%3DTILLER&resourceVersion=106522&timeout=7m45s&timeoutSeconds=465&watch=true[servicetopology]
I0515 08:54:53.378348       1 filter.go:236] -----------, StreamResponseFilter: kube-proxy watch endpointslices: https://127.0.0.1:6443/apis/discovery.k8s.io/v1beta1/endpointslices?allowWatchBookmarks=true&labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=95879&timeout=7m15s&timeoutSeconds=435&watch=true[servicetopology]
I0515 08:54:53.381507       1 cache_manager.go:439] pod(kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp) is MODIFIED
I0515 08:54:53.382361       1 util.go:293] start proxying: post /api/v1/namespaces/edgex-system/events, in flight requests: 57
I0515 08:54:53.384234       1 util.go:293] start proxying: get /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp, in flight requests: 58
I0515 08:54:53.399293       1 util.go:252] kubelet get pods: /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp with status code 200, spent 14.95904ms
I0515 08:54:53.404357       1 util.go:252] kubelet create events: /api/v1/namespaces/edgex-system/events with status code 201, spent 21.91904ms
I0515 08:54:53.413732       1 cache_manager.go:439] pod(kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw) is ADDED
I0515 08:54:53.418958       1 util.go:293] start proxying: get /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw, in flight requests: 57
I0515 08:54:53.431835       1 util.go:252] kubelet get pods: /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw with status code 200, spent 12.79584ms
I0515 08:54:53.434748       1 util.go:293] start proxying: patch /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw/status, in flight requests: 57
I0515 08:54:53.464499       1 util.go:252] kubelet patch pods: /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw/status with status code 200, spent 29.65184ms
I0515 08:54:53.472085       1 cache_manager.go:439] pod(kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-7q5fw) is MODIFIED
I0515 08:54:53.522483       1 filter.go:236] -----------, StreamResponseFilter: nginx-ingress-controller watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&labelSelector=OWNER%21%3DTILLER&resourceVersion=106522&timeout=7m45s&timeoutSeconds=465&watch=true[servicetopology]
I0515 08:54:53.523581       1 filter.go:236] -----------, StreamResponseFilter: coredns watch endpoints: https://127.0.0.1:6443/api/v1/endpoints?allowWatchBookmarks=true&resourceVersion=106522&timeout=8m27s&timeoutSeconds=507&watch=true[servicetopology]
I0515 08:54:53.610251       1 util.go:293] start proxying: put /apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/host-63b4e4?timeout=10s, in flight requests: 57
I0515 08:54:53.610532       1 util.go:252] kubelet update leases: /apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/host-63b4e4?timeout=10s with status code 200, spent 180.384µs
I0515 08:54:54.860103       1 util.go:293] start proxying: get /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp, in flight requests: 57
I0515 08:54:54.869684       1 util.go:252] kubelet get pods: /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp with status code 200, spent 9.468416ms
I0515 08:54:54.872070       1 util.go:293] start proxying: patch /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp/status, in flight requests: 57
I0515 08:54:54.904470       1 util.go:252] kubelet patch pods: /api/v1/namespaces/edgex-system/pods/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp/status with status code 200, spent 32.2888ms
I0515 08:54:54.905318       1 storage.go:562] key(kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp) storage is pending, just skip it
I0515 08:54:54.905354       1 cache_manager.go:652] skip to cache watch event because key({rootKey:false path:kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp}) is under processing
I0515 08:54:54.910532       1 cache_manager.go:439] pod(kubelet/pods.v1.core/edgex-system/edgex-redis-nodepool-host-63b4e4-trgcz-79b946796f-d5wvp) is MODIFIED

Why we need to save the kubelet node object in cache:
In the ServiceTopology and NodePortIsolation filters, the handler needs to get nodepool information from the node object.
In regular mode, the Get function reads the node object from disk, which requires acquiring a mutex.
Under concurrent access this sometimes fails with the error 'specified key is under accessing'.
When that happens the filter is skipped and unfiltered data is returned, which is very bad for services spanning multiple nodepools.
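
To illustrate the idea (a minimal sketch only, not the actual diff of this PR; names such as nodeCache, Update and GetNode are hypothetical), the node object could be kept in memory so the filters read the nodepool information without touching the disk storage lock:

package nodecache

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
)

// nodeCache keeps the most recent kubelet node object in memory.
type nodeCache struct {
	mu   sync.RWMutex
	node *corev1.Node
}

// Update would be called wherever yurthub already persists the kubelet node,
// so the in-memory copy stays in step with the disk cache.
func (c *nodeCache) Update(n *corev1.Node) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.node = n.DeepCopy()
}

// GetNode is what the ServiceTopology/NodePortIsolation filters would call
// instead of reading the node from disk storage, so no disk mutex is involved.
func (c *nodeCache) GetNode() *corev1.Node {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.node
}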

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

/assign @rambohe-ch

Does this PR introduce a user-facing change?


other Note

@openyurt-bot
Collaborator

@JameKeal: GitHub didn't allow me to assign the following users: your_reviewer.

Note that only openyurtio members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openyurt-bot openyurt-bot requested review from qclc and YTGhost May 15, 2023 12:02
@openyurt-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: JameKeal
To complete the pull request process, please assign rambohe-ch
You can assign the PR to them by writing /assign @rambohe-ch in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openyurt-bot openyurt-bot added the size/L (100-499) label May 15, 2023
@codecov

codecov bot commented May 15, 2023

Codecov Report

Merging #1461 (4e237df) into master (76cee88) will increase coverage by 0.25%.
The diff coverage is 80.76%.

❗ Current head 4e237df differs from pull request most recent head 3deeb27. Consider uploading reports for the commit 3deeb27 to get more accurate results

@@            Coverage Diff             @@
##           master    #1461      +/-   ##
==========================================
+ Coverage   50.74%   50.99%   +0.25%     
==========================================
  Files         131      133       +2     
  Lines       15576    15818     +242     
==========================================
+ Hits         7904     8067     +163     
- Misses       6951     7008      +57     
- Partials      721      743      +22     
Flag Coverage Δ
unittests 50.99% <80.76%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/yurthub/filter/servicetopology/filter.go 61.53% <0.00%> (+5.24%) ⬆️
pkg/yurthub/filter/nodeportisolation/filter.go 65.54% <33.33%> (+5.68%) ⬆️
cmd/yurthub/app/config/config.go 53.96% <100.00%> (+0.35%) ⬆️
pkg/yurthub/cachemanager/cache_manager.go 71.98% <100.00%> (+0.44%) ⬆️
pkg/yurthub/filter/manager/manager.go 69.35% <100.00%> (ø)

... and 37 files with indirect coverage changes

@rambohe-ch
Member

@Congrool PTAL

@Congrool
Member

It's a solution that works around the problem. But we already have an in-memory cache in cache-manager (which also caches kubelet nodes); adding another cache for the same resource seems a bit redundant. I think the issue is caused by the implementation of our disk storage, which should queue pending requests and handle them one by one instead of skipping them. So I may prefer a solution that improves the disk storage. What do you think? @JameKeal
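
A rough sketch of that direction (purely illustrative; the real disk storage layout may differ): instead of a try-lock that answers 'specified key is under accessing', the storage could block the caller on a per-key mutex until the in-flight operation finishes.

package storage

import "sync"

// keyLocker serializes access per storage key: callers wait for the
// in-flight operation instead of being told the key is under accessing.
type keyLocker struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyLocker() *keyLocker {
	return &keyLocker{locks: map[string]*sync.Mutex{}}
}

// lock blocks until the key is free and returns the matching unlock func.
// Usage: unlock := locker.lock("kubelet/nodes.v1.core/host-fbdb5f"); defer unlock()
func (l *keyLocker) lock(key string) (unlock func()) {
	l.mu.Lock()
	km, ok := l.locks[key]
	if !ok {
		km = &sync.Mutex{}
		l.locks[key] = km
	}
	l.mu.Unlock()
	km.Lock()
	return km.Unlock
}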

@JameKeal
Member Author

It's a solution that works around the problem. But we already have an in-memory cache in cache-manager (which also caches kubelet nodes); adding another cache for the same resource seems a bit redundant. I think the issue is caused by the implementation of our disk storage, which should queue pending requests and handle them one by one instead of skipping them. So I may prefer a solution that improves the disk storage. What do you think? @JameKeal

OK, if there is already a place that caches this info, it's better to reuse it. I will look at the cache_manager code and find a way to resolve it.

@rambohe-ch rambohe-ch added this to the controlplane-v1.4 milestone May 22, 2023
@sonarcloud

sonarcloud bot commented Jun 13, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (A rating)
Vulnerabilities: 0 (A rating)
Security Hotspots: 0 (A rating)
Code Smells: 0 (A rating)

No coverage information
Duplication: 0.0%

@JameKeal
Member Author

@rambohe-ch @Congrool PTAL

@@ -45,9 +45,9 @@ type WantsNodePoolName interface {
SetNodePoolName(nodePoolName string) error
}

// WantsStorageWrapper is an interface for setting StorageWrapper
type WantsStorageWrapper interface {
SetStorageWrapper(s cachemanager.StorageWrapper) error
Member


It may not be a good idea to change the interface, since most of its capability is designed for handling and caching requests. The original handling flow is:

 req -> yurthub proxy server -> filters -> cachemanager

If the filters also make use of cachemanager, it becomes:

 req -> yurthub proxy server -> filters -> cachemanager -> cachemanager

There may be unexpected problems in the yurthub cache.

Member Author


This interface is only used by filters that want to get kubelet node data from disk storage. We discussed using cachemanager in the filters, but there is currently no way for a filter to reach the cachemanager, and if we did use cachemanager, the storage wrapper would no longer be needed.

What do you think: should we keep this interface and create a new one?

@Congrool
Member

Hey, this problem seems to be caused by simultaneous access to kubelet/nodes.v1.core/nodename. When the yurthub filter wants to read it, kubelet is updating the node resource at the same time. Because the cache does not belong to the yurthub filter, the conflict can happen.

So separating the caches of the two components may help relieve the problem. For example, the yurthub filter could access yurthub/nodes.v1.core/nodename while kubelet accesses kubelet/nodes.v1.core/nodename. Or we could embed an in-memory cache just for the ServiceTopology filter.
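
A tiny sketch of the key-separation variant (nodeKey is a hypothetical helper; the key format simply mirrors the paths visible in the logs above):

package filterutil

import "fmt"

// nodeKey builds a component-scoped cache key in the form
// "<component>/nodes.v1.core/<nodename>".
func nodeKey(component, nodeName string) string {
	return fmt.Sprintf("%s/nodes.v1.core/%s", component, nodeName)
}

// The filter would read nodeKey("yurthub", nodeName) while kubelet traffic
// keeps writing nodeKey("kubelet", nodeName), so the two never contend.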

@JameKeal
Member Author

Hey, this problem seems to be caused by simultaneous access to kubelet/nodes.v1.core/nodename. When the yurthub filter wants to read it, kubelet is updating the node resource at the same time. Because the cache does not belong to the yurthub filter, the conflict can happen.

So separating the caches of the two components may help relieve the problem. For example, the yurthub filter could access yurthub/nodes.v1.core/nodename while kubelet accesses kubelet/nodes.v1.core/nodename. Or we could embed an in-memory cache just for the ServiceTopology filter.

I don't think so. There are several components on the edge node that watch endpoints, such as coredns, kube-proxy, and users' own components (like nginx-controller). If a pod restarts, the endpoints change very quickly and frequently, from A to empty to B. If we can't resolve the storage lock contention, this will happen again.

@JameKeal JameKeal requested a review from Congrool June 15, 2023 06:38
@rambohe-ch
Member

@JameKeal @Congrool I want to post a pull request to handle this bug. The solution is that yurthub will get the node object only once to obtain the nodepool name, because the nodepool of a node can only be changed by deleting the node and re-joining it to the cluster.
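
A minimal sketch of that idea (getNode is a hypothetical accessor and the nodepool label key is an assumption, not taken from this PR):

package nodepool

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
)

// nodePoolLabel is assumed here to carry the nodepool name; check the actual
// OpenYurt label key before relying on it.
const nodePoolLabel = "apps.openyurt.io/nodepool"

// poolNameResolver fetches the node object once and reuses the result, since
// the nodepool of a node only changes when the node re-joins the cluster.
type poolNameResolver struct {
	once     sync.Once
	getNode  func() (*corev1.Node, error) // e.g. a read from yurthub's cache
	poolName string
}

func (r *poolNameResolver) NodePoolName() string {
	r.once.Do(func() {
		if node, err := r.getNode(); err == nil && node != nil {
			r.poolName = node.Labels[nodePoolLabel]
		}
	})
	return r.poolName
}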

Labels: kind/enhancement, size/L (100-499)