optimize yurthub kubelet node cache operator #1461
Conversation
Codecov Report

|          | master | #1461  | +/-    |
|----------|--------|--------|--------|
| Coverage | 50.74% | 50.99% | +0.25% |
| Files    | 131    | 133    | +2     |
| Lines    | 15576  | 15818  | +242   |
| Hits     | 7904   | 8067   | +163   |
| Misses   | 6951   | 7008   | +57    |
| Partials | 721    | 743    | +22    |
@Congrool PTAL
It's a solution that works around the problem. But we already have an in-memory cache (which also caches kubelet nodes) in cache-manager, so adding another cache for the same resource seems a bit redundant. I think the root cause is the implementation of our disk storage, which should pend requests and handle them one by one instead of skipping them. So I may prefer a solution that improves the disk storage. What do you think? @JameKeal
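To illustrate the "pend requests" idea, here is a minimal sketch in Go. The names `diskStorage`, `keyLocks`, and `readFile` are hypothetical, not the actual openyurt disk storage API: a reader blocks on a per-key lock until the writer is done, instead of failing fast with "specified key is under accessing".

```go
package storage

import "sync"

// diskStorage serializes access per storage key (hypothetical sketch).
type diskStorage struct {
	mu       sync.Mutex
	keyLocks map[string]*sync.Mutex // one lock per storage key
}

// lockFor lazily creates and returns the lock for a key.
func (s *diskStorage) lockFor(key string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.keyLocks == nil {
		s.keyLocks = map[string]*sync.Mutex{}
	}
	if s.keyLocks[key] == nil {
		s.keyLocks[key] = &sync.Mutex{}
	}
	return s.keyLocks[key]
}

// Get pends concurrent accesses to the same key rather than returning
// an error, so callers are queued instead of skipped.
func (s *diskStorage) Get(key string) ([]byte, error) {
	l := s.lockFor(key)
	l.Lock()
	defer l.Unlock()
	return readFile(key) // placeholder for the real on-disk read
}

func readFile(key string) ([]byte, error) {
	return []byte("cached object for " + key), nil
}
```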
OK, if there is already a place that caches this info, it's better to reuse it. I will look at the cache_manager code and find a way to resolve it.
Force-pushed from ee5c29d to 3deeb27.
Kudos, SonarCloud Quality Gate passed! 0 bugs, no coverage information.
@rambohe-ch @Congrool PTAL
```diff
@@ -45,9 +45,9 @@ type WantsNodePoolName interface {
 	SetNodePoolName(nodePoolName string) error
 }

 // WantsStorageWrapper is an interface for setting StorageWrapper
 type WantsStorageWrapper interface {
 	SetStorageWrapper(s cachemanager.StorageWrapper) error
```
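For context, a filter opts into this injection by implementing the setter. Below is a hedged sketch, assuming the cachemanager package lives under pkg/yurthub/cachemanager in the openyurt tree; `serviceTopologyFilter` here is a simplified stand-in, not the exact type from the PR.

```go
package filter

import "github.com/openyurtio/openyurt/pkg/yurthub/cachemanager"

// serviceTopologyFilter is a simplified stand-in for the real filter type.
type serviceTopologyFilter struct {
	storage cachemanager.StorageWrapper
}

// SetStorageWrapper makes the filter satisfy WantsStorageWrapper, so the
// initializer can inject the shared storage at startup.
func (f *serviceTopologyFilter) SetStorageWrapper(s cachemanager.StorageWrapper) error {
	f.storage = s
	return nil
}
```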
It may not be a good idea to change the interface, since most of its capability is designed for handling and caching requests. The original handling flow is:

req -> yurthub proxy server -> filters -> cachemanager

If filters also contain a cachemanager, it becomes:

req -> yurthub proxy server -> filters -> cachemanager -> cachemanager

There may be unexpected problems in the yurthub cache.
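To make the double-caching concern concrete, the sketch below models the flow as Go http middleware. `chain`, `buildHandler`, `withFilters`, and `withCache` are hypothetical names, not yurthub's actual wiring.

```go
package proxy

import "net/http"

// chain wraps handler h in the given middlewares, first one outermost.
func chain(h http.Handler, mws ...func(http.Handler) http.Handler) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// Original flow: req -> proxy server -> filters -> cachemanager.
// If a filter held its own cachemanager, responses would effectively pass
// through caching twice: filters -> cachemanager -> cachemanager.
func buildHandler(upstream http.Handler, withFilters, withCache func(http.Handler) http.Handler) http.Handler {
	return chain(upstream, withFilters, withCache)
}
```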
This interface is only used by filters that want to get the kubelet node data from disk storage. We discussed using the cachemanager in filters, but there is currently no way for a filter to reach the cachemanager, and if we did use the cachemanager, the storage interface would go unused. What do you think: should we keep this interface, or create a new one?
Hey, this problem seems to be caused by simultaneous access to the same storage. So separating the cache for these two components may help relieve the problem. For example, the yurthub filter would just access …
I don't think so. There are some components on the edge node that watch the endpoints, like coredns, kube-proxy, and other user components (like nginx-controller). If some pod restarts, the endpoints change very quickly and frequently, from A to empty to B; if we can't resolve the storage lock, it will happen again.
What type of PR is this?
/kind enhancement
What this PR does / why we need it:
In the current mode, yurthub returns errors when I delete a pod, and the service will then route to pods in other nodepools. I also added some logs to serviceTopology, which show the details.
Why we need to save the kubelet node object in cache: in the ServiceTopology and NodePortIsolation filters, the handler needs to get nodepool information from the node object. In regular mode, the Get function reads the node object from disk, which requires acquiring a mutex. Under concurrent access it sometimes fails with the error 'specified key is under accessing'. When that happens the filter is skipped and unfiltered data is returned, which is terrible for services that span multiple nodepools.
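The sketch below reproduces this failure mode under assumed names (`diskStore`, `filterEndpoints`, and the key "kubelet/nodes/node1" are all hypothetical): the fail-fast Get causes the filter to be skipped, so unfiltered endpoints leak across nodepools.

```go
package filter

import (
	"errors"
	"fmt"
	"sync"
)

var errKeyUnderAccessing = errors.New("specified key is under accessing")

type diskStore struct{ mu sync.Mutex }

// Get fails fast when another goroutine holds the lock, mirroring the
// behavior described above. (sync.Mutex.TryLock requires Go 1.18+.)
func (s *diskStore) Get(key string) ([]byte, error) {
	if !s.mu.TryLock() {
		return nil, errKeyUnderAccessing
	}
	defer s.mu.Unlock()
	return []byte("node object with nodepool label"), nil
}

// filterEndpoints shows the consequence: on a lock conflict the filter
// is skipped and the unfiltered endpoint list is returned as-is.
func filterEndpoints(s *diskStore, endpoints []string) []string {
	if _, err := s.Get("kubelet/nodes/node1"); err != nil {
		fmt.Println("serviceTopology: skip filter:", err)
		return endpoints // unfiltered data returned
	}
	// ... normally: keep only endpoints in this node's nodepool ...
	return endpoints
}
```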
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
/assign @rambohe-ch
Does this PR introduce a user-facing change?
other Note