
🌱 ClusterCacheTracker: use non-blocking per-cluster locking #7537

Conversation

@sbueringer (Member) commented Nov 11, 2022:

Co-authored-by: Florian Gutmann <fgutmann@amazon.com>

What this PR does / why we need it:
This PR builds on top of #6380. All credit to @fgutmann for the initial implementation.

As of today we have a global lock, which means that as soon as clusterAccessor creation for one cluster doesn't complete quickly, all workers of our reconcilers are blocked. This is most problematic for the Machine reconciler, as Machines usually have the highest cardinality. If the apiserver of a cluster is unreachable or slow, it can take up to 10 seconds until the global lock is released again. During this time nobody else can create or retrieve a client for any workload cluster.
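
For illustration, here is a minimal sketch of the global-lock pattern described above (simplified, with hypothetical helper names like `newClusterAccessor`; not the actual ClusterCacheTracker code):

```go
package remote

import (
	"context"
	"sync"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// clusterAccessor is a simplified stand-in for the real per-cluster state
// (client, cache, watches).
type clusterAccessor struct {
	client client.Client
}

// newClusterAccessor stands in for the expensive part: it dials the workload
// cluster's apiserver, which can take ~10s when the apiserver is unreachable.
func newClusterAccessor(ctx context.Context, cluster client.ObjectKey) (*clusterAccessor, error) {
	// ... build a rest config, client, and cache for the workload cluster ...
	return &clusterAccessor{}, nil
}

type tracker struct {
	lock             sync.Mutex // one global lock, shared by ALL clusters
	clusterAccessors map[client.ObjectKey]*clusterAccessor
}

func (t *tracker) GetClient(ctx context.Context, cluster client.ObjectKey) (client.Client, error) {
	// Every caller serializes here, including callers interested in a
	// completely different, perfectly healthy cluster.
	t.lock.Lock()
	defer t.lock.Unlock()

	a, ok := t.clusterAccessors[cluster]
	if !ok {
		var err error
		// Potentially ~10s while holding the global lock.
		a, err = newClusterAccessor(ctx, cluster)
		if err != nil {
			return nil, err
		}
		t.clusterAccessors[cluster] = a
	}
	return a.client, nil
}
```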

Let's assume the Machine reconciler runs with 10 concurrent workers, Cluster1 has 250 Machines and is not reachable, and Cluster2 has 10 Machines and is reachable.

This leads to the following situation for the Machine reconciler:

| Machine reconciler queue | Machine reconciler workers | Machine reconciler workers in GetClient |
| --- | --- | --- |
| 240 Machines (Cluster1)<br>10 Machines (Cluster2) | 9 Machines (Cluster1) | 1 Machine (Cluster1) |

The first step is to split the global lock into one lock per cluster, which leads to the following:

| Machine reconciler queue | Machine reconciler workers | Machine reconciler workers in GetClient |
| --- | --- | --- |
| 240 Machines (Cluster1)<br>9 Machines (Cluster2) | 8 Machines (Cluster1) | 1 Machine (Cluster1)<br>1 Machine (Cluster2) |

While this improves the situation, we now have the problem that 8 workers are actively waiting on the Cluster1 lock and thus blocked. This makes it very hard for the 10 Machines of Cluster2 to ever get reconciled. This of course gets a lot worse the more Clusters and Machines we have, and when multiple Clusters are unreachable.

This is why we now don't let workers actively wait: if a worker can't acquire the lock for a Cluster, an error is returned and the Machine is moved back into the queue with exponential backoff (see the locking sketch after the table below).

| Machine reconciler queue | Machine reconciler workers | Machine reconciler workers in GetClient |
| --- | --- | --- |
| 248 Machines (Cluster1)<br>9 Machines (Cluster2) | | 1 Machine (Cluster1)<br>1 Machine (Cluster2) |

This keeps the worker goroutines free and increases the chance that all Machines get reconciled.
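
Mechanically, the non-blocking per-cluster locking can be as simple as a set of locked keys guarded by a short-lived mutex. A simplified sketch of the idea (the actual implementation lives in controllers/remote/keyedmutex.go and may differ in detail):

```go
package remote

import (
	"sync"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// keyedMutex provides one lock per cluster. The inner mutex only guards the
// map and is held for a few instructions, never while an apiserver is dialed.
type keyedMutex struct {
	locksMtx sync.Mutex
	locks    map[client.ObjectKey]struct{}
}

func newKeyedMutex() *keyedMutex {
	return &keyedMutex{locks: map[client.ObjectKey]struct{}{}}
}

// TryLock acquires the lock for the given cluster and returns true if it is
// free. It returns false immediately, without blocking, if another worker
// already holds the lock.
func (k *keyedMutex) TryLock(cluster client.ObjectKey) bool {
	k.locksMtx.Lock()
	defer k.locksMtx.Unlock()

	if _, held := k.locks[cluster]; held {
		return false
	}
	k.locks[cluster] = struct{}{}
	return true
}

// Unlock releases the lock for the given cluster.
func (k *keyedMutex) Unlock(cluster client.ObjectKey) {
	k.locksMtx.Lock()
	defer k.locksMtx.Unlock()
	delete(k.locks, cluster)
}
```

A worker that gets `false` from `TryLock` returns an error instead of waiting; the controller's workqueue then requeues the Machine with exponential backoff rather than parking a goroutine on a contended lock.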

Of course there are limits, for example if far more clusters are unreachable than we have workers. But that case would require a larger refactoring, and with the new implementation plus exponential backoff we should be fine.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 11, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 11, 2022
@sbueringer sbueringer force-pushed the pr-cc-per-cluster-at-most-once-lock branch from 37a60e5 to 59de0a9 on November 11, 2022 18:03
@sbueringer (Member, Author):

/assign @fgutmann
/assign @fabriziopandini
/assign @vincepri

@k8s-ci-robot (Contributor):

@sbueringer: GitHub didn't allow me to assign the following users: fgutmann.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @fgutmann
/assign @fabriziopandini
/assign @vincepri

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fgutmann (Contributor) left a comment:

Thank you for picking this up, Stefan!

The approach overall looks good to me. One drawback of it is that there will be some unnecessary backoffs during normal operation. Generally, I'm not too worried about that, as I think it won't affect the overall system performance much.

One aspect I don't like about the current error handling, though, is that lock-contention errors are treated as regular reconcile errors and will show up in error metrics, making it hard to properly alarm on them. Shall we update the callers to go into backoff without returning an error for this specific case?
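
Something along these lines, sketched with a hypothetical sentinel error (`errClusterLocked`) and a fixed requeue interval, neither of which is defined by this PR:

```go
package remote

import (
	"context"
	"errors"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// errClusterLocked is a hypothetical sentinel error that GetClient could
// return when another worker already holds the per-cluster lock.
var errClusterLocked = errors.New("cluster accessor is locked by another worker")

// reconcileSketch shows the suggested handling: on lock contention, requeue
// with a nil error so the event does not count towards the controller's
// error metrics.
func reconcileSketch(ctx context.Context, getClient func(context.Context) error) (ctrl.Result, error) {
	if err := getClient(ctx); err != nil {
		if errors.Is(err, errClusterLocked) {
			// Not a real failure: the accessor is being created elsewhere.
			return ctrl.Result{RequeueAfter: 20 * time.Second}, nil
		}
		return ctrl.Result{}, err
	}
	// ... normal reconciliation using the workload cluster client ...
	return ctrl.Result{}, nil
}
```

The trade-off is that a fixed `RequeueAfter` gives up the exponential backoff the workqueue applies when an error is returned, so the callers would need to pick a sensible interval.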

(Nine inline review threads on controllers/remote/cluster_cache_tracker.go and controllers/remote/keyedmutex.go, since resolved.)
@sbueringer (Member, Author):
@fgutmann @vincepri Thank you very much for the reviews. PTAL :)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2022
@sbueringer sbueringer force-pushed the pr-cc-per-cluster-at-most-once-lock branch from 4036110 to 1318edc on November 14, 2022 15:12
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2022
@sbueringer sbueringer force-pushed the pr-cc-per-cluster-at-most-once-lock branch from 1318edc to 86ce450 on November 14, 2022 15:28
@fabriziopandini (Member):
PR lgtm to me, but I'll wait for @fgutmann's and @vincepri's last pass before applying any label.

@fgutmann (Contributor):
/lgtm

@k8s-ci-robot (Contributor):
@fgutmann: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vincepri (Member):
/approve

@k8s-ci-robot (Contributor):
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2022
@sbueringer sbueringer force-pushed the pr-cc-per-cluster-at-most-once-lock branch from 5fc0986 to 1348144 on November 14, 2022 17:44
@sbueringer (Member, Author):
Thx for the reviews!

Squashed!

@sbueringer (Member, Author):
I'll open a cherry-pick PR for release-1.2 once this PR is merged.

@vincepri (Member):
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit 75d0b22 into kubernetes-sigs:main Nov 14, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.3 milestone Nov 14, 2022
@sbueringer sbueringer deleted the pr-cc-per-cluster-at-most-once-lock branch November 14, 2022 18:17