
Refactor NTO to use controller runtime lib #302

Closed
wants to merge 3 commits

Conversation

yanirq
Contributor

@yanirq yanirq commented Dec 15, 2021

Refactor the cluster node tuning operator to use the controller-runtime library (release 0.11).
The change is internal only and replaces the direct application of a controller with the controller-runtime scheme.

The new controller(s) structure:

  • Two controllers under one manager: the CVO controller and the Tuned controller, both using the controller-runtime library (a minimal sketch follows this list).
  • The metrics server is added as a runnable to the controller-runtime manager.
  • The Tuned daemon controller is left untouched.
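For illustration, a minimal sketch of this wiring; the reconciler type names (ClusterOperatorReconciler, TunedReconciler) and the serveMetrics helper are placeholders, not the actual code in this PR, and imports are ctrl "sigs.k8s.io/controller-runtime" and "sigs.k8s.io/controller-runtime/pkg/manager":

func runOperator() error {
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        // The operator serves its own metrics (added as a runnable below),
        // so the manager's built-in metrics endpoint is disabled.
        MetricsBindAddress: "0",
    })
    if err != nil {
        return err
    }

    // Two controllers registered with the same manager.
    if err := (&ClusterOperatorReconciler{Client: mgr.GetClient()}).SetupWithManager(mgr); err != nil {
        return err
    }
    if err := (&TunedReconciler{Client: mgr.GetClient()}).SetupWithManager(mgr); err != nil {
        return err
    }

    // The metrics server is added as a runnable; the manager starts and stops
    // it together with the controllers. serveMetrics is a hypothetical
    // func(context.Context) error.
    if err := mgr.Add(manager.RunnableFunc(serveMetrics)); err != nil {
        return err
    }

    return mgr.Start(ctrl.SetupSignalHandler())
}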

Current implementation choices that may change depending on the operator's performance (or reviews):

  • The Tuned controller does not keep internal maps for tracking node names and pods. It watches pod and node label changes, and in its reconcile logic it lists all nodes before syncing Tuned profiles (see the watch sketch after this list).
  • The pod-labels watch always runs, but it never acts on changes if no Tuned CR uses pod-type matching.
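A hedged sketch of how such watches could be wired with controller-runtime v0.11; the nodeToTuned/podToTuned mapping functions are hypothetical placeholders, and imports from sigs.k8s.io/controller-runtime/pkg/{builder,event,handler,predicate,source}, corev1 and reflect are assumed:

func (r *TunedReconciler) SetupWithManager(mgr ctrl.Manager) error {
    // Only react to update events that actually change labels.
    labelsChanged := predicate.Funcs{
        UpdateFunc: func(e event.UpdateEvent) bool {
            return !reflect.DeepEqual(e.ObjectOld.GetLabels(), e.ObjectNew.GetLabels())
        },
    }

    return ctrl.NewControllerManagedBy(mgr).
        For(&tunedv1.Tuned{}).
        // Node and pod label changes enqueue a reconcile; the reconcile then
        // lists all nodes and syncs the Tuned profiles.
        Watches(&source.Kind{Type: &corev1.Node{}},
            handler.EnqueueRequestsFromMapFunc(r.nodeToTuned), // hypothetical mapping func
            builder.WithPredicates(labelsChanged)).
        Watches(&source.Kind{Type: &corev1.Pod{}},
            handler.EnqueueRequestsFromMapFunc(r.podToTuned), // hypothetical mapping func
            builder.WithPredicates(labelsChanged)).
        Complete(r)
}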

Pending tasks checklist:

  • Ensure metric reporting works as expected (not only on e2e)
  • e2e and unit test coverage should be added (either in this PR or a follow-up)
  • Unused methods will be removed once the PR is in an acceptable review state.
  • Add profile deletion when a node is deleted
  • Test managed state
  • Test operator's performance vs previous implementation
  • Test all status reporting scenarios
  • Client-go leader election changes applied correctly

This is also preliminary work to set the stage for moving the Performance Addon Operator under NTO, as documented here: openshift/enhancements#867

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Dec 15, 2021
@openshift-ci
Contributor

openshift-ci bot commented Dec 15, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yanirq
To complete the pull request process, please assign jmencak after the PR has been reviewed.
You can assign the PR to them by writing /assign @jmencak in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yanirq yanirq force-pushed the refatcor_cr branch 2 times, most recently from 9c57e0c to ffe2c34 on December 20, 2021 at 11:05
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 20, 2021
@yanirq yanirq force-pushed the refatcor_cr branch 4 times, most recently from f8b874c to 98e56a8 on December 26, 2021 at 18:21
@yanirq
Contributor Author

yanirq commented Dec 27, 2021

/test e2e-aws-operator

@yanirq yanirq force-pushed the refatcor_cr branch 4 times, most recently from 980b80f to df6183a on December 28, 2021 at 09:54
@yanirq yanirq force-pushed the refatcor_cr branch 4 times, most recently from cd47e3c to f307fb9 on January 4, 2022 at 18:51
@yanirq
Contributor Author

yanirq commented Jan 5, 2022

/test e2e-aws-operator

@jmencak
Contributor

jmencak commented Jan 10, 2022

Congrats on the e2e-aws-operator tests passing, @yanirq. We will start a review this week.
/cc @dagrayvid

@yanirq
Contributor Author

yanirq commented Jan 10, 2022

Congrats on the e2e-aws-operator tests passing, @yanirq. We will start a review this week. /cc @dagrayvid

All the previous PRs can be closed.
There is still a checklist to be completed, but reviews should definitely start ASAP.
You will notice that there are some // TODO sections with alternatives to the current implementation.

/cc @cynepco3hahue

@yanirq yanirq changed the title WIP: Refactor NTO to use controller runtime lib Refactor NTO to use controller runtime lib Jan 10, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2022
@yanirq
Contributor Author

yanirq commented Jan 31, 2022

/retest

@yanirq
Contributor Author

yanirq commented Feb 2, 2022

/retest

2 similar comments
@yanirq
Contributor Author

yanirq commented Feb 3, 2022

/retest

@yanirq
Contributor Author

yanirq commented Feb 6, 2022

/retest

@dagrayvid
Contributor

@yanirq @jmencak,

I did a bit more testing, similar to what Jiri already shared. I tested 3 cases for the old code and the new code: 1) fully idle, 2) creating ~1000 pods in the background without any custom profile, and 3) creating ~1000 pods in the background with a Profile that matches those pods. Here are the results:

  1. Fully idle:
Old implementation:                 Controller-runtime rewrite:
2022-04-02 17:39:25: 101 18         2022-04-02 16:36:52: 103 20
2022-04-02 17:40:25: 103 19         2022-04-02 16:37:52: 106 22
2022-04-02 17:41:25: 105 20         2022-04-02 16:38:52: 110 24
2022-04-02 17:42:25: 105 20         2022-04-02 16:39:52: 114 25
delta:               4   2                               11  5
  2. Creating pods with no label matching used. Should be pretty idle:
Old implementation:                 Controller-runtime rewrite:
2022-04-02 17:46:33: 112 22         2022-04-02 16:48:01: 153 44
2022-04-02 17:47:33: 113 22         2022-04-02 16:49:01: 226 55
2022-04-02 17:48:33: 115 23         2022-04-02 16:50:01: 299 67
2022-04-02 17:49:33: 117 23         2022-04-02 16:51:01: 365 76
delta:               5   1                               212 32
  3. Creating pods with a profile that matches the pods (3 master / 3 worker cluster):
Old implementation:                 Controller-runtime rewrite:
2022-04-02 17:53:18: 169  33        2022-04-02 17:04:48: 104  21
2022-04-02 17:54:18: 539  56        2022-04-02 17:05:48: 1122 85
2022-04-02 17:55:18: 969  72        2022-04-02 17:06:48: 2959 189
2022-04-02 17:56:18: 1488 86        2022-04-02 17:07:48: 5740 299
delta:               1319 53                             5636 278

This confirms that the controller-runtime implementation is doing some work when pods are created even when the pod label matching functionality is unused. It also shows that when the pod label matching functionality is used, the new implementation is using many more CPU cycles than the old implementation.

@yanirq
Contributor Author

yanirq commented Feb 8, 2022

/retest

@openshift-ci
Contributor

openshift-ci bot commented Feb 8, 2022

@yanirq: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@yanirq
Contributor Author

yanirq commented Feb 9, 2022

/hold - #316 is being examined as an alternative to this PR

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 9, 2022
LeaderElection: true,
LeaderElectionID: config.OperatorLockName,
LeaderElectionNamespace: ntoNamespace,
LeaseDuration: &le.LeaseDuration.Duration,

@camilamacedo86 camilamacedo86 Feb 25, 2022


Why are you defining a lease duration? See:

// LeaseDuration is the duration that non-leader candidates will
// wait to force acquire leadership. This is measured against time of
// last observed ack. Default is 15 seconds.
LeaseDuration *time.Duration

Isn't the default good enough?

Contributor Author


This was to keep the original behavior from before the refactor.

restConfig := ctrl.GetConfigOrDie()
le := util.GetLeaderElectionConfig(restConfig, enableLeaderElection)
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
NewCache: cache.MultiNamespacedCacheBuilder(namespaces),

@camilamacedo86 camilamacedo86 Feb 25, 2022


Why would you like to use this option?
How many namespaces would you like to pass here?

Note that if you try to cache all namespaces on the cluster rather than just 1 or 2, you will probably see performance issues.
Please check the comment: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/cache/multi_namespace_cache.go#L40-L46

Contributor Author


The main reason here was to watch cluster-scoped resources such as nodes and pods, and to have a distinct namespace for NTO to use for filtering.

LeaseDuration: &le.LeaseDuration.Duration,
RetryPeriod: &le.RetryPeriod.Duration,
RenewDeadline: &le.RenewDeadline.Duration,
Namespace: ntoNamespace,


You are passing both a Namespace and NewCache: cache.MultiNamespacedCacheBuilder(namespaces).
Note that you need permissions (scope) to read/update/delete resources, and in this way cache them, in the namespace where the operator will be installed. Then, you can:

a) Watch/cache resources in a set of namespaces
It is possible to use MultiNamespacedCacheBuilder from Options to watch and manage resources in a set of Namespaces.

OR

b) Watch/cache resources in a single namespace (where the operator will be installed) by using the Namespace option. See:

// Namespace if specified restricts the manager's cache to watch objects in
// the desired namespace Defaults to all namespaces
//
// Note: If a namespace is specified, controllers can still Watch for a
// cluster-scoped resource (e.g Node). For namespaced resources the cache
// will only hold objects from the desired namespace.
Namespace string

Ref: https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.11.1/pkg/manager#Manager

OR

c) Do not set either, which means granting the project cluster-scoped permissions; your operator will then be watching/caching the whole cluster.

Be aware that the more resources/namespaces you watch and cache, the more resources you will consume.
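For reference, a minimal sketch of options (a) and (b); the helper names are illustrative, and imports are ctrl "sigs.k8s.io/controller-runtime", "sigs.k8s.io/controller-runtime/pkg/cache" and "sigs.k8s.io/controller-runtime/pkg/manager":

// (a) Cache objects from an explicit set of namespaces.
func newManagerMultiNamespace(namespaces []string) (manager.Manager, error) {
    return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        NewCache: cache.MultiNamespacedCacheBuilder(namespaces),
    })
}

// (b) Restrict the cache to a single namespace; cluster-scoped resources
// such as Nodes can still be watched.
func newManagerSingleNamespace(ns string) (manager.Manager, error) {
    return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        Namespace: ns,
    })
}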


@@ -34,10 +34,10 @@ rules:
- apiGroups: ["security.openshift.io"]

@camilamacedo86 camilamacedo86 Feb 25, 2022


By adopting SDK/Kubebuilder you will be able to work with markers, e.g.: https://github.com/operator-framework/operator-sdk/blob/master/testdata/go/v3/memcached-operator/controllers/memcached_controller.go#L44-L48

Then, when you run make generate, the RBAC will be generated in the config/rbac/ dir: https://github.com/operator-framework/operator-sdk/tree/master/testdata/go/v3/memcached-operator/config/rbac

Also, you can use make bundle and have the whole OLM bundle generated for you with all your kustomize configs, see: https://github.com/operator-framework/operator-sdk/tree/master/testdata/go/v3/memcached-operator/bundle

That is very helpful for working with releases and providing the solution for OLM.
You can also add customizations in your base CSV: operator-sdk/testdata/go/v3/memcached-operator/config/manifests/bases/

To learn more about the default layout see: https://sdk.operatorframework.io/docs/overview/project-layout/
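For reference, a hedged example of such markers placed above a Reconcile method; the group/resource names are only illustrative, not the PR's actual rules. controller-gen turns these into RBAC manifests under config/rbac/:

//+kubebuilder:rbac:groups=tuned.openshift.io,resources=tuneds;profiles,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",resources=nodes;pods,verbs=get;list;watch
//+kubebuilder:rbac:groups=security.openshift.io,resources=securitycontextconstraints,verbs=use,resourceNames=privileged
func (r *TunedReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // ...
    return ctrl.Result{}, nil
}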

@@ -1,13 +1,15 @@
package operator


What does mc.go mean?
What is its purpose?

return e.Object.GetName() == tunedv1.TunedClusterOperatorResourceName
},
UpdateFunc: func(e event.UpdateEvent) bool {
if !validateUpdateEvent(&e) {

@camilamacedo86 camilamacedo86 Feb 25, 2022


Are you doing this to address scenarios where the reconciliation fails because the resource changed on the cluster? If yes, I'd suggest using the client to fetch the resource that you want to change/update before calling the update. Then, if the update fails, return the error from the reconciliation to ensure it is executed again.
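A minimal sketch of that suggestion; the helper name and Reconciler fields are illustrative, with imports "context", ctrl "sigs.k8s.io/controller-runtime", "sigs.k8s.io/controller-runtime/pkg/client" and the NTO tunedv1 API package assumed:

func (r *TunedReconciler) updateTuned(ctx context.Context, key client.ObjectKey, mutate func(*tunedv1.Tuned)) error {
    // Fetch a fresh copy right before updating.
    current := &tunedv1.Tuned{}
    if err := r.Client.Get(ctx, key, current); err != nil {
        return err
    }
    mutate(current)
    // On a conflict (the object changed since the Get), returning this error
    // from Reconcile makes controller-runtime requeue and retry with a fresh copy.
    return r.Client.Update(ctx, current)
}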

@@ -10,22 +10,26 @@ import (
"k8s.io/klog"
)

func GetLeaderElectionConfig(ctx context.Context, restcfg *rest.Config) configv1.LeaderElection {
// GetLeaderElectionConfig returns leader election configs defaults based on the cluster topology
func GetLeaderElectionConfig(restcfg *rest.Config, enabled bool) configv1.LeaderElection {


Why do you need these utils for leader election?
By default, you have the controller-runtime implementation for leader election: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/leaderelection/leader_election.go

What is the extra requirement on top of that?

Also, you might take a look at the doc https://sdk.operatorframework.io/docs/building-operators/golang/advanced-topics/#leader-election which has comprehensive info about it.
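For comparison, a hedged sketch of relying on controller-runtime's built-in leader election with its defaults (LeaseDuration defaults to 15 seconds, per the doc comment quoted above); the extra utility in this PR exists to derive the timings from the cluster topology instead:

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
    LeaderElection:          true,
    LeaderElectionID:        config.OperatorLockName,
    LeaderElectionNamespace: ntoNamespace,
    // LeaseDuration, RenewDeadline and RetryPeriod are omitted here, so the
    // controller-runtime defaults apply.
})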

@@ -24,10 +24,12 @@ require (
k8s.io/klog v1.0.0
k8s.io/klog/v2 v2.30.0
k8s.io/utils v0.0.0-20210930125809-cb0fa318a74b
sigs.k8s.io/controller-runtime v0.11.0


Comment on lines +127 to +128
sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.11.0
sigs.k8s.io/controller-tools => sigs.k8s.io/controller-tools v0.7.0


Why do you need these replaces?
Why do you import sigs.k8s.io/controller-tools?

@yanirq
Contributor Author

yanirq commented Feb 27, 2022

@camilamacedo86 Thank you for the extensive review; it is super informative and helpful (also for a deeper understanding of controller-runtime "under the hood").
This PR will probably be closed soon since we switched to an alternative solution, but it will be kept in case we do choose to use this refactor in the future.
Some of the information is also insightful for the current implementation, so thanks again! I will answer the comments here that can be addressed.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 27, 2022
@openshift-ci
Contributor

openshift-ci bot commented Feb 27, 2022

@yanirq: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@camilamacedo86

Hi @yanirq,

#302 (comment)
Thank you for your reply. I hope that it can help out.

Feel free to reach out if you need anything.

@yanirq
Contributor Author

yanirq commented Mar 9, 2022

This PR will not be the approach we end up using.
The solution is to copy over the PAO code and add it to the manager as a runnable, instead of refactoring the whole of NTO to use controller-runtime.
The PR can be reopened or used as a reference if needed.

@yanirq yanirq closed this Mar 9, 2022