generated from kubernetes/kubernetes-template-project
-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[multikueue] add support for custom multikueue controller #2405
Draft
mszadkow
wants to merge
5
commits into
kubernetes-sigs:main
Choose a base branch
from
epam:feature/multikueue-support-for-custom-jobs
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from 4 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
6866b9b
Add ControllerName to the MultiClusterSpec
mszadkow dbad921
Add ControllerName as Multikueue setup option
mszadkow 5da30da
Update kueue cmd
mszadkow ee5a798
Add tests for distinct multikueues operating with different adapters
mszadkow c1e4244
pass controllerName to IsJobManagedByKueue
mszadkow File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11 changes: 10 additions & 1 deletion
11
client-go/applyconfiguration/kueue/v1alpha1/multikueueclusterspec.go
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,6 +21,7 @@ import ( | |
|
||
ctrl "sigs.k8s.io/controller-runtime" | ||
|
||
kueuealpha "sigs.k8s.io/kueue/apis/kueue/v1alpha1" | ||
"sigs.k8s.io/kueue/pkg/constants" | ||
"sigs.k8s.io/kueue/pkg/controller/jobframework" | ||
) | ||
|
@@ -32,10 +33,12 @@ const ( | |
) | ||
|
||
type SetupOptions struct { | ||
controllerName string | ||
gcInterval time.Duration | ||
origin string | ||
workerLostTimeout time.Duration | ||
eventsBatchPeriod time.Duration | ||
adapters map[string]jobframework.MultiKueueAdapter | ||
} | ||
|
||
type SetupOption func(o *SetupOptions) | ||
|
@@ -72,13 +75,41 @@ func WithEventsBatchPeriod(d time.Duration) SetupOption { | |
} | ||
} | ||
|
||
func SetupControllers(mgr ctrl.Manager, namespace string, opts ...SetupOption) error { | ||
options := &SetupOptions{ | ||
// WithControllerName - sets the controller name for which the multikueue | ||
// admission check match. | ||
func WithControllerName(controllerName string) SetupOption { | ||
return func(o *SetupOptions) { | ||
o.controllerName = controllerName | ||
} | ||
} | ||
|
||
// WithAdapters - sets all the MultiKueue adaptors. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. typo |
||
func WithAdapters(adapters map[string]jobframework.MultiKueueAdapter) SetupOption { | ||
return func(o *SetupOptions) { | ||
o.adapters = adapters | ||
} | ||
} | ||
|
||
// WithAdapters - sets or updates the adadpter of the MultiKueue adaptors. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fix comment |
||
func WithAdapter(adapter jobframework.MultiKueueAdapter) SetupOption { | ||
return func(o *SetupOptions) { | ||
o.adapters[adapter.GVK().String()] = adapter | ||
} | ||
} | ||
|
||
func NewSetupOptions() *SetupOptions { | ||
return &SetupOptions{ | ||
gcInterval: defaultGCInterval, | ||
origin: defaultOrigin, | ||
workerLostTimeout: defaultWorkerLostTimeout, | ||
eventsBatchPeriod: constants.UpdatesBatchPeriod, | ||
controllerName: kueuealpha.MultiKueueControllerName, | ||
adapters: make(map[string]jobframework.MultiKueueAdapter), | ||
} | ||
} | ||
|
||
func SetupControllers(mgr ctrl.Manager, namespace string, opts ...SetupOption) error { | ||
options := NewSetupOptions() | ||
|
||
for _, o := range opts { | ||
o(options) | ||
|
@@ -95,23 +126,18 @@ func SetupControllers(mgr ctrl.Manager, namespace string, opts ...SetupOption) e | |
return err | ||
} | ||
|
||
adapters, err := jobframework.GetMultiKueueAdapters() | ||
if err != nil { | ||
return err | ||
} | ||
|
||
cRec := newClustersReconciler(mgr.GetClient(), namespace, options.gcInterval, options.origin, fsWatcher, adapters) | ||
cRec := newClustersReconciler(mgr.GetClient(), namespace, *options, fsWatcher) | ||
err = cRec.setupWithManager(mgr) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
acRec := newACReconciler(mgr.GetClient(), helper) | ||
acRec := newACReconciler(mgr.GetClient(), helper, *options) | ||
err = acRec.setupWithManager(mgr) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
wlRec := newWlReconciler(mgr.GetClient(), helper, cRec, options.origin, options.workerLostTimeout, options.eventsBatchPeriod, adapters) | ||
wlRec := newWlReconciler(mgr.GetClient(), helper, cRec, *options) | ||
return wlRec.setupWithManager(mgr) | ||
} |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We deliberately didn't make it configurable if the first iteration to limit the potential of misconfiguring the system. Could you first open the issue to discuss the need for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tenzen-y is this change required in your understanding of #2349?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The status of a multikueue cluster follows the state of a contreller-runtime client in the kueue-controller-manager, if we want to be able to have multiple controller managers working with the clusters we need a way to mark who is managing each cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but I'm wondering if we could alternatively still use MultiKueue controller, but have a concept of external plugin?
To make it easier to track of an object is managed by MultiKueue or not. I feel this would help OnCall, but maybe I need to think about this more.
Since this required API changes should we not first go via KEP to discuss alternatives? WDYT @alculquicondor, @tenzen-y?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should only be one controller manager modifying these objects (Kueue).
Other controller managers should only be in charge of mirroring the Job CRDs, but otherwise they can use all the same objects.
I would like to see at least a description of what the plan is, how components should interact, what each of them is responsible of. Maybe a KEP is worth, but please start with some description before writing more code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will be a lot more complicated and will most likely require some additional API object type(s) to communicate between that custom controller to multikueue admission check controller.
The target of this PR is to be able to instantiate multikueue admission check controllers having different sets of adapters that can work in the same cluster at the same time.
MultiKueueClusters should not be that different then AdmissionChecks, which by design are managed by an AdmissionCheckController running in or outside of Kueue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current approach will lose the consistency between the kueue-manager and external controllers since it seems that the current approach allows us to re-implement the multikueue admission controller.
Can we consider an alternative approach to prevent the losing specification consistency?
As I described in the #2349, nice to start in the synchronizing design and the actual implementations.