add podgroup controller #401
#165 is huge, so I split it into smaller PRs: #401 is part 1 and #370 is part 3. @hzxuzhonghu, you can review #401 first.
Hey @wangyuqing4, TravisCI finished with status TravisBuddy Request Identifier: 395cfd80-b34d-11e9-a522-656c855f12dd
ok
switch obj.(type) {
case *v1.Pod:
    pod := obj.(*v1.Pod)
    if pod.Spec.SchedulerName == "volcano" &&
We should not hard-code the scheduler name here.
done
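One possible shape for that fix, as a minimal sketch: the scheduler name becomes controller configuration rather than a string literal. The `pod` and `controller` types and the `responsibleFor` method are hypothetical stand-ins for illustration, not the code merged in this PR.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for the v1.Pod fields used by the filter.
type pod struct {
	Name          string
	SchedulerName string
}

// controller holds the scheduler name as configuration (assumed to come
// from a flag or option) instead of comparing against a literal.
type controller struct {
	schedulerName string
}

// responsibleFor reports whether the pod requests the configured scheduler.
func (c *controller) responsibleFor(p *pod) bool {
	return p != nil && p.SchedulerName == c.schedulerName
}

func main() {
	c := &controller{schedulerName: "volcano"}
	fmt.Println(c.responsibleFor(&pod{Name: "a", SchedulerName: "volcano"}))
	fmt.Println(c.responsibleFor(&pod{Name: "b", SchedulerName: "default-scheduler"}))
}
```

With this layout a deployment that runs the scheduler under another name only has to change the value passed to the controller.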
        queue: workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter()),
    }

    cc.sharedInformers = informers.NewSharedInformerFactory(cc.kubeClients, 0)
One point: we should avoid creating duplicate informers in different controllers; it is very costly.
OK, a follow-up PR will optimize this part; the gc/queue/job/pg controllers will drop the duplicated code.
agreed, please file an issue to track
I think overall lgtm, but some nits
func (cc *Controller) Run(stopCh <-chan struct{}) {
    go cc.sharedInformers.Start(stopCh)
    go cc.podInformer.Informer().Run(stopCh)
    go cc.pgInformer.Informer().Run(stopCh)
We do not need to run the informers separately: sharedInformers.Start runs every informer created from the informer factory.
Same happens to other controllers
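To illustrate the point about Start, here is a toy model of a shared informer factory using only the standard library. It is not client-go code: `informer`, `factory`, `informerFor`, `runCount`, and `wait` are hypothetical names. The model only mirrors the behavior described above, where the factory remembers which informers it has started and launches each one once, without blocking the caller.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// informer is a toy stand-in for a shared informer; it only counts Run calls.
type informer struct {
	runs int32
}

// Run records the call and blocks until stopCh is closed.
func (i *informer) Run(stopCh <-chan struct{}) {
	atomic.AddInt32(&i.runs, 1)
	<-stopCh
}

func (i *informer) runCount() int32 { return atomic.LoadInt32(&i.runs) }

// factory is a toy model of a shared informer factory.
type factory struct {
	mu        sync.Mutex
	informers map[string]*informer
	started   map[string]bool
	wg        sync.WaitGroup
}

func newFactory() *factory {
	return &factory{informers: map[string]*informer{}, started: map[string]bool{}}
}

// informerFor returns the single shared informer for a kind, creating it once.
func (f *factory) informerFor(kind string) *informer {
	f.mu.Lock()
	defer f.mu.Unlock()
	if inf, ok := f.informers[kind]; ok {
		return inf
	}
	inf := &informer{}
	f.informers[kind] = inf
	return inf
}

// Start launches every registered informer that is not running yet.
// It returns immediately, so callers do not need to wrap it in a goroutine.
func (f *factory) Start(stopCh <-chan struct{}) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for kind, inf := range f.informers {
		if f.started[kind] {
			continue
		}
		f.started[kind] = true
		f.wg.Add(1)
		go func(inf *informer) {
			defer f.wg.Done()
			inf.Run(stopCh)
		}(inf)
	}
}

// wait blocks until all started informers have returned.
func (f *factory) wait() { f.wg.Wait() }

func main() {
	f := newFactory()
	pods := f.informerFor("pods")
	podGroups := f.informerFor("podgroups")

	stopCh := make(chan struct{})
	f.Start(stopCh)
	f.Start(stopCh) // a second call finds nothing left to start

	close(stopCh)
	f.wait()
	fmt.Println(pods.runCount(), podGroups.runCount()) // prints "1 1"
}
```

In this model, calling Run on `pods` directly in addition to Start would run the same informer twice, which is the redundancy flagged in the review.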
}

req := podRequest{
    pod: pod,
Actually, you just need the pod ns/name
func (cc *Controller) Run(stopCh <-chan struct{}) {
    go cc.sharedInformers.Start(stopCh)
    go cc.podInformer.Informer().Run(stopCh)
    go cc.pgInformer.Informer().Run(stopCh)
cc.sharedInformers.Start will run all its informers.
)

type podRequest struct {
    pod *v1.Pod
only need pod ns/name
We cannot use a pointer here: the same pod may be represented by different objects in the cache.
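A small sketch of why the namespace/name pair works better as the queue item. The `pod` type below is a hypothetical stand-in and `newPodRequest` is an illustrative helper. The sketch assumes the work queue deduplicates items by equality, which is why a map is used to model it here.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for the v1.Pod fields that matter here.
type pod struct {
	Namespace       string
	Name            string
	ResourceVersion string
}

// podRequest identifies a pod by key only, so two requests for the same
// pod compare equal regardless of which cached object produced them.
type podRequest struct {
	namespace string
	name      string
}

func newPodRequest(p *pod) podRequest {
	return podRequest{namespace: p.Namespace, name: p.Name}
}

func main() {
	// Two distinct objects for the same pod, as a cache may hand out
	// before and after an update.
	old := &pod{Namespace: "default", Name: "web-0", ResourceVersion: "1"}
	updated := &pod{Namespace: "default", Name: "web-0", ResourceVersion: "2"}

	fmt.Println(old == updated)                               // false: pointer identity differs
	fmt.Println(newPodRequest(old) == newPodRequest(updated)) // true: same namespace/name

	// A set keyed by podRequest collapses both events into one entry.
	pending := map[podRequest]struct{}{}
	pending[newPodRequest(old)] = struct{}{}
	pending[newPodRequest(updated)] = struct{}{}
	fmt.Println(len(pending)) // 1
}
```

The worker can then look the pod up by namespace and name when it processes the request, so it always sees the latest cached object.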
// Run start NewPodgroupController
func (cc *Controller) Run(stopCh <-chan struct{}) {
    go cc.sharedInformers.Start(stopCh)
    go cc.podInformer.Informer().Run(stopCh)
If the sharedInformers factory is started, it is not necessary to start the pod informer again.
same thing happens to other controllers
if pod.Annotations[scheduling.GroupNameAnnotationKey] == "" {
    pod.Annotations[scheduling.GroupNameAnnotationKey] = pgName
} else {
    return nil
Log an error message if pod.Annotations[scheduling.GroupNameAnnotationKey] != pgName.
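A sketch of how the three cases could be separated: annotation missing, annotation matching, and annotation naming a different group. The annotation key constant and the `ensureGroupAnnotation` helper are hypothetical; real code would use scheduling.GroupNameAnnotationKey and the controller's own logger. The nil check is an addition in this sketch, since writing to a nil map panics in Go.

```go
package main

import (
	"fmt"
	"log"
)

// groupNameAnnotationKey is a placeholder value for illustration only.
const groupNameAnnotationKey = "example.com/group-name"

// ensureGroupAnnotation sets the group annotation when it is missing.
// It reports whether the map changed and returns an error when the pod
// is already annotated with a different group name.
func ensureGroupAnnotation(annotations map[string]string, pgName string) (map[string]string, bool, error) {
	if annotations == nil {
		annotations = map[string]string{}
	}
	current := annotations[groupNameAnnotationKey]
	switch {
	case current == "":
		annotations[groupNameAnnotationKey] = pgName
		return annotations, true, nil
	case current != pgName:
		return annotations, false, fmt.Errorf("pod is annotated with group %q, expected %q", current, pgName)
	default:
		return annotations, false, nil
	}
}

func main() {
	annotations, changed, err := ensureGroupAnnotation(nil, "podgroup-1")
	fmt.Println(annotations[groupNameAnnotationKey], changed, err)

	// A mismatch is surfaced to the caller, which can log it and skip the update.
	if _, _, err := ensureGroupAnnotation(annotations, "podgroup-2"); err != nil {
		log.Printf("skipping pod: %v", err)
	}
}
```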
ObjectMeta: metav1.ObjectMeta{
    Namespace:       pod.Namespace,
    Name:            pgName,
    OwnerReferences: pod.OwnerReferences,
If the pod's ownerReferences is empty, set the PodGroup's owner references to this Pod for GC.
func generatePodgroupName(pod *v1.Pod) string {
    pgName := vkbatchv1.PodgroupNamePrefix
    if len(pod.OwnerReferences) != 0 {
        pgName += string(pod.OwnerReferences[0].UID)
We should find the controller owner reference (the one marked as controller) rather than take the first entry.
Hey @wangyuqing4, TravisCI finished with status TravisBuddy Request Identifier: e001cd50-b43d-11e9-a5f8-ad46f5ea24b0

func newPGOwnerReferences(pod *v1.Pod) []metav1.OwnerReference {
    if len(pod.OwnerReferences) != 0 {
        return pod.OwnerReferences
There may not be a controller owner reference there.
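The owner-reference comments can be sketched together as follows. The types are hypothetical stand-ins for v1.Pod and metav1.OwnerReference, and `controllerOf`, `podGroupOwner`, and `podGroupName` are illustrative helpers, not the functions in this PR. In real code the lookup is available as metav1.GetControllerOf in apimachinery. Falling back to the pod's own UID for the name, and the "podgroup-" prefix, are assumptions of this sketch.

```go
package main

import "fmt"

// ownerReference mirrors the metav1.OwnerReference fields used here.
type ownerReference struct {
	Kind       string
	Name       string
	UID        string
	Controller *bool
}

// pod is a hypothetical stand-in for v1.Pod.
type pod struct {
	Name            string
	UID             string
	OwnerReferences []ownerReference
}

// controllerOf returns the owner reference marked as controller, or nil
// when the pod has owners but none of them controls it.
func controllerOf(p *pod) *ownerReference {
	for i := range p.OwnerReferences {
		ref := &p.OwnerReferences[i]
		if ref.Controller != nil && *ref.Controller {
			return ref
		}
	}
	return nil
}

// podGroupOwner picks the owner for the generated PodGroup: the pod's
// controller if there is one, otherwise the pod itself so that garbage
// collection removes the PodGroup together with the pod.
func podGroupOwner(p *pod) ownerReference {
	if ref := controllerOf(p); ref != nil {
		return *ref
	}
	isController := true
	return ownerReference{Kind: "Pod", Name: p.Name, UID: p.UID, Controller: &isController}
}

// podGroupName derives the PodGroup name from the chosen owner's UID.
func podGroupName(prefix string, p *pod) string {
	return prefix + podGroupOwner(p).UID
}

func main() {
	yes := true
	managed := &pod{Name: "web-0", UID: "pod-uid", OwnerReferences: []ownerReference{
		{Kind: "ConfigMap", Name: "cfg", UID: "cm-uid"},
		{Kind: "ReplicaSet", Name: "web", UID: "rs-uid", Controller: &yes},
	}}
	bare := &pod{Name: "solo", UID: "solo-uid"}

	fmt.Println(podGroupName("podgroup-", managed)) // podgroup-rs-uid
	fmt.Println(podGroupName("podgroup-", bare))    // podgroup-solo-uid
}
```

Keying the name on the controller's UID means all pods created by the same controller map to one PodGroup, while `OwnerReferences[0]` could point at an unrelated owner.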
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: k82cn, wangyuqing4
Fixes #134
If a normal pod/job/... uses the volcano scheduler, the podgroup controller watches the pod and can create a podgroup for it.