-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Coordinator field to JobSet spec #618
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danielvegamyhre The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
✅ Deploy Preview for kubernetes-sigs-jobset canceled.
|
a9b40a6
to
95fc11b
Compare
@@ -44,6 +44,11 @@ const ( | |||
// JobSetControllerName is the reserved value for the managedBy field for the built-in | |||
// JobSet controller. | |||
JobSetControllerName = "jobset.sigs.k8s.io/jobset-controller" | |||
|
|||
// CoordinatorKey is used as an annotation and label on Jobs and Pods. If the JobSet spec | |||
// defines the .spec.coordinator field, this annotation/label will be added to store a stable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, I have a small question, why do we need to put this tag on both the annotation and the label? Is it not enough to just use the label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but inconsistency between labels vs annotations and Job labels/annotations vs Pod labels/annotations has led to bugs in the past, as well as confusion for users. So while the API is still in alpha I've just been adding everything as both a label and annotation to both Jobs and Pods, so there is full consistency across Jobs, pods, labels, and annotations. Then for graduation to v1 I plan on reviewing these and assigning them more thoughtfully, and documenting them in the JobSet docsite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks for your reply
I have a small question: Does this feature require the use of featureGate? |
i don't think we'll add a feature gate for this one since it is isn't a risky feature. It just sets a new annotation and label on jobs and pods if the field is set. |
cc @kannon92 would you ming taking a look at this if you have time? Abdullah is OOO currently. Happy to discuss the feature/context further if you'd like. |
@@ -746,6 +746,12 @@ func labelAndAnnotateObject(obj metav1.Object, js *jobset.JobSet, rjob *jobset.R | |||
annotations[jobset.JobIndexKey] = strconv.Itoa(jobIdx) | |||
annotations[jobset.JobKey] = jobHashKey(js.Namespace, jobName) | |||
|
|||
// Apply coordinator annotation/label if a coordinator is defined in the JobSet spec. | |||
if js.Spec.Coordinator != nil { | |||
labels[jobset.CoordinatorKey] = coordinatorEndpoint(js) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One subtle bug here would be that if someone specifies these labels we would overwrite.
Do we want a warning if someone uses these labels?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I have a follow up PR with webhook changes ready to go after this is merged. Will tag you on it shortly.
------------ | ------------- | ------------- | ------------- | ||
**job_index** | **int** | JobIndex is the index of Job which contains the coordinator pod (i.e., for a ReplicatedJob with N replicas, there are Job indexes 0 to N-1). Defaults to 0 if unset. | [optional] | ||
**pod_index** | **int** | PodIndex is the Job completion index of the coordinator pod. Defaults to 0 if unset. | [optional] | ||
**replicated_job** | **str** | ReplicatedJob is the name of the ReplicatedJob which contains the coordinator pod. | [default to ''] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we get "default to ''" from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, I think because the replicatedJob field does not have "omitempty" so it will default to empty string if unset. In the follow up PR with webhook changes I will validate it set to a valid replicated job name, though.
/hold /lgtm Code looks good. I have some minor nits. Feel free to unhold. |
18b12de
to
8a1b2ee
Compare
@kannon92 can you reapply LGTM when you have a sec please? I made changes to address the comments and it was removed |
/hold cancel |
Part of #617. I plan on adding validation logic for this new field in a follow up PR, as well as more tests.
jobset.sigs.k8s.io/coordinator
, which will store the stable network endpoint where the coordinator pod can be reached, when thejobset.spec.coordinator
field is defined.