
[jobframework] Use the job's UID in the workload's name hash. #1732

Merged
3 commits merged into kubernetes-sigs:main on Feb 23, 2024

Conversation

@trasc (Contributor) commented Feb 14, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

[jobframework] Use the job's UID in the workload's name hash. With this we can avoid name collisions between workloads created for jobs that have the same name and type but were created at different points in time.

Which issue(s) this PR fixes:

Fixes #1726

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The hash suffix of the workload's name is now influenced by the job's object UID. Recreated jobs with the same name and kind will use different workload names.
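
For illustration only, here is a hedged sketch of the naming idea (not the code in this PR; the helper name, the separator, and the 5-character suffix length are assumptions): deriving the hash segment from both the job name and the job UID means a recreated job with the same name but a new UID ends up with a different workload name.

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// workloadNameSketch is hypothetical: it appends a short hash of the job name
// and UID so that jobs recreated with the same name get distinct workload names.
func workloadNameSketch(prefix, jobName, jobUID string) string {
	sum := sha256.Sum256([]byte(jobName + "/" + jobUID))
	// Keep the name human-readable and end it with a short, opaque hash segment.
	return fmt.Sprintf("%s-%s-%s", prefix, jobName, hex.EncodeToString(sum[:])[:5])
}

func main() {
	// Same job name, different UIDs => different workload names.
	fmt.Println(workloadNameSketch("job", "training", "3c2f6a5e-0001"))
	fmt.Println(workloadNameSketch("job", "training", "9b8e1d42-0002"))
}

Running the sketch prints two different suffixes for the same job name, which is the collision-avoidance property this change relies on.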

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 14, 2024
netlify bot commented Feb 14, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Latest commit: 791e737
Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/65d6f74500977600080a03e9

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 14, 2024
@trasc (Contributor, Author) commented Feb 14, 2024

/cc @alculquicondor

@mimowo (Contributor) left a comment

Actually, I think it would be good to get to the bottom of 1726 before adding the UID to the hash. At least I still don't fully understand it, so maybe just explaining would be enough if you already know. I'm worried that with this change we risk covering up some issues, and introducing complications on upgrade.

Concrete questions:

  1. In the case where a job is recreated with a different pod spec, so that the workload is no longer equivalent, should we not trigger recreation of the workload?
  2. What is the impact during an upgrade? What will happen to user workloads that are pending, and to admitted ones?

@alculquicondor (Contributor)

I'm ok with this change, but before we merge it, I think we should make sure we are not depending on the predictability of the workload name.

@tenzen-y (Member)

What happens if admins do a cluster migration?
We often migrate an existing cluster to a new cluster to update the Kubernetes version by backing up and restoring etcd.

I guess that workloads are just recreated with a new name, since the Job UID is changed by the migration, right?
So, there is no impact, right?

@mimowo (Contributor) commented Feb 15, 2024

I still have concerns about this approach, summarized here: #1726 (comment).

EDIT: Having said that, I'm ok with this if we find it the "least evil", but I think it warrants a more descriptive release note describing the user-facing impact.

EDIT2: I checked that if I do an in-place upgrade the jobs are not actually stopped. In that case the concerns aren't that important, so I'm ok with this change too.

@trasc (Contributor, Author) commented Feb 15, 2024

@alculquicondor
Past workload creation, the workload name is irrelevant for Kueue; job reconciliation is done based on ownership (with the exception of prebuilt workloads, where there is no relationship between the workload and job names).

From a human user's perspective the workload names will continue to look the same, <job-prefix>-<job-name>-<hash-segment>, where the hash segment is an opaque string.
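
To illustrate the ownership-based matching mentioned above (a hedged sketch, not Kueue's reconciler code; findOwnedWorkload is hypothetical and the kueue import is assumed to be the v1beta1 API package), a workload can be associated with its job purely via the controller owner reference, regardless of the workload's name:

package sketch

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

// findOwnedWorkload is hypothetical: regardless of its name, a workload belongs
// to the job if the job is its controller owner.
func findOwnedWorkload(workloads []kueue.Workload, job metav1.Object) *kueue.Workload {
	for i := range workloads {
		if metav1.IsControlledBy(&workloads[i], job) {
			return &workloads[i]
		}
	}
	return nil
}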

@tenzen-y
For etcd backup/restore I'd expect the workloads to be part of that process as well; if not, they should indeed just be recreated. (Are the UIDs changed in migration?)

@tenzen-y (Member)

For etcd backup/restore I'd expect the workloads to be part of that process as well; if not, they should indeed just be recreated. (Are the UIDs changed in migration?)

If we directly back up and restore etcd, the UIDs wouldn't be changed, but if we use cluster backup tools like Velero, the UIDs will be changed. I'm ok with only recreating the workloads.

@alculquicondor (Contributor)

My worry is this line:

if pwName, err := jobframework.GetWorkloadNameForOwnerRef(owner); err != nil {

@trasc (Contributor, Author) commented Feb 15, 2024

GetWorkloadNameForOwnerRef

Yes, this could be problematic in the case of Kueue upgrades; it could keep the child job suspended if the workload for the parent was created before the upgrade.

We have a similar problem with prebuilt workloads, but only for JobSets at this point, and we do this

// If a prebuilt workload is used, propagate the name to its replicated job templates.
if wlName, hasPrebuilt := jobframework.PrebuiltWorkloadFor(jobSet); hasPrebuilt {
	for i := range jobSet.Spec.ReplicatedJobs {
		rjTemplate := &jobSet.Spec.ReplicatedJobs[i].Template
		if rjTemplate.Annotations == nil {
			rjTemplate.Annotations = make(map[string]string)
		}
		rjTemplate.Annotations[constants.ParentWorkloadAnnotation] = wlName
	}
}

to overcome it.

Maybe we could drop the usage of the parentWlAnnotation and cover the prebuilt scenario as well.

@alculquicondor (Contributor)

Maybe we could drop the usage of the parentWlAnnotation and cover the prebuilt scenario as well.

Can you expand on this?

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 16, 2024
@trasc (Contributor, Author) commented Feb 16, 2024

Maybe we could drop the usage of the parentWlAnnotation and cover the prebuilt scenario as well.

Can you expand on this?

We can use an ownership-based approach; the child job and the workload have the same parent. 4dc9ba0
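
As a hedged sketch of that idea (the helper below is hypothetical, not the code in 4dc9ba0): a child object's controller owner reference already carries the parent's name and UID, so the parent's workload name can be derived directly instead of propagating it through the parent-workload annotation.

package sketch

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// parentWorkloadNameSketch is hypothetical: it derives the parent's workload
// name from the child object's controller owner reference, which already
// carries the parent's name and UID, so no parent-workload annotation needs
// to be propagated to the child job templates.
func parentWorkloadNameSketch(child metav1.Object) (string, bool) {
	owner := metav1.GetControllerOf(child)
	if owner == nil {
		return "", false
	}
	// Same illustrative name+UID hashing as in the earlier sketch.
	sum := sha256.Sum256([]byte(owner.Name + "/" + string(owner.UID)))
	name := fmt.Sprintf("%s-%s-%s", strings.ToLower(owner.Kind), owner.Name, hex.EncodeToString(sum[:])[:5])
	return name, true
}

The appeal of this direction is that the owner reference is always present on the child, so no extra annotation bookkeeping is needed.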

@alculquicondor (Contributor)

I like that. Let's start with a PR just for that.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 16, 2024
@trasc (Contributor, Author) commented Feb 16, 2024

I like that. Let's start with a PR just for that.

Moved to #1747

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 20, 2024
With this we can avoid name collisions between workloads created for
jobs that have the same name and type but were created at different points in time.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 20, 2024
@mimowo (Contributor) left a comment

Code LGTM, just some nits. Also, a suggestion for potential follow up.

I'm on the fence regarding the release note saying "to avoid potential name collisions". While collisions are rare, at a rate of about 1 in 16 million they are still possible to observe. Maybe we can rephrase a little so it is not so definitive.

pkg/controller/jobs/pod/pod_controller_test.go (outdated, resolved)
pkg/controller/jobs/pod/pod_controller_test.go (outdated, resolved)
pkg/controller/jobframework/workload_names.go (resolved)
@mimowo (Contributor) commented Feb 20, 2024

Should we update the PR description to note that it fixes #1726? Or do we believe there is more to do to close it, like the GC? In that case we may just mention "Part of #1726".

@trasc (Contributor, Author) commented Feb 20, 2024

Should we update the PR description to note that it fixes #1726? Or do we believe there is more to do to close it, like the GC? In that case we may just mention "Part of #1726".

Let's go with Fix, since the issue cannot be reproduced after adding this.

@mimowo (Contributor) commented Feb 22, 2024

/lgtm
IIUC the remaining issue is that when a running Job is deleted with --cascade=orphan, the workload remains admitted and continues to consume quota, which we want to tackle with GC (either removal or marking as finished). If this is correct, could you please open a follow-up issue?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 22, 2024
@k8s-ci-robot (Contributor)

LGTM label has been added.

Git tree hash: fb96d8b5f91182d9714c4384522dfdba0f546a99

@mimowo (Contributor) commented Feb 22, 2024

For the release note, maybe something more user-oriented:
"The workload name is no longer the same for recreated jobs (with the same namespace and name),
because Job UID is used to determine the hash-like suffix in the workload name."

WDYT @tenzen-y

@tenzen-y (Member)

For the release note, maybe something more user-oriented: "The workload name is no longer the same for recreated jobs (with the same namespace and name), because Job UID is used to determine the hash-like suffix in the workload name."

WDYT @tenzen-y

@mimowo Generally, sgtm, but it would be better to emphasize that the created workloads' names lose consistency or obtain completely random names. WDYT?

@mimowo (Contributor) commented Feb 22, 2024

@mimowo Generally, sgtm, but it would be better to emphasize that the created workloads' names lose consistency or obtain completely random names. WDYT?

Yeah, it is tricky; they are "completely random" from the user's perspective, but technically they are predictable, so I wanted to have something which captures the situation from the user's perspective (different names on recreation) yet doesn't say "random". However, I'm ok with adding a sentence to clarify, such as "The workload names are virtually random now." (or something like that).

@trasc (Contributor, Author) commented Feb 22, 2024

"The workload names are virtually random now."

Cannot be added since they are not random.

@mimowo (Contributor) commented Feb 22, 2024

"The workload names are virtually random now."

Cannot be added since they are not random.

Well, I think saying "virtually" random, or "random from the user's perspective", is fine (after saying that it is hash-like in the previous sentence), but I don't have a strong opinion. Still, I think we should set the release note to something understandable to users. WDYT @tenzen-y?

@trasc (Contributor, Author) commented Feb 22, 2024

Release note updated

@@ -2164,7 +2164,7 @@ func TestReconciler(t *testing.T) {
 				Effect: corev1.TaintEffectNoSchedule,
 			}).Obj(),
 			workloads: []kueue.Workload{
-				*utiltesting.MakeWorkload(GetWorkloadNameForJob(baseJobWrapper.Name), "ns").
+				*utiltesting.MakeWorkload(GetWorkloadNameForJob(baseJobWrapper.Name, baseJobWrapper.GetUID()), "ns").
Contributor:

Can we have another unit test where we put a completely different name, but the reconciler still recognizes it as the workload for the job?

@trasc (Contributor, Author):

It will be a bit of a stretch and has nothing to do with the current PR.

Contributor:

It does have to do with it. It's a minimal safety guard to ensure that existing jobs (created before upgrading to the latest version) continue to run.

@trasc (Contributor, Author):

Why not add this when the owner-based workload selection was added?

Contributor:

because of bad foresight, but it wasn't that important then.

@trasc (Contributor, Author) commented Feb 22, 2024

For the vast majority of tests the workload has the name "a", not the name that is usually generated.

Contributor:

ok, thanks for checking. That should be enough.

@alculquicondor (Contributor)

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, trasc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 23, 2024
@tenzen-y (Member)

"The workload names are virtually random now."

Cannot be added since they are not random.

Well, I think saying "virtually" random, or "random from the user's perspective", is fine (after saying that it is hash-like in the previous sentence), but I don't have a strong opinion. Still, I think we should set the release note to something understandable to users. WDYT @tenzen-y?

Sorry for the late reply. It sounds good to me.

@k8s-ci-robot k8s-ci-robot merged commit 6244715 into kubernetes-sigs:main Feb 23, 2024
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.7 milestone Feb 23, 2024
@trasc trasc deleted the job-uid-wl-hash branch March 12, 2024 07:28
vsoch pushed a commit to researchapps/kueue that referenced this pull request Apr 18, 2024
…sigs#1732)

* [jobframework] Use the job's UID in the workload's name hash.

With this we can avoid name collisions between workloads created for
jobs that have the same name and type but were created at different points in time.

* Review Remarks

* Review Remarks
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Job gets admitted, but is never un-suspended.
5 participants