Set pod affinity for host assisted clone #2647

Merged

Conversation

@ido106 (Contributor) commented Mar 19, 2023

Signed-off-by: Ido Aharon iaharon@redhat.com

What this PR does / why we need it:
Pods using the same PVC can be scheduled on different nodes. We expect them to be scheduled on the same K8S node, but they are not, and thus cannot be attached to the PersistentVolume. The solution is to set pod affinity (sketched below).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2576

Special notes for your reviewer:

Release note:

Set pod affinity for host assisted clone source pods, in order to schedule them on the same K8S node.
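To illustrate the approach, here is a minimal sketch, not the PR's exact code: the helper name, the label key, and the choice of a preferred (rather than required) affinity term are assumptions. The idea is to give each clone source pod affinity toward pods labeled for the same source PVC, keyed on the node hostname topology, so host-assisted clones of an RWO source land on one node.

```go
package clone

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// addSameNodeAffinity (hypothetical helper) asks the scheduler to place this
// clone source pod on the node that already runs pods carrying the
// (hypothetical) label sourceLabelKey=sourcePVCName, so every host-assisted
// clone of an RWO source PVC can mount the volume.
func addSameNodeAffinity(pod *corev1.Pod, sourceLabelKey, sourcePVCName string) {
	if pod.Spec.Affinity == nil {
		pod.Spec.Affinity = &corev1.Affinity{}
	}
	if pod.Spec.Affinity.PodAffinity == nil {
		pod.Spec.Affinity.PodAffinity = &corev1.PodAffinity{}
	}
	term := corev1.WeightedPodAffinityTerm{
		Weight: 100,
		PodAffinityTerm: corev1.PodAffinityTerm{
			LabelSelector: &metav1.LabelSelector{
				MatchLabels: map[string]string{sourceLabelKey: sourcePVCName},
			},
			// Co-locate on the same node; other topology keys would only
			// guarantee the same zone or region.
			TopologyKey: "kubernetes.io/hostname",
		},
	}
	pod.Spec.Affinity.PodAffinity.PreferredDuringSchedulingIgnoredDuringExecution = append(
		pod.Spec.Affinity.PodAffinity.PreferredDuringSchedulingIgnoredDuringExecution, term)
}
```

A preferred term is used in this sketch so that the first source pod can still schedule when no peer pod exists yet; whether the PR uses a preferred or required term is not shown in the snippets quoted below.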

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 19, 2023
@kubevirt-bot (Contributor)

Hi @ido106. Thanks for your PR.

I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@akalenyu (Collaborator)

/ok-to-test

@kubevirt-bot kubevirt-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 19, 2023
@@ -2911,6 +2915,8 @@ func doFileBasedCloneTest(f *framework.Framework, srcPVCDef *v1.PersistentVolume
targetDV.Labels["test-label-1"] = "test-label-key-1"
targetDV.Annotations["test-annotation-1"] = "test-annotation-key-1"

targetDV.Annotations[controller.AnnPodRetainAfterCompletion] = "true"
Collaborator

This already happens in the test at

targetDV.Annotations[controller.AnnPodRetainAfterCompletion] = "true"
Do we still want to keep it?

Contributor Author

I think it is enough. I will delete it and run the tests again.

@awels (Member) left a comment

Can we also verify in a functional test that when we clone multiple times from the same source, and the source is RWO, that all the source pods are on the same node? We already have a multiple clone test here

We just need to modify it to check if all the pods are on the same node IF the source volume is RWO (which is not always the case, nfs and ceph will be RWX)

@@ -650,6 +656,25 @@ func MakeCloneSourcePodSpec(sourceVolumeMode corev1.PersistentVolumeMode, image,
pod.Spec.Affinity.PodAffinity = &corev1.PodAffinity{}
}

if sourcePvc.Spec.AccessModes[0] == corev1.ReadWriteOnce {
Member

It is entirely possible to have multiple accessModes (our manually created NFS test instances do this), and there is no guarantee that the first element is the RWO access mode. So I would loop over the list and check whether any of them are RWO. But if there are both RWO and RWX, you don't need this, because the RWX part will take care of the problem.

Or check if the length of the list is 1 first, and if it is, check whether that one mode is RWO. Otherwise don't use the affinity.
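A small sketch of the second option the reviewer describes (the helper name is hypothetical): treat the source as node-bound only when ReadWriteOnce is the sole access mode on the PVC.

```go
package clone

import corev1 "k8s.io/api/core/v1"

// needsSameNodeAffinity returns true only when ReadWriteOnce is the sole access
// mode on the source PVC; any RWX entry means the volume can already be mounted
// from multiple nodes, so no pod affinity is needed.
func needsSameNodeAffinity(pvc *corev1.PersistentVolumeClaim) bool {
	modes := pvc.Spec.AccessModes
	return len(modes) == 1 && modes[0] == corev1.ReadWriteOnce
}
```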

pkg/controller/clone-controller.go (review thread resolved)
@awels (Member) commented Mar 21, 2023

Changes look good, but I think you might have missed my overall comment about verifying that when we do multiple clones at once (we have tests that already do this) we should ensure that the source pods are all on the same node. I put in a link to the tests.

@akalenyu (Collaborator)

> Can we also verify in a functional test that when we clone multiple times from the same source, and the source is RWO, that all the source pods are on the same node? We already have a multiple clone test here
>
> We just need to modify it to check if all the pods are on the same node IF the source volume is RWO (which is not always the case, nfs and ceph will be RWX)

So this test actually doesn't test ceph with host assisted (only smart clone); that is why I made #2575
(#2576 is only interesting in case the storage is not topology constrained).

I canceled the hold on that PR so we can decide whether we want it in.

@ido106 (Contributor, Author) commented Mar 26, 2023

> Changes look good, but I think you might have missed my overall comment about verifying that when we do multiple clones at once (we have tests that already do this) we should ensure that the source pods are all on the same node. I put in a link to the tests.

The problem is that this test would pass anyway, even without these changes. Maybe I can create 2 clones with 2 different source PVCs and check whether each one was placed on the correct node?
I added the test you asked for (locally), so I can push it, but maybe we should check it another way.

@awels (Member) commented Apr 12, 2023

/retest-required

@awels (Member) left a comment

Looks good, just one minor comment. I was having a hard time understanding what that part of the test was doing.

By(fmt.Sprintf("Getting pod %s/%s", dv.Namespace, cloneSourcePod))
pod, err := f.K8sClient.CoreV1().Pods(dv.Namespace).Get(context.TODO(), cloneSourcePod, metav1.GetOptions{})
Expect(err).ToNot(HaveOccurred())
podsNodeName[dv.Name] = pod.Spec.NodeName
Member

Would it make sense to have the map be map[string]bool, on this line just say podsNodeName[pod.Spec.NodeName] = true, and below check Expect(podsNodeName).To(HaveLen(1))? We are essentially creating a set of node names and ensuring there is only 1 node name, which demonstrates that all the pods were on the same node.

Contributor Author

Good point
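A sketch of the suggested pattern, using Gomega as in the existing functional tests (the helper name is hypothetical):

```go
package tests

import (
	. "github.com/onsi/gomega"
	corev1 "k8s.io/api/core/v1"
)

// verifySourcePodsOnSameNode collects each clone source pod's node name as a map
// key (a set) and asserts the set has exactly one entry, i.e. all the pods ran
// on the same node.
func verifySourcePodsOnSameNode(pods []corev1.Pod) {
	nodeNames := map[string]bool{}
	for _, pod := range pods {
		nodeNames[pod.Spec.NodeName] = true
	}
	Expect(nodeNames).To(HaveLen(1))
}
```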

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 13, 2023
Signed-off-by: Ido Aharon <iaharon@redhat.com>
@ido106 ido106 force-pushed the set_pod_affinity_for_host_assisted_clone branch from 8d0650e to 04fa501 on April 17, 2023 10:46
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 17, 2023
@ido106 ido106 force-pushed the set_pod_affinity_for_host_assisted_clone branch from 99c4c3c to 920f0d7 on April 17, 2023 15:01
}
})

It("[rfe_id:1277][test_id:1899][crit:High][vendor:cnv-qe@redhat.com][level:component] Should allow multiple cloning operations in parallel for block devices", func() {
FIt("[rfe_id:1277][test_id:1899][crit:High][vendor:cnv-qe@redhat.com][level:component] Should allow multiple cloning operations in parallel for block devices", func() {
Collaborator

Other than the FIts the PR looks good to me!

Contributor Author

whoops

By("Deleting verifier pod")
err = f.K8sClient.CoreV1().Pods(f.Namespace.Name).Delete(context.TODO(), utils.VerifierPodName, metav1.DeleteOptions{})
Expect(err).ToNot(HaveOccurred())
_, err = utils.WaitPodDeleted(f.K8sClient, utils.VerifierPodName, f.Namespace.Name, verifyPodDeletedTimeout)
Expect(err).ToNot(HaveOccurred())
}

// All pods should be on the same node, except when the map is empty (smart clone)
@akalenyu (Collaborator) commented Apr 17, 2023

You should be able to use cloneType in this context to tell whether this is going to be a smart clone or not.
For example, cloneType == "network" means host assisted.

Contributor Author

Ok, I changed the condition to if cloneType == "network"

@ido106 ido106 force-pushed the set_pod_affinity_for_host_assisted_clone branch 3 times, most recently from 4739f57 to 1ba621e, on April 17, 2023 15:25
@akalenyu (Collaborator)

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 17, 2023
@akalenyu (Collaborator)

/test pull-containerized-data-importer-e2e-ceph

Same issue as in #2550;
maybe this change makes it fail more often? In that case we might want to comment out Expect(restartCount).To(BeNumerically("<", 2)) with a TODO to remove it once the issue is fixed.

Signed-off-by: Ido Aharon <iaharon@redhat.com>
@ido106 ido106 force-pushed the set_pod_affinity_for_host_assisted_clone branch from 1ba621e to b79af1f on April 18, 2023 12:55
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2023
@akalenyu (Collaborator)

/retest

@akalenyu (Collaborator)

/test pull-containerized-data-importer-e2e-upg

/cc @awels

@kubevirt-bot kubevirt-bot requested a review from awels April 21, 2023 09:08
@akalenyu (Collaborator)

forgot to lgtm again
/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 27, 2023
@mhenriks (Member) commented May 1, 2023

/approve

@kubevirt-bot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2023
@awels (Member) commented May 1, 2023

/lgtm

@kubevirt-bot kubevirt-bot merged commit 163f77a into kubevirt:main May 1, 2023
Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
lgtm - Indicates that a PR is ready to be merged.
ok-to-test - Indicates a non-member PR verified by an org member that is safe to test.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
size/L

Successfully merging this pull request may close these issues.

Set pod affinity for host assisted clone source pods
5 participants