Add documentation for cdi populators #2776

ShellyKa13 · 2023-06-27T13:00:50Z

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:
The documentation already accounts for all the code that should get in for the CDI populators, i.e. the integration of the datavolume clone with populators (#2750), support for the Immediate Binding annotation (#2765), and support for VDDK and ImageIO sources (#2767).

Release note:

Add documentation for cdi populators

Signed-off-by: Shelly Kagan <skagan@redhat.com>

awels · 2023-06-27T13:03:45Z

doc/cdi-populators.md

+CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced.
+
+**Values of the new API?**
+* Native synchronization with Kubernetes - this is kubernetes way of populating PVCs. Once PVC is bound we know it is populated (So far PVC was bound the moment the datavolume created it and the population progress was monitored via the datavolume)


Before the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume

Before introducing populators, the PVC...

awels · 2023-06-27T13:05:34Z

doc/cdi-populators.md

+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.
+* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it.
+* Better compatibility with integration with [Kubevirt](https://github.com/kubevirt/kubevirt) and existing backup solutions, using VMs with PVCs using populators and VMs with datavolumetemplates.
+* Integration with [Kubevirt](https://github.com/kubevirt/kubevirt) with WFFC storage class is simpler not requiring [doppelganger pod](https://github.com/kubevirt/kubevirt/blob/main/docs/localstorage-disks.md#the-problem) for the start of the VM.


Maybe expand WFFC into WaitForFirstConsumer (WFFC) so it is clear what the acronym means.

awels · 2023-06-27T13:06:52Z

doc/cdi-populators.md

+Once the temporary PVC population is done, the PV will be rebound to the original PVC completing the population process.
+
+#### Upload
+For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc.


I would give a full example of upload as well, since you are already doing it for import and clone.

arnongilboa · 2023-06-27T13:21:51Z

doc/cdi-populators.md

+
+**Values of the new API?**
+* Native synchronization with Kubernetes - this is kubernetes way of populating PVCs. Once PVC is bound we know it is populated (So far PVC was bound the moment the datavolume created it and the population progress was monitored via the datavolume)
+* Use PVCs directly and get them populated without datavolumes mitigation.


maybe mitigation -> involvement?

I actually like the word I intended to use, which apparently is not mitigation but mediation

arnongilboa · 2023-06-27T13:24:44Z

doc/cdi-populators.md

+* Use PVCs directly and get them populated without datavolumes mitigation.
+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.
+* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it.
+* Better compatibility with integration with [Kubevirt](https://github.com/kubevirt/kubevirt) and existing backup solutions, using VMs with PVCs using populators and VMs with datavolumetemplates.


compatibility with -> and?

It's not exactly right, but I will rephrase.

alromeros

Thanks for opening this, looks good! Some comments.

alromeros · 2023-06-27T14:13:04Z

doc/cdi-populators.md

+For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc.
+
+#### Clone
+Same for clone very similar API:


This sounds odd; maybe "Example of a PVC that will be populated by clone-populator:"
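
For context, the clone pairing this thread refers to could look roughly like the sketch below. It assumes the `cdi.kubevirt.io/v1beta1` `VolumeCloneSource` kind with a `spec.source` that points at an existing PVC; all object names are hypothetical, and the merged `doc/cdi-populators.md` remains the authoritative example.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: cdi.kubevirt.io/v1beta1
kind: VolumeCloneSource
metadata:
  name: example-clone-source      # hypothetical CR name
spec:
  source:
    kind: PersistentVolumeClaim
    name: example-source-pvc      # hypothetical existing PVC to clone from
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-clone-pvc         # hypothetical target PVC
spec:
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeCloneSource
    name: example-clone-source    # must match the CR above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The source PVC is assumed to live in the same namespace as the target PVC, which matches the single-namespace limitation raised elsewhere in this review.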

alromeros · 2023-06-27T14:13:10Z

doc/cdi-populators.md

+Once the temporary PVC population is done, the PV will be rebound to the original PVC completing the population process.
+
+#### Upload
+For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc.


alromeros · 2023-06-27T14:38:19Z

doc/cdi-populators.md

+What are [populators](https://kubernetes.io/blog/2022/05/16/volume-populators-beta/)
+
+## Introduction
+CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced.


I don't know if this is incorrect, but my Spanish-speaking brain tells me to replace "populating" with a noun in "implementation of populating PVCs".

Maybe CDI volume populators are CDI's implementation for the population of PVCs by... sounds good too.

To me it sounds fine :) I prefer to leave it like this unless someone says it's wrong.

Or "CDI volume populators are CDI's implementation of the existing import, upload, and clone operations using the new dataSourceRef field of the PVC."

alromeros · 2023-06-27T14:41:23Z

doc/cdi-populators.md

+
+The integration of datavolumes and CDI populators is seamless. You can create the datavolumes the same way you always have.
+If the used storage class is CSI storage then the datavolume population will occur via the CDI populators with the end result as always has been of a populated PVC. You will be able to notice that the created PVC will stay pending until the population process completes.
+For more information of using datavolumes for population check the [datavolume doc](datavolumes.md)


Maybe add a strict list of requirements for populators?

After all of the PRs I listed in the description, there will be no requirements other than the CSI storage, which I mentioned.

alromeros · 2023-06-27T14:43:30Z

doc/cdi-populators.md

+## Introduction
+CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced.
+
+**Values of the new API?**


Why the "?"? I would rename this to something like "Motivation" or "Benefits of using populators".

:) I started with "why use the new API" and changed it. I'll remove the "?" and maybe change "Values" to "Benefits".

alromeros · 2023-06-27T14:45:28Z

doc/cdi-populators.md

+### Using populators with PVCs
+User can create a CR and PVCs specifying the CR in the `DataSourceRef` field and those will be handled by the matching populator controller.
+
+#### Import


I think we should be more specific with each example: we should add an example of a valid CR and a valid PVC for each populator.

I have it for every example except upload, which I'll add.
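
For reference, the CR-plus-PVC pairing under discussion can be sketched as follows for the import case. This is a minimal illustration, assuming the `cdi.kubevirt.io/v1beta1` `VolumeImportSource` kind with an HTTP source; the object names and the URL are placeholders, not values taken from the PR.

```yaml
# Sketch only: names and URL are placeholders.
apiVersion: cdi.kubevirt.io/v1beta1
kind: VolumeImportSource
metadata:
  name: example-import-source     # hypothetical CR name
spec:
  source:
    http:
      url: "https://example.com/disk.img"   # placeholder image URL
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-import-pvc        # hypothetical PVC name
spec:
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeImportSource
    name: example-import-source   # must match the CR above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Because the PVC only references the CR by name, several PVCs can point at the same `VolumeImportSource`, which is the "one population definition for multiple PVCs" benefit listed in the doc.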

aglitke · 2023-06-28T18:46:28Z

doc/cdi-populators.md

+What are [populators](https://kubernetes.io/blog/2022/05/16/volume-populators-beta/)
+
+## Introduction
+CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced.


Or "CDI volume populators are CDI's implementation of the existing import, upload, and clone operations using the new dataSourceRef field of the PVC."

aglitke · 2023-06-28T18:47:29Z

doc/cdi-populators.md

+
+**Benefits of the new API**
+* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume)
+* Use PVCs directly and get them populated without datavolumes mediation.


It is now possible to use PVCs directly and have them populated without the need for DataVolumes.

aglitke · 2023-06-28T18:49:18Z

doc/cdi-populators.md

+**Benefits of the new API**
+* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume)
+* Use PVCs directly and get them populated without datavolumes mediation.
+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.


It may make sense to mention that this is limited to a single namespace until https://kubernetes.io/blog/2023/01/02/cross-namespace-data-sources-alpha/ goes beta and is incorporated into CDI.

@mhenriks didn't you solve that for now by using the tokens, or did I misunderstand?

Cross namespace only works with datavolume integration

I don't think we have to mention this limitation, since the feature is currently alpha.

aglitke · 2023-06-28T18:50:00Z

doc/cdi-populators.md

+* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume)
+* Use PVCs directly and get them populated without datavolumes mediation.
+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.
+* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it.


backup and disaster recovery solutions

aglitke · 2023-06-28T18:51:32Z

doc/cdi-populators.md

+* Use PVCs directly and get them populated without datavolumes mediation.
+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.
+* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it.
+* Better compatibility with existing backup solutions and [Kubevirt](https://github.com/kubevirt/kubevirt) VMs with PVCs using populators and VMs with datavolumetemplates.


How is this different from the above point? I'm not clear what this is saying.

The difference is that the above point covers only PVCs and datavolumes used without VMs, while the second covers VMs using those PVCs and VMs using datavolumetemplates. Maybe I can just extend the previous point with a mention of the VMs.

aglitke · 2023-06-28T18:52:19Z

doc/cdi-populators.md

+* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC.
+* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it.
+* Better compatibility with existing backup solutions and [Kubevirt](https://github.com/kubevirt/kubevirt) VMs with PVCs using populators and VMs with datavolumetemplates.
+* Integration with [Kubevirt](https://github.com/kubevirt/kubevirt) with WaitForFirstConsumer(WFFC) storage class is simpler not requiring [doppelganger pod](https://github.com/kubevirt/kubevirt/blob/main/docs/localstorage-disks.md#the-problem) for the start of the VM.


is simpler not requiring -> is simpler and does not require a

aglitke · 2023-06-28T18:55:02Z

doc/cdi-populators.md

+    requests:
+      storage: 10Gi
+```
+After creating the VolumeUploadSource and PVC, you can start the upload to the pvc as describe in the [upload doc](upload.md).


Forgive my ignorance but does the user target PVC prime when uploading or the PVC they created?

No, the user can upload to the target PVC as always; we take care of the rest under the covers.
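
To illustrate the answer above, a minimal sketch of the upload pairing, assuming the `cdi.kubevirt.io/v1beta1` `VolumeUploadSource` kind with a `contentType` field; the object names are hypothetical. The upload is directed at the PVC defined here, not at any temporary PVC that CDI creates internally.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: cdi.kubevirt.io/v1beta1
kind: VolumeUploadSource
metadata:
  name: example-upload-source     # hypothetical CR name
spec:
  contentType: kubevirt           # assumed value for a VM disk image
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-upload-pvc        # the PVC the user uploads to
spec:
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeUploadSource
    name: example-upload-source   # must match the CR above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```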

aglitke · 2023-06-28T19:00:31Z

doc/cdi-populators.md

+### Using populators with DataVolumes
+
+The integration of datavolumes and CDI populators is seamless. You can create the datavolumes the same way you always have.
+If the used storage class is CSI storage then the datavolume population will occur via the CDI populators with the end result as always has been of a populated PVC. You will be able to notice that the created PVC will stay pending until the population process completes.


If the DataVolume targets a storage class that uses a CSI provisioner, CDI will automatically use the new populators method. The behavior will be the same as always but with the following key differences. <different Pending DV status, and PVC will not become bound until it is populated>.

alromeros · 2023-07-10T15:44:13Z

Looks good to me! That said I think we should wait for another lgtm, preferably from some US folk
/lgtm

mhenriks · 2023-07-10T20:07:06Z

/approve

Looks good. Now we need something for https://github.com/kubevirt/user-guide

kubevirt-bot · 2023-07-10T20:07:17Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [mhenriks]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mhenriks · 2023-07-10T20:07:30Z

/retest-required

ShellyKa13 · 2023-07-11T06:26:27Z

/retest

ShellyKa13 · 2023-07-19T08:09:44Z

/cherrypick release-v1.57

kubevirt-bot · 2023-07-19T08:19:16Z

@ShellyKa13: new pull request created: #2812

In response to this:

/cherrypick release-v1.57

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Add documentation for cdi populators

0f94657

Signed-off-by: Shelly Kagan <skagan@redhat.com>

kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jun 27, 2023

kubevirt-bot requested review from aglitke and maya-r June 27, 2023 13:00

kubevirt-bot added the size/M label Jun 27, 2023

awels reviewed Jun 27, 2023

arnongilboa reviewed Jun 27, 2023

alromeros reviewed Jun 27, 2023

kubevirt-bot added size/L and removed size/M labels Jun 28, 2023

aglitke reviewed Jun 28, 2023

populators doc updates after review

85e0d8a

Signed-off-by: Shelly Kagan <skagan@redhat.com>

ShellyKa13 force-pushed the populators-doc branch from 745e287 to 85e0d8a Compare June 29, 2023 12:59

kubevirt-bot assigned alromeros Jul 10, 2023

kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 10, 2023

kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 10, 2023

kubevirt-bot merged commit e23ab12 into kubevirt:main Jul 11, 2023
16 checks passed

kubevirt-bot mentioned this pull request Jul 19, 2023

[release-v1.57] Add documentation for cdi populators #2812

Merged
