-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add documentation for cdi populators #2776
Conversation
Signed-off-by: Shelly Kagan <skagan@redhat.com>
doc/cdi-populators.md
Outdated
CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced. | ||
|
||
**Values of the new API?** | ||
* Native synchronization with Kubernetes - this is kubernetes way of populating PVCs. Once PVC is bound we know it is populated (So far PVC was bound the moment the datavolume created it and the population progress was monitored via the datavolume) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before introducing populators, the PVC...
doc/cdi-populators.md
Outdated
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. | ||
* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it. | ||
* Better compatibility with integration with [Kubevirt](https://github.com/kubevirt/kubevirt) and existing backup solutions, using VMs with PVCs using populators and VMs with datavolumetemplates. | ||
* Integration with [Kubevirt](https://github.com/kubevirt/kubevirt) with WFFC storage class is simpler not requiring [doppelganger pod](https://github.com/kubevirt/kubevirt/blob/main/docs/localstorage-disks.md#the-problem) for the start of the VM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe expand WFFC
into WaitForFirstConsumer(WFFC) so it is clear what the acronym means
doc/cdi-populators.md
Outdated
Once the temporary PVC population is done, the PV will be rebound to the original PVC completing the population process. | ||
|
||
#### Upload | ||
For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would give a full example of upload as well, since you are already doing it for import and clone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
doc/cdi-populators.md
Outdated
|
||
**Values of the new API?** | ||
* Native synchronization with Kubernetes - this is kubernetes way of populating PVCs. Once PVC is bound we know it is populated (So far PVC was bound the moment the datavolume created it and the population progress was monitored via the datavolume) | ||
* Use PVCs directly and get them populated without datavolumes mitigation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe mitigation -> involvement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually like the word I intended to use, which apparently is not mitigation but mediation
doc/cdi-populators.md
Outdated
* Use PVCs directly and get them populated without datavolumes mitigation. | ||
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. | ||
* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it. | ||
* Better compatibility with integration with [Kubevirt](https://github.com/kubevirt/kubevirt) and existing backup solutions, using VMs with PVCs using populators and VMs with datavolumetemplates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compatibility with -> and?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not exactly right but I will rephrase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this, looks good! Some comments.
doc/cdi-populators.md
Outdated
For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc. | ||
|
||
#### Clone | ||
Same for clone very similar API: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sound odd, maybe Example of a PVC that will be populated by clone-populator:
doc/cdi-populators.md
Outdated
Once the temporary PVC population is done, the PV will be rebound to the original PVC completing the population process. | ||
|
||
#### Upload | ||
For upload need to follow the same guidelines as describe in the [upload doc](upload.md) but instead of creating a data volume you can create VolumeUploadSource CR and PVC similar to the import examples in this doc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
doc/cdi-populators.md
Outdated
What are [populators](https://kubernetes.io/blog/2022/05/16/volume-populators-beta/) | ||
|
||
## Introduction | ||
CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know if this is incorrect but my spanish-talking brain tells me to change populating
for a noun in implementation of populating PVCs
.
Maybe CDI volume populators are CDI's implementation for the population of PVCs by...
sounds good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for me it sounds fine :) I prefer leave it like this unless someone says its wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or "CDI volume populators are CDI's implementation of the existing import, upload, and clone operations using the new dataSourceRef field of the PVC."
doc/cdi-populators.md
Outdated
|
||
The integration of datavolumes and CDI populators is seamless. You can create the datavolumes the same way you always have. | ||
If the used storage class is CSI storage then the datavolume population will occur via the CDI populators with the end result as always has been of a populated PVC. You will be able to notice that the created PVC will stay pending until the population process completes. | ||
For more information of using datavolumes for population check the [datavolume doc](datavolumes.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a strict list of requirements for populators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after all of the PRs I stated in the description there will be no requirements other then the CSI storage which I mentioned.
doc/cdi-populators.md
Outdated
## Introduction | ||
CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced. | ||
|
||
**Values of the new API?** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ?
? I would rename this to something like Motivation
or Benefits of using populators
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) I started with why using the new API
and changed it. I'll remove the ?
and maybe change values to benefits
### Using populators with PVCs | ||
User can create a CR and PVCs specifying the CR in the `DataSourceRef` field and those will be handled by the matching populator controller. | ||
|
||
#### Import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be more specific with each example: We should add an example of a valid CRD and a valid PVC for each populator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have it for every example except upload which Ill add
doc/cdi-populators.md
Outdated
What are [populators](https://kubernetes.io/blog/2022/05/16/volume-populators-beta/) | ||
|
||
## Introduction | ||
CDI volume populators are CDIs implementation of populating PVCs by importing/uploading/cloning data utilizing the new `dataSourceRef` field. New controllers and custom resources for each population method were introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or "CDI volume populators are CDI's implementation of the existing import, upload, and clone operations using the new dataSourceRef field of the PVC."
doc/cdi-populators.md
Outdated
|
||
**Benefits of the new API** | ||
* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume) | ||
* Use PVCs directly and get them populated without datavolumes mediation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is now possible to use PVCs directly and have them populated without the need for DataVolumes.
**Benefits of the new API** | ||
* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume) | ||
* Use PVCs directly and get them populated without datavolumes mediation. | ||
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may make sense to mention that this is limited to a single namespace until https://kubernetes.io/blog/2023/01/02/cross-namespace-data-sources-alpha/ goes beta and is incorporated into CDI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhenriks didn't you solve that for now by using the tokens or I miss understood?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cross namespace only works with datavolume integration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have to mention this limitation since currently alpha
doc/cdi-populators.md
Outdated
* Native synchronization with Kubernetes - this is Kubernetes way of populating PVCs. Once PVC is bound we know it is populated (Before introducing populators, the PVC was bound the moment the datavolume created it, and the population progress was monitored via the datavolume) | ||
* Use PVCs directly and get them populated without datavolumes mediation. | ||
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. | ||
* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
backup and disaster recovery solutions
doc/cdi-populators.md
Outdated
* Use PVCs directly and get them populated without datavolumes mediation. | ||
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. | ||
* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it. | ||
* Better compatibility with existing backup solutions and [Kubevirt](https://github.com/kubevirt/kubevirt) VMs with PVCs using populators and VMs with datavolumetemplates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different from the above point? I'm not clear what this is saying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is that the above point is only for using pvcs and datavolume without VMs, the second is VM using those PVCs and using datavolumetemplates.. maybe I can just continue the previous point with the mention of the VMs
doc/cdi-populators.md
Outdated
* Can use one population definition for multiple PVCs - create 1 CR defining population source and use it for any PVC. | ||
* Better compatibility with existing backup solutions. Using PVC alone should solve all backup issues. Using datavolumes with populators solves most, for example Metro DR and [Gitops](https://www.redhat.com/en/topics/devops/what-is-gitops#:~:text=GitOps%20uses%20Git%20repositories%20as,set%20for%20the%20application%20framework.) - datavolume manifest will be applied, the datavolume will create the PVC that will bind immediately to the PV waiting for it. | ||
* Better compatibility with existing backup solutions and [Kubevirt](https://github.com/kubevirt/kubevirt) VMs with PVCs using populators and VMs with datavolumetemplates. | ||
* Integration with [Kubevirt](https://github.com/kubevirt/kubevirt) with WaitForFirstConsumer(WFFC) storage class is simpler not requiring [doppelganger pod](https://github.com/kubevirt/kubevirt/blob/main/docs/localstorage-disks.md#the-problem) for the start of the VM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is simpler not requiring -> is simpler and does not require a
requests: | ||
storage: 10Gi | ||
``` | ||
After creating the VolumeUploadSource and PVC, you can start the upload to the pvc as describe in the [upload doc](upload.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgive my ignorance but does the user target PVC prime when uploading or the PVC they created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the user can upload to the target pvc as always, we take care of the rest under the covers.
doc/cdi-populators.md
Outdated
### Using populators with DataVolumes | ||
|
||
The integration of datavolumes and CDI populators is seamless. You can create the datavolumes the same way you always have. | ||
If the used storage class is CSI storage then the datavolume population will occur via the CDI populators with the end result as always has been of a populated PVC. You will be able to notice that the created PVC will stay pending until the population process completes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the DataVolume targets a storage class that uses a CSI provisioner CDI will automatically use the new populators method. The behavior will be the same as always but with the following key differences. <different Pending DV status, and PVC will not become bound until is is populated>.
Signed-off-by: Shelly Kagan <skagan@redhat.com>
745e287
to
85e0d8a
Compare
Looks good to me! That said I think we should wait for another lgtm, preferably from some US folk |
/approve looks good now we need something for https://github.com/kubevirt/user-guide |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mhenriks The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest-required |
/retest |
/cherrypick release-v1.57 |
@ShellyKa13: new pull request created: #2812 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
The documentation already takes into account all the code that should get in for the cdi populators i.e the integration with the datavolume clone and populators #2750 , supporting Immediate Binding annotation #2765 , supporting VDDK and Imagio sources #2767
Release note: