Datamover CRD design #597
Merged
dymurray
merged 11 commits into
openshift:master
from
savitharaghunathan:data_mover_enhancement
May 24, 2022
a4b12d6 Adding initial datamover design (savitharaghunathan)
eb652f3 Typo in reviewers (savitharaghunathan)
b4e5d24 Add summary & user stories (savitharaghunathan)
03a0883 Fixing nits (savitharaghunathan)
52df8fb Adding intial feedback (savitharaghunathan)
e127452 Adding feedback (savitharaghunathan)
780d051 nit: changing the selector field name (savitharaghunathan)
f1c1ff0 Adding review comments (savitharaghunathan)
7d66897 Adding plugin clarification (savitharaghunathan)
645ef0e updating image to reflect the latest design (savitharaghunathan)
d48ecfe Reflect recent velero csi plugin changes (savitharaghunathan)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,238 @@ | ||
--- | ||
title: datamover_crd_design | ||
authors: | ||
- "@savitharaghunathan" | ||
reviewers: | ||
- "@shawn_hurley" | ||
- "@alaypatel07" | ||
- "@dymurray" | ||
- "@eemcmullan" | ||
approvers: | ||
- "@shawn_hurley" | ||
- "@alaypatel07" | ||
creation-date: 2022-03-16 | ||
status: provisional | ||
--- | ||
|
||
# Data Mover CRD design | ||
|
||
## Release Signoff Checklist | ||
|
||
- [ ] Enhancement is `implementable` | ||
- [ ] Design details are appropriately documented from clear requirements | ||
- [ ] Test plan is defined | ||
- [ ] User-facing documentation is created | ||
|
||
## Open questions | ||
|
||
* PVC/VolumeSnapshot mover - Should the Datamover Backup process be triggered off a PVC or a snapshot? | ||
* Should we support both types and let the user pick either a PVC or a snapshot? | ||
|
||
|
||
## Summary | ||
The OADP operator currently supports backup and restore of applications backed by CSI volumes by leveraging the Velero CSI plugin. On some providers, such as ODF, CSI snapshots are local to the OpenShift cluster and cannot be recovered if the cluster is accidentally deleted or a disaster occurs. To overcome this issue, DataMover lets users save the snapshots in remote storage. | ||
|
||
## Motivation | ||
|
||
Create an extensible design to support various data movers that can be integrated with the OADP operator. Vendors should be able to bring their own data mover controller and implementation and use them with the OADP operator. | ||
|
||
## Goals | ||
* Create an extensible data mover solution | ||
* Supply a default data mover option | ||
* Supply APIs for DataMover CRs (e.g., DataMoverBackup, DataMoverRestore, DataMoverClass) | ||
* Supply a sample codebase for the Data Mover plugin and controller implementation | ||
|
||
|
||
## Non Goals | ||
* Maintaining 3rd party data mover implementations | ||
* Adding a status watch controller to Velero | ||
|
||
## User stories | ||
|
||
Story 1: | ||
As an application developer, I would like to save the CSI snapshots in an S3 bucket. | ||
|
||
Story 2: | ||
As a cluster admin, I would like to be able to restore CSI snapshots after a disaster. | ||
|
||
## Design & Implementation details | ||
|
||
This design adds the data mover feature to the OADP operator and facilitates integrating various vendor-implemented data movers. | ||
|
||
 | ||
|
||
Note: We will be supporting VolSync as the default data mover. | ||
|
||
The DataMoverBackup controller will watch for `DataMoverBackup` CRs. Likewise, the DataMoverRestore controller will watch for `DataMoverRestore` CRs. Both of these CRs will have a reference to a `DataMoverClass`. | ||
|
||
`DataMoverClass` is a cluster-scoped custom resource that holds details about the data mover. A mover is registered in the system by creating the `DataMoverClass` CR, adding a Velero plugin that creates the resources needed for data movement for that single `DataMoverClass`, and adding a controller that reconciles the objects created by the plugin. The `DataMoverClass` spec also includes a field (`selector`) that identifies the PVCs to be moved with the given data mover. | ||
|
||
|
||
``` | ||
apiVersion: oadp.openshift.io/v1alpha1 | ||
kind: DataMoverClass | ||
metadata: | ||
  annotations: | ||
    oadp.openshift.io/default: "true" | ||
  name: <name> | ||
spec: | ||
  mover: <VolSync> | ||
  selector: <tagname> | ||
|
||
``` | ||
|
||
The `DataMoverClass` name above will be referenced in the `DataMoverBackup` and `DataMoverRestore` CRs, which allows the data mover implementation to be selected at runtime. If no `DataMoverClass` name is given, the default `DataMoverClass` is used, which in this case is `VolSync`. | ||
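
The lookup described above can be sketched in a few lines. This is only an illustration in Python operating on plain manifest dictionaries, not the operator's implementation: the function name `resolve_data_mover_class` is hypothetical, and it assumes the default class is the one carrying the `oadp.openshift.io/default: "true"` annotation from the example.

```python
from typing import Dict, List, Optional

# Annotation key taken from the DataMoverClass example in this design.
DEFAULT_ANNOTATION = "oadp.openshift.io/default"


def resolve_data_mover_class(requested: Optional[str], classes: List[Dict]) -> Dict:
    """Return the class named `requested`, or the default class when no name is given.

    Hypothetical helper: the design does not define this function.
    """
    if requested:
        for cls in classes:
            if cls["metadata"]["name"] == requested:
                return cls
        raise LookupError(f"DataMoverClass {requested!r} not found")

    # No explicit name: fall back to the class annotated as default.
    for cls in classes:
        annotations = cls["metadata"].get("annotations", {})
        if annotations.get(DEFAULT_ANNOTATION) == "true":
            return cls
    raise LookupError("no default DataMoverClass is defined")
```

Raising an error when neither a named nor a default class exists is a choice made for this sketch; the design does not say how that case should be handled.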
|
||
### Data Mover Backup | ||
|
||
Assuming that the `DataMoverEnable` flag is set to true in the DPA config, creating a Velero backup triggers the custom Velero CSI plugin (a Velero BackupItemAction plugin) to create `DataMoverBackup` CRs in the app namespace. The extended plugin looks up the PVCs in the user namespace named in the Velero backup and creates a `DataMoverBackup` CR for every PVC in that namespace that matches `datamoverclass.spec.selector`. | ||
|
||
The `DataMoverBackup` CR supports either a VolumeSnapshot or a PVC as the type of the backup object. If the Velero CSI plugin is used for the backup, `VolumeSnapshot` is used as the type; otherwise, `PVC` is used. | ||
|
||
``` | ||
apiVersion: oadp.openshift.io/v1alpha1 | ||
kind: DataMoverBackup | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  dataMoverClass: <DataMoverClass name> | ||
  dataSourceRef: | ||
    apiGroup: <APIGroup> | ||
    kind: <PVC|VolumeSnapshotContent> | ||
    name: <name> | ||
  config: # optional, based on the data mover implementation | ||
|
||
``` | ||
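
As a rough illustration of the per-PVC creation step, the following Python sketch builds one `DataMoverBackup` manifest for each PVC that matches a class selector. It rests on several assumptions that the design leaves open: the selector is treated as a label key that must be present on the PVC, the CR name uses a made-up `dmb-` prefix, and the data source is always the PVC itself (`PersistentVolumeClaim` in the core API group). The function name is hypothetical.

```python
from typing import Dict, List

API_VERSION = "oadp.openshift.io/v1alpha1"


def build_data_mover_backups(pvcs: List[Dict], data_mover_class: Dict) -> List[Dict]:
    """Build one DataMoverBackup manifest per PVC that carries the class selector label."""
    selector = data_mover_class["spec"]["selector"]
    backups = []
    for pvc in pvcs:
        labels = pvc["metadata"].get("labels", {})
        if selector not in labels:
            continue  # PVC is not selected by this DataMoverClass
        pvc_name = pvc["metadata"]["name"]
        backups.append({
            "apiVersion": API_VERSION,
            "kind": "DataMoverBackup",
            "metadata": {
                "name": f"dmb-{pvc_name}",  # hypothetical naming scheme
                "namespace": pvc["metadata"]["namespace"],
            },
            "spec": {
                "dataMoverClass": data_mover_class["metadata"]["name"],
                "dataSourceRef": {
                    "apiGroup": "",  # core API group, assumed for PVC sources
                    "kind": "PersistentVolumeClaim",
                    "name": pvc_name,
                },
            },
        })
    return backups
```

A real plugin would submit these objects through the Kubernetes API and would also need to handle the snapshot-based source type; both are outside this sketch.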
### Data Mover Restore | ||
When a Velero restore is triggered, the custom Velero CSI plugin looks for `DataMoverBackup` resources in the backup. If it encounters one, the extended plugin (a Velero RestoreItemAction plugin) creates a `DataMoverRestore` CR in the app namespace and populates it with the details obtained from the `DataMoverBackup` resource. | ||
|
||
``` | ||
apiVersion: oadp.openshift.io/v1alpha1 | ||
kind: DataMoverRestore | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  DataMoverClass: <DataMoverClass name> | ||
  destinationClaimRef: | ||
    name: <PVC_claim_name> | ||
    namespace: <namespace> | ||
  config: # optional, based on the data mover implementation | ||
``` | ||
The `config` section in the above CR is optional. It lets the user specify extra parameters needed by the data mover. For example, the VolSync data mover needs a restic secret to perform backup and restore. | ||
|
||
For example: | ||
|
||
``` | ||
apiVersion: oadp.openshift.io/v1alpha1 | ||
kind: DataMoverRestore | ||
metadata: | ||
  name: <name> | ||
spec: | ||
  DataMoverClass: <DataMoverClass name> | ||
  destinationClaimRef: | ||
    name: <PVC_claim_name> | ||
    namespace: <namespace> | ||
  config: # optional, based on the data mover implementation | ||
    resticSecret: | ||
      name: <secret_name> | ||
``` | ||
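
To make the restore-side population step more concrete, here is a minimal Python sketch that derives a `DataMoverRestore` manifest from a `DataMoverBackup` manifest. The function name, the `dmr-` name prefix, and the idea that the caller supplies the target claim name and namespace are all assumptions for illustration; the spec keys are spelled as in the examples above.

```python
import copy
from typing import Dict


def build_data_mover_restore(backup: Dict, claim_name: str, namespace: str) -> Dict:
    """Derive a DataMoverRestore manifest from a DataMoverBackup manifest."""
    backup_spec = backup["spec"]
    restore = {
        "apiVersion": "oadp.openshift.io/v1alpha1",
        "kind": "DataMoverRestore",
        "metadata": {
            "name": f"dmr-{claim_name}",  # hypothetical naming scheme
            "namespace": namespace,
        },
        "spec": {
            # Key spelled as in the DataMoverRestore example in this design.
            "DataMoverClass": backup_spec["dataMoverClass"],
            "destinationClaimRef": {"name": claim_name, "namespace": namespace},
        },
    }
    # The optional mover-specific config (e.g. resticSecret) is carried over unchanged.
    if "config" in backup_spec:
        restore["spec"]["config"] = copy.deepcopy(backup_spec["config"])
    return restore
```

Copying `config` verbatim keeps mover-specific settings opaque to the plugin, which fits the goal of letting vendors define their own parameters.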
|
||
We will provide a sample codebase that vendors can extend to implement their own data movers. | ||
|
||
|
||
### Default OADP Data Mover controller | ||
|
||
VolSync will be the default data mover for OADP, and `restic` will be the supported method for backing up and restoring PVCs. VolSync will be installed alongside the OADP operator. The installation method is TBD: we are waiting on the VolSync operator to become available and will fall back to a Helm install otherwise. Restic repository details are configured in a `Secret` object that VolSync's resources use. This design relies on two VolSync resources, `ReplicationSource` and `ReplicationDestination`. A `ReplicationSource` object backs up a PVC and uses restic to move the data to the storage specified in the restic secret. A `ReplicationDestination` object restores the backup from the restic repository. There will be a 1:1 relationship between the replication source/destination CRs and PVCs. | ||
|
||
We will follow a two-phase approach to implementing this controller. In phase 1, the user creates a restic secret, and the controller uses that secret as the source for the on-demand secrets it creates for every backup/restore request. In phase 2, the user provides the restic repository details: either an encryption password and a BSL reference, from which the controller creates the restic secret, or their own backup target repository and access credentials. This design focuses on the phase 1 approach. | ||
|
||
``` | ||
... | ||
DataMoverEnable: True/False | ||
... | ||
``` | ||
|
||
If the `DataMoverEnable` flag above is set to true, the user creates a restic secret with the following details: | ||
``` | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
  name: restic-config | ||
type: Opaque | ||
stringData: | ||
  # The repository url | ||
  RESTIC_REPOSITORY: s3:s3.amazonaws.com/<bucket> | ||
  # The repository encryption key | ||
  RESTIC_PASSWORD: <password> | ||
  # ENV vars specific to the chosen back end | ||
  # https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html | ||
  AWS_ACCESS_KEY_ID: <access_id> | ||
  AWS_SECRET_ACCESS_KEY: <access_key> | ||
``` | ||
*Note: See the [VolSync documentation](https://volsync.readthedocs.io/en/stable/usage/restic/index.html#specifying-a-repository) for more details on configuring the restic secret.* | ||
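
The phase 1 behaviour, in which the controller derives an on-demand secret per backup/restore request from the user's restic secret, might look roughly like the Python sketch below. It works on plain manifest dictionaries; the function name and the `<request>-restic` naming scheme are hypothetical, and the sketch simply copies the source payload without changing the repository settings.

```python
import copy
from typing import Dict


def make_request_secret(source_secret: Dict, request_name: str, namespace: str) -> Dict:
    """Create a per-request Secret manifest that mirrors the user's restic secret."""
    secret = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": f"{request_name}-restic",  # hypothetical naming scheme
            "namespace": namespace,
        },
        "type": source_secret.get("type", "Opaque"),
    }
    # Copy whichever payload field the source uses; deep-copy so later edits
    # to the on-demand secret never touch the user's original.
    for field in ("stringData", "data"):
        if field in source_secret:
            secret[field] = copy.deepcopy(source_secret[field])
    return secret
```

Keeping one secret per request makes cleanup straightforward, since the controller can delete it together with the other resources it created for that request.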
|
||
|
||
The custom Velero CSI plugin will be responsible for creating the `DataMoverBackup` and `DataMoverRestore` CRs. | ||
|
||
Once a `DataMoverBackup` CR is created, the controller creates the corresponding `ReplicationSource` CR in the protected namespace. VolSync watches for the creation of the `ReplicationSource` CR and copies the PVC data to the restic repository specified in the restic secret (`restic-config` in the example below). | ||
``` | ||
apiVersion: volsync.backube/v1alpha1 | ||
kind: ReplicationSource | ||
metadata: | ||
  name: database-source | ||
  namespace: openshift-adp | ||
spec: | ||
  sourcePVC: <pvc_name> | ||
  trigger: | ||
    manual: <trigger_name> | ||
  restic: | ||
    pruneIntervalDays: 15 | ||
    repository: restic-config | ||
    retain: | ||
      hourly: 1 | ||
      daily: 1 | ||
      weekly: 1 | ||
      monthly: 1 | ||
      yearly: 1 | ||
    copyMethod: None | ||
``` | ||
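
A controller reconciling a `DataMoverBackup` could assemble the `ReplicationSource` shown above along the following lines. This Python sketch mirrors the field values from the example manifest; the function name, the `-rs` and `-trigger` suffixes, and the default protected namespace argument are assumptions for illustration.

```python
from typing import Dict


def build_replication_source(
    backup: Dict,
    pvc_name: str,
    secret_name: str,
    protected_namespace: str = "openshift-adp",
) -> Dict:
    """Build a VolSync ReplicationSource manifest for one DataMoverBackup."""
    backup_name = backup["metadata"]["name"]
    return {
        "apiVersion": "volsync.backube/v1alpha1",
        "kind": "ReplicationSource",
        "metadata": {
            "name": f"{backup_name}-rs",  # hypothetical naming scheme
            "namespace": protected_namespace,
        },
        "spec": {
            "sourcePVC": pvc_name,
            # A manual trigger runs a single sync for this request.
            "trigger": {"manual": f"{backup_name}-trigger"},
            "restic": {
                "pruneIntervalDays": 15,
                "repository": secret_name,  # name of the restic Secret
                "retain": {"hourly": 1, "daily": 1, "weekly": 1, "monthly": 1, "yearly": 1},
                "copyMethod": "None",
            },
        },
    }
```

Because each `DataMoverBackup` maps to exactly one `ReplicationSource`, deriving both names from the backup name keeps the 1:1 relationship easy to trace.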
|
||
Similarly, when a `DataMoverRestore` CR is created, the controller creates a `ReplicationDestination` CR in the protected namespace. The VolSync controller copies the PVC data from the restic repository to the protected namespace, and the data mover controller then transfers it to the user namespace. | ||
|
||
``` | ||
apiVersion: volsync.backube/v1alpha1 | ||
kind: ReplicationDestination | ||
metadata: | ||
  name: <protected_namespace> | ||
spec: | ||
  trigger: | ||
    manual: <trigger_name> | ||
  restic: | ||
    destinationPVC: <pvc_name> | ||
    repository: restic-config | ||
    copyMethod: None | ||
``` | ||
|
||
A status controller is created to watch the VolSync CRs. It watches the `ReplicationSource` and `ReplicationDestination` objects and updates VolumeSnapshot CR events. | ||
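
One way such a status controller could decide that a manually triggered sync has finished is to compare `spec.trigger.manual` with `status.lastManualSync`, which VolSync sets to the trigger string once the sync completes. The Python sketch below shows only that check on a manifest dictionary; the function name is hypothetical, and how the result is turned into events is not covered.

```python
from typing import Dict


def manual_sync_complete(replication_cr: Dict) -> bool:
    """Return True when the requested manual trigger has been completed by VolSync."""
    requested = replication_cr.get("spec", {}).get("trigger", {}).get("manual")
    finished = replication_cr.get("status", {}).get("lastManualSync")
    # Complete only if a trigger was requested and VolSync echoes the same value.
    return requested is not None and requested == finished
```

The same check applies to both `ReplicationSource` and `ReplicationDestination` objects, since both use the manual trigger in this design.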
|
||
*Note: Potential feature addition to Velero: A status watch controller for DataMover CRs. This can be used to update Velero Backup/Restore events with the DataMover CR results* | ||
|
||
The data mover controller will clean up all controller-created resources after the process is complete. | ||
|
||
|
||
### Support for multiple data mover plugins | ||
The `DataMoverClass` spec will support the following field: | ||
`selector: <tagname>` | ||
A PVC must be labelled with `<tagname>` to be moved by the specific `DataMoverClass`. The user or admin of the cluster must label the PVCs with the required `<tagname>` and map it to a `DataMoverClass`. PVCs that are not labelled will be moved by the default data mover. | ||
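
The selection rule above can be sketched as follows. This Python example assumes, since the design does not pin it down, that `<tagname>` is a label key that must be present on the PVC, and that the default data mover is the class annotated with `oadp.openshift.io/default: "true"`. The function name is hypothetical.

```python
from typing import Dict, List, Optional

DEFAULT_ANNOTATION = "oadp.openshift.io/default"


def select_class_for_pvc(pvc: Dict, classes: List[Dict]) -> Optional[Dict]:
    """Pick the DataMoverClass for a PVC: label match first, then the default class."""
    labels = pvc["metadata"].get("labels", {})
    for cls in classes:
        if cls["spec"].get("selector") in labels:
            return cls
    # No class selector matched: fall back to the default data mover.
    for cls in classes:
        if cls["metadata"].get("annotations", {}).get(DEFAULT_ANNOTATION) == "true":
            return cls
    return None
```

If a PVC carries the selectors of several classes, this sketch returns the first match in list order; the design does not specify a tie-breaking rule.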
|
||
#### Alternate options | ||
PVCs can be annotated with the `DataMoverClass`, and when a backup is created, the controller will look at the DataMoverClass and add it to the `DataMoverBackup` CR. | ||
|
||
|
||
--- | ||
|
||
## References | ||
Previous designs: | ||
* Alternate design - https://hackmd.io/uYrC2StuTT-zCSUSf4xlRw | ||
* Initial design - https://hackmd.io/8uEzXeD8TKCYF9uUdroDXA |
Please update this image path.
Should I create a PR to just merge the image first? I think the path is correct. WDYT?
Looks fine to me @ https://github.com/savitharaghunathan/oadp-operator/blob/data_mover_enhancement/docs/design/datamover.md
GitHub PR .md previews don't show (new??) images.
@kaovilai I think so.