
ETCD-610: automated backups no config #1646

Open · wants to merge 3 commits into master from etcd-automated-backup-no-config
Conversation

Elbehery (Contributor)

This PR adds an enhancement proposal for etcd Automated Backups No Config.

Resolves https://issues.redhat.com/browse/ETCD-610

cc @openshift/openshift-team-etcd

@openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Jun 17, 2024
@openshift-ci-robot

openshift-ci-robot commented Jun 17, 2024

@Elbehery: This pull request references ETCD-610 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

This PR adds an enhancement proposal for etcd Automated Backups No Config.

Resolves https://issues.redhat.com/browse/ETCD-610

cc @openshift/openshift-team-etcd

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@Elbehery (Contributor Author)

/assign @hasbro17
/assign @tjungblu
/assign @dusk125
/assign @soltysh

- Need to agree on a default schedule.
- Need to agree on a default retention policy.

- Several options exist for the default PVCName.
Contributor Author:

@gnufied I would need your input here please

@Elbehery force-pushed the etcd-automated-backup-no-config branch from 447cbfa to 9439c80 on June 17, 2024 at 22:05

openshift-ci bot commented Jun 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from hasbro17. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Elbehery force-pushed the etcd-automated-backup-no-config branch 2 times, most recently from 035e9d5 to f54b7a3 on June 17, 2024 at 22:31

### User Stories
- As a cluster administrator, I want cluster backups to be taken without configuration.
- As a cluster administrator I want to schedule recurring cluster backups so that I have a recent cluster state to recover from in the event of quorum loss (i.e. losing a majority of control-plane nodes).


I think it's worth settling on a cadence we think is a good default as part of the requirements. Maybe once a day?

Contributor Author:

+1, I think once a day at midnight is sufficient for most users; midnight should be a time when the cluster is not under load.


I wouldn't assume that, but we can smear the schedule over the course of the day


- Several options exist for the default PVCName.
- Relying on `dynamic provisioning` is sufficient; however, it is not an option for `SNO` or `BM` clusters.
- Utilising the `local storage operator` is a proper solution; however, installing a whole operator is too much overhead.


why not hostPath, as we do in our e2e test?
https://github.com/openshift/cluster-etcd-operator/blob/master/test/e2e/backup_test.go#L444-L487

that should be portable?

Contributor Author:

So hostPath mounts a path from the node's file system as a volume into the pod. My concerns:

  • hostPath could have a security impact, as it exposes the node's filesystem.
  • There is no scheduling guarantee for the backup pod when using hostPath, as there is with localVolume.
  • With localVolume, the node affinity on the PV forces the backup pod to be scheduled on the specific node where the volume is attached.
  • localVolume allows using a separate disk as the PV, unlike hostPath, which mounts a folder from the node's FS.
  • localVolume is handled by the PV controller and the scheduler in a different manner; in fact, it was created to resolve issues with hostPath.

That being said, if we were to use localVolume, we need to find a solution for balancing the backups across the master nodes. Some ideas could be:

  • Create a PV for each master node using localVolume; the backup controller from CEO should take care of balancing the backups across the available and healthy volumes.
  • The controller should keep the most recent backup on a healthy node available for restoration.
  • The controller should skip an unhealthy master node when taking a backup.
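Purely as an illustration of the difference being discussed (the image and host path below are placeholders, not part of the proposal; a full local-volume PV example appears later in this thread): a hostPath volume is declared directly in the pod spec, which by itself gives no scheduling guarantee.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-backup-hostpath-example   # illustrative name
  namespace: openshift-etcd
spec:
  restartPolicy: Never
  containers:
  - name: backup
    image: registry.example.com/etcd-backup:latest   # placeholder image
    command: ["sleep", "3600"]                        # placeholder; a real pod would run the backup script
    volumeMounts:
    - name: backup-dir
      mountPath: /backup
  volumes:
  - name: backup-dir
    hostPath:
      path: /var/lib/etcd-backup       # hypothetical host directory
      type: DirectoryOrCreate
```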


can you list your concerns and this solution as an alternative here in the doc as well, please?

Contributor Author:

added these comments to the alternatives and Risks sections 👍🏽

@Elbehery force-pushed the etcd-automated-backup-no-config branch 2 times, most recently from 16f8e08 to a40a86f on June 18, 2024 at 15:55

## Design Details

- Add default values to the current `EtcdBackup` spec using an annotation.
Contributor Author:

@openshift/openshift-team-etcd I added this section after discussion on the Architecture Call

Input is from @soltysh @jsafrane

  • It is recommended to use default values on the current API.
  • Do not rely on the CVO to manage the default config.
  • Use a storage solution based on the OCP variant:
    • Dynamic provisioning for cloud-based clusters.
    • Local volume for SNO / BM.

Member:

The general flow should be as follows:

  1. All the EtcdBackup fields are defaulted where possible. That's why I'm proposing to put this functionality behind the already existing AutomatedEtcdBackup feature gate, rather than introducing a new one.
  2. The pvcName is the one that will have to be populated automatically by the CEO, unless a user sets one. It will use:
  • the available default storage class in the cluster (the storage team can point you at where to look for it);
  • if the above is not available, fall back to using a localVolume, as explained by Hemant above;
  • add a warning condition to the CEO (at least for the localVolume case), reporting that we're working off of the default etcd backup and that the cluster admin should verify the correctness of that configuration.
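For context on the first point above: the cluster's default storage class is conventionally marked with the `storageclass.kubernetes.io/is-default-class` annotation, which is what an operator would look for. A minimal illustration (the name and the AWS EBS CSI provisioner are just examples):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi                     # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # marks the cluster default
provisioner: ebs.csi.aws.com        # example CSI provisioner on AWS
volumeBindingMode: WaitForFirstConsumer
```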


In the event of the node becoming inaccessible or unschedulable, the recurring backups would not be scheduled. The periodic backup config would have to be recreated or updated with a different PVC that allows for a new PV to be provisioned on a node that is healthy.

If we were to use `localVolume`, we need to find a solution for balancing the backups across the master nodes; some ideas could be:


how would we restore from a localVolume? Is the data stored somewhere in /var/?

Contributor Author:

So the localVolume is mounted wherever it is specified in the Pod. In fact, you can mount a whole external drive.

iiuc, hostPath supports only mount points from the node's FS


imagine we have to run the restore script, how would we get the local volume mounted into a recovery machine? would we create another static pod that would mount the localVolume?

I imagine the local storage plugin must be running along with the kubelet for that to work? @jsafrane / @gnufied maybe you guys can elaborate a bit how that works, I don't have much time this week to read through the CSI implementation 🐎

@gnufied (Member), Jun 20, 2024:

Pardon my ignorance for not being familiar with the etcd deployment, but even if there is a default SC available, why are we backing up just one replica of etcd? I thought we would want to back up all replicas - no? Such as - what if, in the case of a network partition or something, the replica we are backing up is behind?

So IMO it sounds like, whether we are using LocalVolume or something else, we should always be creating backups across master nodes? Is that accurate?

Member:

Also, to answer @tjungblu's question: Local Volumes are an inbuilt feature of k8s, and hence no additional plugin is necessary. Heck, a local PV can be provisioned statically and the local-storage-operator is not necessary either: https://docs.openshift.com/container-platform/4.15/storage/persistent_storage/persistent_storage_local/persistent-storage-local.html#local-create-cr-manual_persistent-storage-local


... and can be mounted in the new etcd pod and the backup will exist.

check the restore script, you will need to unpack the snapshot before you can run any pod

Contributor Author:

@gnufied, in order to make sure the backups are accessible, and that we can round-robin the backups across all master nodes that are up-to-date with the etcd cluster leader:

Is it possible to

  • Create a StatefulSet across the master nodes, where the PVC template uses the localVolume to provision PVs?
  • Since we need to take the backup according to a schedule as per EtcdBackupSpec, the issue is how to manage the backup pods within the STS using the CronJob.
  • I am not aware if this is possible, but at least we can generate an Event from the CronJob, and the STS could react to these events by taking a backup.

Contributor Author:

check the restore script, you will need to unpack the snapshot before you can run any pod

We can use an init-container to unpack the backup, and then the etcd pod can start?

Otherwise, we can start a fresh etcd pod and then run etcdctl restore?

Please correct me if I am wrong
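A rough sketch of the init-container idea, purely as an illustration (the image names, paths, and the use of `etcdctl snapshot restore` here are assumptions, not the agreed design):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-restore-example          # illustrative only
  namespace: openshift-etcd
spec:
  restartPolicy: Never
  initContainers:
  - name: unpack-snapshot
    image: quay.io/example/etcd-tools:latest      # placeholder image containing etcdctl
    command:
    - /bin/sh
    - -c
    - etcdctl snapshot restore /backup/snapshot.db --data-dir /restored/etcd-data
    volumeMounts:
    - name: backup
      mountPath: /backup
    - name: restored
      mountPath: /restored
  containers:
  - name: etcd
    image: quay.io/example/etcd:latest            # placeholder image
    command: ["etcd", "--data-dir=/restored/etcd-data"]
    volumeMounts:
    - name: restored
      mountPath: /restored
  volumes:
  - name: backup
    persistentVolumeClaim:
      claimName: etcd-backup-pvc                  # the PVC holding the snapshot
  - name: restored
    emptyDir: {}                                  # illustration only; real etcd data lives on the host
```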

Member:

  • Create StatefulSet across the master nodes, where the PVC template uses the localVolume to provision PV.

I believe you mean DaemonSet across the master nodes 😉

Contributor Author:

I actually thought about this as well, but there is no PersistentVolumeClaim template on a DaemonSet.

Having a separate PVC & PV for each master node would be ideal for spreading the backups across the master nodes.

That being said, according to the latest update, we are going to use a SideCar container and a hostPath volume.


If we were to use `localVolume`, we need to find a solution for balancing the backups across the master nodes; some ideas could be:

- create a PV for each master node using `localVolume`; the backup controller from CEO should take care of balancing the backups across the `available` and `healthy` volumes.
@tjungblu, Jun 18, 2024:

would/could a DaemonSet do that for us?

Contributor Author:

Good point.

  • We can use a DS and node affinity to run the DS pods only on master nodes. However, I am not aware of how the storage will be handled in this case.

  • Is there a way to keep the backups in a round-robin fashion among all master nodes?

  • Also, what about an STS? wdyt?


yep, STS could also work, especially with the storage. Would also be useful to have this compared to the static pod/hostpath approach

Contributor Author:

Alright, I think if we have an STS, then the Pods, PVCs, and PVs are managed together.

I think we can in this case create backups in a round-robin manner across the master nodes.

For BM, I think we can still use a local volume as the storage option, managed by the STS.

Member:

How does this round robin backup work?

Member:

I read the original enhancement - https://github.com/openshift/enhancements/blob/master/enhancements/etcd/automated-backups.md - and I understand this better now. But this still begs the question - how do you know which snapshot to restore from? What if you take a snapshot of an etcd replica which was behind and is just catching up with the other master nodes? Is that possible?

Contributor Author:

So we will make sure that the backup is taken from a member whose log is identical to the leader's. This way we make sure that we are not lagging behind.

But could this approach work?

Member:

Why do you want to handle the balancing? That's not what you should care for at all.

Contributor Author:

Yes, I agree; therefore the SideCar approach is the most appropriate.

#### Cloud based OCP

##### Pros
- Utilising `Dynamic provisioning` is the best option on cloud-based OCP.


how is the customer able to access a dynamically created PV to get their snapshots for restoration?

Contributor Author:

Gr8 question. So I have defaulted the RetentionPolicy in my sample configs to Retain.

This way the PV contents will never be erased automatically.

Now, to answer the restoration question:

  • If the cluster is still running, the PV can be attached to any node, as long as it is in the same availability zone.
  • If the cluster is completely down, the PV can still be accessed by the CU; it will never be deleted unless done manually.


it's not about the retention, it's about the access and mounting possibilities

If the cluster still running, the PV can be attached to any node, as long as it is in the same availability zone.

that's quite a bad constraint for a disaster recovery procedure, isn't it? :)

Contributor Author:

:D :D .. I know. Well, the best option IMHO is to push backups to a remote storage option; then, even if the whole cluster is down, a new installation can be restored using the remote backup.

However, in my discussion yesterday, remote storage was not an option for BM. They also recommended keeping it simple and not too complicated.

If it were up to me, I would create a sidecar which pushes the backups to remote storage. I think this option would work for any OCP variant: the master nodes don't have to be in the cloud, and the sidecar could authenticate and push backups regardless of the underlying OCP infrastructure. wdyt

@gnufied (Member), Jun 20, 2024:

If dynamically provisioned volumes are available in the cluster, it is 100% likely that the cluster has remote storage, and hence backups are available even if the cluster is down.

As for the availability zone concerns, yeah, that is why backing up into a PV is not enough. We should consider using CSI snapshots of those PVCs, so that snapshots of the backups can be available across availability zones and in case the file system on the PV gets corrupted or something.
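For illustration, a CSI snapshot of the backup PVC would be requested with a `VolumeSnapshot` object roughly like the following (the snapshot class name is an assumption; it depends on the installed CSI driver):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: etcd-backup-snapshot
  namespace: openshift-etcd
spec:
  volumeSnapshotClassName: csi-snapshot-class   # assumed; provided by the CSI driver
  source:
    persistentVolumeClaimName: etcd-backup-pvc  # the PVC holding the etcd backups
```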

Contributor Author:

I actually want to fall back to the localVolume approach, since having two separate approaches is harder to maintain.

I believe the localVolume should be sufficient. However, if we could utilize an STS across all the master nodes, that would be ideal in situations where we lose one or more master nodes.

Member:

If I were to design this, I would definitely use two different backup strategies. It is far more reliable to use dynamically provisioned PVCs for backups: they can be snapshotted and remain accessible if a node goes down.

It seems to me that any hostPath/localVolume based approach is basically inferior. If the last backup was taken on the leader and then the leader node goes down, then so does our backup (other nodes may have slightly older backups). So hostPath/localVolume requires a fundamentally different recovery mechanism.

IMO - if we look at other components that use persistent storage in OpenShift, if customers don't provide storage then no persistent configuration is created. Prometheus and the image registry both require storage, and they don't do anything automatically.

So what I am saying is: we should probably limit the scope of this KEP to environments where a StorageClass is available. If no StorageClass is configured, then no automatic backups are taken.

We should not take it upon ourselves to decide a storage strategy for the customer. cc @tjungblu

Member:

So to summarize - I would simply not bother configuring backups via localVolume/hostPath in environments where no storage is available, and solve that problem via documentation or by asking the customer to configure storage. I do not think we should be automatically configuring local storage for these clusters.

Contributor Author:

I actually agree that using dynamic provisioning and CSI snapshots is vital for the automated backup.

I also agree to limit the automated backup to clusters with dynamic provisioning only.

wdyt @tjungblu @hasbro17 @dusk125 @soltysh

@Elbehery force-pushed the etcd-automated-backup-no-config branch 2 times, most recently from 8ad5f84 to 99044c1 on June 18, 2024 at 18:00

Elbehery commented Jun 18, 2024

Also see this default config I used to test the approach:


```yaml
# StorageClass with no provisioner: PVs are created statically (local volumes).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: etcd-backup-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate

---
# PVC that the EtcdBackup CR points at; binds to the static local PV below.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etcd-backup-pvc
  namespace: openshift-etcd
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: etcd-backup-local-storage

---
# Static local PV; Retain keeps the backup data even if the claim is deleted.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: etcd-backup-pv-fs
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: etcd-backup-local-storage
  local:
    path: /mnt
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:

---
# One-off backup request referencing the PVC above.
apiVersion: operator.openshift.io/v1alpha1
kind: EtcdBackup
metadata:
  name: etcd-single-backup
  namespace: openshift-etcd
spec:
  pvcName: etcd-backup-pvc
```




@Elbehery force-pushed the etcd-automated-backup-no-config branch from 99044c1 to 51fdbad on June 18, 2024 at 21:05
- Several options exist for the default `PVCName`.
- Relying on `dynamic provisioning` is sufficient; however, it is not an option for `SNO` or `BM` clusters.
- Utilising the `local storage operator` is a proper solution; however, installing a whole operator is too much overhead.
- The most viable solution to cover all OCP variants is to use `local volume`.
Member:

You don't necessarily have to install the local-storage-operator to use/consume local storage. For a simple one-off use case like this, it may be possible that the etcd-backup operator or something creates a static PV as documented - https://docs.openshift.com/container-platform/4.15/storage/persistent_storage/persistent_storage_local/persistent-storage-local.html#local-create-cr-manual_persistent-storage-local - and uses it to perform etcd backups.



### Workflow Description
- The user will enable the AutomatedBackupNoConfig feature gate (under discussion).
Member:

Given that AutomatedEtcdBackup is still a tech preview feature, I'd consider expanding that functionality with the default configuration, rather than adding another one. This will save a lot of problems for some of the fields in the API you're planning to provide default values for.

Contributor Author:

+1, I will fix this 👍🏽

#### Standalone Clusters
TBD
#### Single-node Deployments or MicroShift
TBD
Member:

I believe the strategy will be different for different topologies, so I'd expect an outline of how each topology will be handled wrt defaults.

Contributor Author:

So we are planning to use a SideCar container within each etcd pod manifest, and the backup will be saved to a hostPath volume.

IIUC, in this approach the strategy will be the same regardless of the topology, or?
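A minimal sketch of what such a sidecar might look like inside an etcd pod, assuming a periodic `etcdctl snapshot save` loop and a hostPath mount (images, interval, paths, and TLS handling are illustrative assumptions, not the agreed design):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-with-backup-sidecar       # illustrative; the real etcd pods are static pods rendered by CEO
  namespace: openshift-etcd
spec:
  containers:
  - name: etcd
    image: quay.io/example/etcd:latest               # placeholder image
    command: ["etcd", "--data-dir=/var/lib/etcd"]
    volumeMounts:
    - name: data
      mountPath: /var/lib/etcd
  - name: backup-sidecar
    image: quay.io/example/etcd-tools:latest         # placeholder image with etcdctl
    command:
    - /bin/sh
    - -c
    - |
      # TLS endpoint flags omitted for brevity
      while true; do
        etcdctl snapshot save "/backup/snapshot-$(date +%Y%m%d%H%M).db"
        sleep 86400                                  # assumed daily cadence
      done
    volumeMounts:
    - name: backup-dir
      mountPath: /backup
  volumes:
  - name: data
    emptyDir: {}                                     # illustration only; real etcd data is on the host
  - name: backup-dir
    hostPath:
      path: /var/lib/etcd-backup                     # hypothetical host directory
      type: DirectoryOrCreate
```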

Member:

Yes, it should, just make sure to call it out explicitly.

Member:

If you're taking the exact same approach across all supported topologies, I'd at minimum add here links to the design details, pointing out that the solution is orthogonal to topology of the cluster.






It supports all OCP variants, including `SNO` and `BM`.

However, I am strongly against using it for the following reasons
Member:

Suggested change:
- However, I am strongly against using it for the following reasons
+ However, the following reasons suggest against that solution:

Contributor Author:

This whole section has been removed, radical changes :D :D

- No scheduling guarantees for the pod when using `hostPath`, unlike with `localVolume`. The pod could be scheduled on a different node from where the hostPath volume exists.
- On the other hand, using `localVolume`, the node affinity within the PV manifest forces the backup pod to be scheduled on the specific node where the volume is attached.
- `localVolume` allows using a separate disk as the PV, unlike `hostPath`, which mounts a folder from the node's FS.
- `localVolume` is handled by the PV controller and the scheduler in a different manner; in fact, it was created to resolve issues with `hostPath`.
Member:

You forgot to add the locality of the backup. IOW, if it happens that the backup is kept on the current leader, and that node dies, we're losing that data with it.

Contributor Author:

So the current approach will take backups on all master nodes. Actually, we have another issue now: how to distinguish which backup is the most up-to-date, since we have backups from all master nodes :)

@Elbehery (Contributor Author)

/label tide/merge-method-squash

@openshift-ci bot added the tide/merge-method-squash label (denotes a PR that should be squashed by tide when it merges) on Jun 27, 2024
This enhancement builds on the previous work on [automated backup of etcd](https://docs.openshift.com/container-platform/4.15/backup_and_restore/control_plane_backup_and_restore/backing-up-etcd.html#creating-automated-etcd-backups_backup-etcd).
Therefore, it is vital to use the same feature gate for both approaches. The following issues need to be clarified before implementation:

* How to distinguish between `NoConfig` backups and backups that are triggered using the `EtcdBackup` CR.
Contributor Author:

@soltysh Would you kindly help me answer these questions?


we will call it name="default" and in the controller we'll skip the reconciliation of that CRD

Contributor Author:

So if the sidecar is enabled, do we skip the backup using the controller, or do we skip the default backups if the controller is enabled?

@Elbehery force-pushed the etcd-automated-backup-no-config branch 2 times, most recently from 2004ffe to 90ed5b2 on June 28, 2024 at 00:42
@Elbehery force-pushed the etcd-automated-backup-no-config branch 2 times, most recently from b00993a to 2c56179 on July 9, 2024 at 11:44
@soltysh (Member) left a comment:

I think the biggest feedback is the need for a clear separation between the automated backups and the user-configured automated backups. That's currently hidden in the implementation details. Having a clear boundary and an explanation of how one differs from the other, and how one impacts the other, is important.

### Goals

* Backups should be taken without configuration after cluster installation from day 1.
* Backups are saved to a default PersistentVolume, which could be overridden by the user.
Member:

I believe we discussed that users should override that value. We provide a best possible PV for the cluster, but in some cases that might mean ephemeral storage, which doesn't provide any guarantees. So it's best to make it explicit here.

Contributor Author:

So I am checking the Backup CR name, and the periodic backup controller ignores the CR with the name default.

This way we have both approaches alongside each other, and they both work independently.

### Non-Goals

* Save cluster backups to remote cloud storage (e.g. S3 Bucket).
- This could be a future enhancement or extension to the API.
Member:

As discussed in one of our calls, the ability to guarantee persistent storage for any supported cluster installation is very small. So I wouldn't even call out a future extension. It's the cluster admin's role to ensure that the PVCName is backed by solid storage.


### API Extensions

No [API](https://github.com/openshift/enhancements/blob/master/enhancements/etcd/automated-backups.md#api-extensions) changes are required, since this approach works independently with a default config.
Member:

I believe you'll need to introduce the defaults in the API, no?

Contributor Author:

So I will create a default CR in CEO; I will not use kubebuilder markers on the API:

```yaml
apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
  name: default
  annotations:
    default: "true"
spec:
  etcd:
    schedule: "20 4 * * *"          # cron schedule: daily at 04:20
    timeZone: "UTC"
    retentionPolicy:
      retentionType: RetentionNumber
      retentionNumber:
        maxNumberOfBackups: 5       # keep at most five backups
```

Contributor:

When you create this default CR, is this going to be in all clusters? Are we happy doing this for existing clusters?

Would it be better for the installer to create this object so that it's only created on new clusters? Con: We can't manage it to change it on existing clusters (though we probably don't want to)


Since the `SideCar` is being deployed alongside each etcd cluster member, it is possible to keep backups across all master nodes.

On the other hand, the backups may **not** be up-to-date since the snapshot might be lagging behind the `WAL`. Therefore, it is recommended to use this approach alongside the Automated Backup enabled using the `EtcdBackup` CR.
Since this work will be enabled with no configuration, it is possible to define default values for the `Schedule` and `Retention` independently.
Member:

It looks like this is completely different from what I remember we talked about last time. I'd like you to change the initial part of this document to clearly explain this mechanism is orthogonal to the user-configured backups.


### Open Questions

This enhancement builds on the previous work on [automated backup of etcd](https://docs.openshift.com/container-platform/4.15/backup_and_restore/control_plane_backup_and_restore/backing-up-etcd.html#creating-automated-etcd-backups_backup-etcd).
Member:

Based on the above statements they are two separate mechanisms, are they not?

@Elbehery force-pushed the etcd-automated-backup-no-config branch from 2c56179 to 9a34c82 on July 30, 2024 at 19:48
@Elbehery force-pushed the etcd-automated-backup-no-config branch from 9a34c82 to 57ed843 on July 30, 2024 at 20:34

openshift-ci bot commented Jul 30, 2024

@Elbehery: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


## Proposal

To enable automated etcd backups of an Openshift cluster, a default backup is to be taken using a `SideCar` container within each etcd pod, with no configuration from the user.
Contributor:

Does this document describe how the existing backups are taken? Or is there something that can be read to understand how that works today?




* How to distinguish between `NoConfig` backups and backups that are triggered using the `EtcdBackup` CR.
* The `NoConfig` backups rely on an `EtcdBackup` CR with the name `default`.
* The cluster-etcd-operator reacts to the `default` CR by deploying the backup sidecar containers alongside each etcd member.
Contributor:

Does it need to be a sidecar? Is it going to run always? Could it not be a job? (I assume the existing backups happen via a job?)



* Upon enabling the `AutomatedBackup` feature gate, which approach should be used, and according to what criteria?
* As the `NoConfig` backups are orthogonal to the [automated backup of etcd](https://docs.openshift.com/container-platform/4.15/backup_and_restore/control_plane_backup_and_restore/backing-up-etcd.html#creating-automated-etcd-backups_backup-etcd), it has been decided to use the same feature gate.
Contributor:

That means you have to get both features ready and tested before either can be promoted. Strongly advise using a separate gate.



* How to disable the `NoConfig` behaviour, if a user wants to.
* This has not been decided or implemented yet.
Contributor:

If the installer created the object, and it was unmanaged, the user would just delete it


### Local Storage

Relying on local storage works for all Openshift variants (i.e. cloud-based, SNO, and BM). There are three possible approaches using local storage; each is detailed below.
Contributor:

On SNO, this doesn't seem particularly helpful. What is the recommendation going to be for SNO users?

* Pros
- It supports all Openshift variants, including `SNO` and `BM`.
* Cons
- `hostPath` could have security impact as it exposes the node's filesystem.
Contributor:

What about the other way around, are backups encrypted? What happens if someone gets hold of a backup by plucking it from a hostpath?

- It supports all Openshift variants, including `SNO` and `BM`.
* Cons
- `hostPath` could have security impact as it exposes the node's filesystem.
- No scheduling guarantees for the pod when using `hostPath`, unlike with `localVolume`. The pod could be scheduled on a different node from where the hostPath volume exists.
Contributor:

If you have a particular want, you could schedule the pods using the CEO to guarantee where they end up


As shown above, dynamic provisioning is the best storage solution to use, but it is not viable across all Openshift variants.
However, a hybrid solution is possible, in which dynamic provisioning is used on cloud-based Openshift, while local storage is utilised for BM and SNO.
Relying on the `Infrastructure` resource type, we can create the storage programmatically according to the underlying infrastructure and Openshift variant.
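For reference, the platform and topology information referred to here lives on the cluster-scoped `Infrastructure` resource; a trimmed illustration of its shape (the values shown are examples):

```yaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
status:
  controlPlaneTopology: HighlyAvailable     # SingleReplica on SNO
  infrastructureTopology: HighlyAvailable
  platformStatus:
    type: AWS                               # e.g. AWS, GCP, BareMetal, None
```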
Contributor:

So you plan to create cloud based PVs on the clouds?

### StatefulSet approach

A `StatefulSet` could be deployed among all master nodes, where each backup pod has its own `PV`. This approach has the advantage of spreading the backups among all master nodes.
The complexity comes from the fact that the backups are triggered by a `CronJob`, which spawns a `Job` to take the actual backup by deploying a Pod.
Contributor:

CEO could add the nodeName to the podSpec in the Job when it creates it and handle this
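A small sketch of that idea, assuming the CEO stamps the target node into the Job it renders (names, image, and the backup command are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: etcd-backup-master-0          # placeholder name
  namespace: openshift-etcd
spec:
  template:
    spec:
      nodeName: master-0              # pinned by the operator when it creates the Job
      restartPolicy: Never
      containers:
      - name: backup
        image: quay.io/example/etcd-tools:latest   # placeholder image
        command: ["cluster-backup.sh", "/backup"]  # hypothetical backup command
        volumeMounts:
        - name: backup-dir
          mountPath: /backup
      volumes:
      - name: backup-dir
        hostPath:
          path: /var/lib/etcd-backup               # hypothetical host directory
          type: DirectoryOrCreate
```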

Labels: jira/valid-reference, tide/merge-method-squash

8 participants