
Improve failure recovery during online upgrade #857

Closed
wants to merge 86 commits

Conversation

roypaulin
Collaborator

With this:

  • The operator waits until all replica subclusters are sandboxed
  • Makes sure there is an initiator in both the main cluster and the sandbox before replication
  • Waits for all the nodes in the sandbox to be up before promotion (see the sketch below)
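A minimal sketch of the gating idea described above, assuming a simplified node model (the `SandboxNode` type and its fields are illustrative, not the operator's podfacts API): promotion only proceeds once every replica node is both sandboxed and up.

```go
// A simplified, illustrative model; not the operator's podfacts API.
package main

import "fmt"

type SandboxNode struct {
	Name      string
	Sandboxed bool
	Up        bool
}

// readyForPromotion gates sandbox promotion: every replica node must already
// be sandboxed and reporting up, mirroring the checks described in this PR.
func readyForPromotion(nodes []SandboxNode) bool {
	if len(nodes) == 0 {
		return false
	}
	for _, n := range nodes {
		if !n.Sandboxed || !n.Up {
			return false
		}
	}
	return true
}

func main() {
	nodes := []SandboxNode{
		{Name: "v-db-sc1-0", Sandboxed: true, Up: true},
		{Name: "v-db-sc1-1", Sandboxed: true, Up: false},
	}
	fmt.Println(readyForPromotion(nodes)) // false: one node is still down
}
```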

spilchen and others added 30 commits March 21, 2024 14:37
This pulls in the latest vclusterops library changes. Many of the
changes that we are pulling in were changes related to improving the
vcluster CLI.
This pulls in the new signature for VStartDatabase. A new parameter was
returned, which we can safely ignore for the operator's usage.
During a replicated upgrade, we must categorize the subclusters into one
of two replica groups. Initially, all subclusters participating in the
upgrade will be placed in replica group A. Subsequently, new subclusters
will be generated throughout the process and assigned to the second
group, named replica group B. This change manages the assignment of
subclusters at the beginning of the upgrade process. To facilitate this
change, new status fields will be introduced to the VerticaDB, enabling
the tracking of this state across reconcile iterations.
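As a rough illustration of the assignment step, here is a hedged sketch with made-up status field names (the real VerticaDB status fields differ): existing subclusters go into replica group A exactly once, and the result persists across reconcile iterations.

```go
// Illustrative only: the status field names are assumptions.
package main

import "fmt"

type UpgradeStatus struct {
	ReplicaGroupA []string // subclusters that existed when the upgrade started
	ReplicaGroupB []string // subclusters created later in the process
}

// assignInitialGroups places every pre-existing subcluster into replica group A,
// but only once, so repeated reconcile iterations do not duplicate the list.
func assignInitialGroups(subclusters []string, st *UpgradeStatus) {
	if len(st.ReplicaGroupA) > 0 {
		return // already recorded on a previous reconcile iteration
	}
	st.ReplicaGroupA = append(st.ReplicaGroupA, subclusters...)
}

func main() {
	st := &UpgradeStatus{}
	assignInitialGroups([]string{"main", "sc1"}, st)
	assignInitialGroups([]string{"main", "sc1", "sc2"}, st) // no-op: already assigned
	fmt.Println(st.ReplicaGroupA)                           // [main sc1]
}
```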
This adds a new controller for sandboxes. It will watch a ConfigMap that
carries labels indicating it contains state for a sandbox. That ConfigMap
also contains the name of the VerticaDB whose subclusters are involved in
sandbox/unsandbox operations. For now it simply reads the VerticaDB found in
the ConfigMap and does nothing else.
Some changes are also made to the VerticaDB API to add new fields
related to sandboxes.
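A hedged sketch of what watching a labeled ConfigMap could look like with controller-runtime; the label key, data key, and reconciler name are assumptions for illustration, not the operator's actual code.

```go
// A minimal sketch (not the operator's actual controller) of reconciling only
// ConfigMaps that carry a sandbox label.
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

type SandboxConfigMapReconciler struct {
	client.Client
}

func (r *SandboxConfigMapReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	cm := &corev1.ConfigMap{}
	if err := r.Get(ctx, req.NamespacedName, cm); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// The ConfigMap is expected to name the VerticaDB whose subclusters are
	// being sandboxed/unsandboxed; the data key here is illustrative.
	_ = cm.Data["verticaDBName"]
	return ctrl.Result{}, nil
}

func (r *SandboxConfigMapReconciler) SetupWithManager(mgr ctrl.Manager) error {
	// Only ConfigMaps labeled as holding sandbox state trigger reconciles.
	// The label key is an assumption.
	hasSandboxLabel := predicate.NewPredicateFuncs(func(obj client.Object) bool {
		_, ok := obj.GetLabels()["vertica.com/sandbox"]
		return ok
	})
	return ctrl.NewControllerManagedBy(mgr).
		For(&corev1.ConfigMap{}, builder.WithPredicates(hasSandboxLabel)).
		Complete(r)
}
```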
This PR eliminates labels from the `VerticaReplicator` sample config.
The intention is to enhance the user experience on OpenShift when
generating this CR from the webUI. OpenShift utilizes this sample to
prepopulate fields in the webUI interface.
This adds a few webhook rules for replicated upgrade. It's probably not
a complete list, but we can always add more later when we find rules to
check. This commit adds the following:
- ensures the subclusters involved in the upgrade are stable. They cannot be
removed and their sizes can't change (a hedged sketch of this rule follows below).
- We do allow new subclusters to be added to one of the replica groups,
but they must be secondary subclusters. No new primaries.
- ensures that subclusters listed in
`.status.updateState.replicaGroups` aren't repeated.
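Here is a hedged sketch of the stability rule, using simplified stand-in types rather than the actual VerticaDB webhook code: subclusters already participating in the upgrade may not be removed or resized.

```go
// Simplified stand-in types; not the real VerticaDB webhook.
package main

import "fmt"

type Subcluster struct {
	Name string
	Size int32
}

// validateStableSubclusters rejects removals or size changes for subclusters
// already participating in the upgrade; brand-new subclusters are allowed.
func validateStableSubclusters(oldScs, newScs []Subcluster) error {
	newByName := map[string]Subcluster{}
	for _, sc := range newScs {
		newByName[sc.Name] = sc
	}
	for _, old := range oldScs {
		cur, found := newByName[old.Name]
		if !found {
			return fmt.Errorf("subcluster %q cannot be removed during upgrade", old.Name)
		}
		if cur.Size != old.Size {
			return fmt.Errorf("subcluster %q cannot change size during upgrade", old.Name)
		}
	}
	return nil
}

func main() {
	oldScs := []Subcluster{{Name: "main", Size: 3}}
	newScs := []Subcluster{{Name: "main", Size: 4}}
	fmt.Println(validateStableSubclusters(oldScs, newScs)) // size-change error
}
```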
We now collect the sandbox name in podfacts. We updated the vsql query
to return the extra state. We built an interface to get the information
as we intend to use a REST endpoint in the future but still need support
via vsql for older releases.
This adds a new leg, e2e-leg-9, to the CI. This is going to be used to
test changes done in 24.3.0. One new change with this is that it will
use a vertica license so that we can test scaling past 3 nodes. This
will be required in order to test out the new replicated upgrade work.
#764)

`vrep` reconciler enforces the minimum source and target db versions to
be at least 24.3.0

---------

Co-authored-by: Roy Paulin <rnguetsopken@opentext.com>
Co-authored-by: spilchen <mspilchen@opentext.com>
This PR adds the following webhook rules for sandboxes in the CRD:
- cannot scale (up or down) any subcluster that is in a sandbox
- cannot remove a subcluster that is sandboxed. It must be unsandboxed
first.
- cannot have multiple sandboxes with the same name
- cannot have the image of a sandbox differ from the main cluster's before
the sandbox has been set up
- cannot be used on versions older than 24.3.0
- cannot be used before the database has been initialized
- cannot have duplicate subclusters defined in a sandbox
- cannot have a subcluster defined in multiple sandboxes
- cannot have a non-existing subcluster defined in a sandbox

We can add more rules for sandboxes later as needs arise.
We want to use the restart reconciler in both the VerticaDB controller and
the sandbox controller. This makes some changes to the restart reconciler in
order to achieve that. The sandbox name (empty string if the VerticaDB
controller) is passed to the reconciler, which uses it to target only pods
belonging to that sandbox (or only pods belonging to the main cluster if the
sandbox name is empty); see the sketch below.
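A minimal sketch of the sandbox-aware filtering, assuming a simplified `PodFact` type (the operator's real podfacts structure is richer): an empty sandbox name selects main-cluster pods only.

```go
// Simplified PodFact type for illustration only.
package main

import "fmt"

type PodFact struct {
	Name    string
	Sandbox string // "" means the pod belongs to the main cluster
}

// podsForTarget keeps only the pods belonging to the requested sandbox
// (or to the main cluster when sandboxName is empty).
func podsForTarget(pods []PodFact, sandboxName string) []PodFact {
	out := make([]PodFact, 0, len(pods))
	for _, p := range pods {
		if p.Sandbox == sandboxName {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []PodFact{
		{Name: "v-db-main-0", Sandbox: ""},
		{Name: "v-db-sc1-0", Sandbox: "replica-group-b"},
	}
	fmt.Println(podsForTarget(pods, "")) // only the main-cluster pod
}
```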
This change adds stubs to the replicated upgrade reconciler. It attempts
to have the functions present that are needed to do the upgrade. Due to
dependencies, such as sandboxing and replication, comprehensive testing
is currently limited to unit tests. While this logic might evolve during
integration, it gives a framework for how the upgrade should function.

Additionally, I identified new requirements for state management. Two
new annotations have been introduced to the VerticaDB (a small getter sketch follows the list):
- vertica.com/replicated-upgrade-sandbox: to record the name of the
sandbox created for the upgrade
- vertica.com/replicated-upgrade-replicator-name: to track the name of
the VerticaReplicator CR utilized for replicating data to the sandbox.
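For illustration, a hedged sketch of reading these annotations back during reconciliation; the helper function is a stand-in, not the operator's vmeta package, though the annotation keys are the ones named above.

```go
// The getter is illustrative; only the annotation keys come from the PR text.
package main

import "fmt"

const (
	ReplicatedUpgradeSandboxAnnotation    = "vertica.com/replicated-upgrade-sandbox"
	ReplicatedUpgradeReplicatorAnnotation = "vertica.com/replicated-upgrade-replicator-name"
)

// getAnnotation returns the annotation value, or a default when unset, so the
// upgrade reconciler can resume from saved state across iterations.
func getAnnotation(annotations map[string]string, key, def string) string {
	if v, ok := annotations[key]; ok {
		return v
	}
	return def
}

func main() {
	ann := map[string]string{ReplicatedUpgradeSandboxAnnotation: "replica-group-b"}
	fmt.Println(getAnnotation(ann, ReplicatedUpgradeSandboxAnnotation, ""))
	fmt.Println(getAnnotation(ann, ReplicatedUpgradeReplicatorAnnotation, "<none>"))
}
```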
This PR adds a new webhook rule for sandboxes in CRD: cannot have a
primary subcluster defined in a sandbox.
This adds a new fetcher in podfacts to fetch node details from the
running database inside the pod. The new fetcher calls the VClusterOps
API, which sends an HTTP request to the database. It is enabled when the
vclusterops annotation is set and the Vertica server version is not older
than v24.3.0. The new fetcher should be faster and more reliable than the
old fetcher, which executes vsql within the pod (see the sketch below).
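A hedged sketch of how the fetcher choice could be expressed; the types and the decision inputs are simplified assumptions (the real code consults the deployment annotation and a parsed server version).

```go
// Illustrative types; the operator's podfacts fetchers are more involved.
package main

import "fmt"

type nodeDetailsFetcher interface {
	Name() string
}

type vsqlNodeFetcher struct{}

func (vsqlNodeFetcher) Name() string { return "vsql" }

type httpNodeFetcher struct{}

func (httpNodeFetcher) Name() string { return "vclusterops-http" }

// pickFetcher prefers the HTTP-based fetcher only when the deployment uses
// vclusterops and the server is at least 24.3.0; otherwise it falls back to
// running vsql inside the pod.
func pickFetcher(useVClusterOps, serverAtLeast2430 bool) nodeDetailsFetcher {
	if useVClusterOps && serverAtLeast2430 {
		return httpNodeFetcher{}
	}
	return vsqlNodeFetcher{}
}

func main() {
	fmt.Println(pickFetcher(true, true).Name())  // vclusterops-http
	fmt.Println(pickFetcher(true, false).Name()) // vsql on older servers
}
```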
In the replicated upgrade process, we must pause and redirect client
connections from subclusters in replica group A to those in replica
group B. This is the initial stage of the change. Currently,
pause/redirect semantics are not supported, as they require new server
modifications that have not yet been implemented. Therefore, we will
perform an old-style drain to pause the process and maintain service
labels correctly to point to subclusters in replica group B for
redirection.
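To illustrate the redirection half, here is a hedged sketch that rewrites a client Service's selector to point at a replica group B subcluster; the label key and subcluster names are assumptions, not the operator's actual service labels.

```go
// The label key and names below are assumptions for illustration.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// redirectService rewrites the Service selector so that new client connections
// land on the replica group B subcluster; existing sessions are drained
// separately (the old-style drain mentioned above).
func redirectService(svc *corev1.Service, targetSubcluster string) {
	if svc.Spec.Selector == nil {
		svc.Spec.Selector = map[string]string{}
	}
	svc.Spec.Selector["vertica.com/subcluster-name"] = targetSubcluster // assumed label key
}

func main() {
	svc := &corev1.Service{}
	svc.Spec.Selector = map[string]string{"vertica.com/subcluster-name": "sc1"}
	redirectService(svc, "sc1-sb") // hypothetical replica group B counterpart
	fmt.Println(svc.Spec.Selector)
}
```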

---------

Co-authored-by: Roy Paulin <rnguetsopken@opentext.com>
This change will modify the upgrade reconcilers, both offline and
online, to function in either the main cluster or a sandbox. Our
immediate plan is to use the offline upgrade reconciler within the
sandbox controller, although this will be done in a follow-on task.
This pull request modifies the Vertica replicator controller to invoke
the replication command synchronously. Additionally, it introduces new
status conditions within VerticaReplicator to monitor the controller's
various states.

---------

Co-authored-by: spilchen <mspilchen@opentext.com>
If we had selected an upgrade method but ended up falling back to a
different method because something was incompatible, we now log
an event. For instance, if we requested something other than offline
upgrade but end up in the offline upgrade code path, we will have an
event message like this:
```
 Normal   IncompatibleUpgradeRequested  7m20s                  verticadb-operator  Requested upgrade is incompatible with the Vertica deployment. Falling back to offline upgrade.
```
This PR adds sandbox reconciler to VerticaDB controller. The reconciler
will add subclusters to sandboxes in the database. The change of sandbox
fields in CRD will trigger sandbox reconciler, then the reconciler will
call vclusterOps to add subclusters to sandboxes.
This will add a status message to the `.status.upgradeStatus` field in
the VerticaDB as the operator goes through the various stages of
replicated upgrade. It reuses the same design as the other upgrade
methods in that we only advance the status message forward. This means
that if we need to reconcile another iteration, we won't report an earlier
status message again (see the sketch below).
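A minimal sketch of the forward-only rule, with an illustrative message list (not the operator's actual status strings): a repeated reconcile iteration never moves the reported stage backwards.

```go
// The message list is illustrative, not the operator's actual status strings.
package main

import "fmt"

var upgradeMsgs = []string{
	"Starting online upgrade",
	"Sandboxing replica group B",
	"Replicating data to sandbox",
	"Promoting sandbox to main",
}

// nextStatusIndex never moves backwards, so a repeated reconcile iteration
// does not re-report an earlier stage.
func nextStatusIndex(current, proposed int) int {
	if proposed > current {
		return proposed
	}
	return current
}

func main() {
	cur := 2
	cur = nextStatusIndex(cur, 1) // stays at 2: never go back
	fmt.Println(upgradeMsgs[cur])
}
```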
Automated changes by pre-release.yml GitHub action

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Matt Spilchen <mspilchen@opentext.com>
Co-authored-by: spilchen <spilchen@users.noreply.github.com>
cchen-vertica and others added 9 commits June 25, 2024 12:21
The latest vclusterOps improves the promote_sandbox API to clean
communal storage after sandbox promotion. This PR simply updates
vclusterOps library to clean communal storage in the online upgrade.
After promoting the sandbox to main, we need to delete the sandbox config map
to avoid any kind of conflicts, as the sandbox does not exist anymore.
The new upgrade policy will be `Online` instead of `Replicated`. What
was formerly called `Online` is now the `ReadOnlyOnline` upgrade.
The latest vclusterops fixes a bug that would incorrectly trim nodes from
sandbox clusters when --node-names is specified and those nodes are not
included in it.
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Matt Spilchen <mspilchen@opentext.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: spilchen <spilchen@users.noreply.github.com>
Co-authored-by: roypaulin <roypaulin@users.noreply.github.com>
Because we cannot reuse a sandbox name after promotion, a UUID is
appended for each online upgrade run. The operator will first try to get
the sandbox name from an annotation (useful for testing). If the
annotation is not set, it will build a name that contains a UUID (see the sketch below).
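A hedged sketch of the naming logic; the annotation key and name prefix are assumptions, and `uuid.NewUUID` from apimachinery stands in for whatever the operator actually uses to generate the unique suffix.

```go
// The annotation key and prefix are assumptions for illustration.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/uuid"
)

const sandboxNameAnnotation = "vertica.com/online-upgrade-sandbox-name" // assumed key

// pickSandboxName prefers an explicit annotation override (handy for tests);
// otherwise it builds a name with a fresh UUID so it is never reused after
// the sandbox has been promoted.
func pickSandboxName(annotations map[string]string) string {
	if name, ok := annotations[sandboxNameAnnotation]; ok && name != "" {
		return name
	}
	return fmt.Sprintf("replica-group-b-%s", uuid.NewUUID())
}

func main() {
	fmt.Println(pickSandboxName(nil))                                             // generated, unique per upgrade run
	fmt.Println(pickSandboxName(map[string]string{sandboxNameAnnotation: "sb1"})) // override wins
}
```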
…ade (#854)

When you do back-to-back online upgrades, the second time the new sts names
contain the names of the subclusters they mimic. If any of those names
contains an underscore, the sts name will be invalid. This PR converts
underscores to hyphens to prevent that (see the sketch below).
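A minimal sketch of the sanitization, under the assumption that a simple character substitution (plus lowercasing) is all that is needed; the real PR may normalize names further.

```go
// A simplified version of the conversion described in the PR.
package main

import (
	"fmt"
	"strings"
)

// sanitizeForStsName makes a subcluster name safe to embed in a StatefulSet
// name: Kubernetes object names cannot contain underscores or uppercase, so
// convert underscores to hyphens and lowercase the result.
func sanitizeForStsName(subclusterName string) string {
	return strings.ToLower(strings.ReplaceAll(subclusterName, "_", "-"))
}

func main() {
	fmt.Println(sanitizeForStsName("replica_group_B")) // replica-group-b
}
```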
// and the sandbox, to perform replication.
// Once we start using services for replication, we will check only the scs served by the services.
func (r *OnlineUpgradeReconciler) prepareReplication(ctx context.Context) (ctrl.Result, error) {
	// Skip if the VerticaReplicator has already been created.
Collaborator

At the beginning of the function, add a check for whether we already promoted the sandbox:
if vmeta.GetOnlineUpgradeSandboxPromoted(r.VDB.Annotations) == vmeta.SandboxPromotedTrue {
	return ctrl.Result{}, nil
}

}
	// we need fresh podfacts just in case some pods went down while
	// sandbox upgrade was in progress.
	r.PFacts[vapi.MainCluster].Invalidate()
Collaborator

Should we invalidate the sandbox pfacts too? We don't know how long the connection draining will take. If the sandbox died after the upgrade but before replication, replication could fail, and then we would not do replication any more.

Collaborator Author

I already did it. Check out getSandboxPodFacts().

roypaulin and others added 5 commits July 16, 2024 18:24
This PR fixes a re-ip error when restarting the cluster by checking that all
pods are up and NMA is running in all pods. Without the check, vclusterOps
re-ip could fail when NMA is not running or half of the primary nodes
are not up (see the sketch below).
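A hedged sketch of such a pre-flight gate, with illustrative types rather than the operator's podfacts: re-ip is skipped (and the reconcile requeued) until every pod is running and its NMA is reachable.

```go
// Illustrative types; the operator's podfacts carry much more detail.
package main

import "fmt"

type pod struct {
	Name       string
	Running    bool
	NMAHealthy bool
}

// readyForReIP returns an error naming the first pod that would make
// vclusterops re-ip fail, so the reconciler can requeue instead of calling it.
func readyForReIP(pods []pod) error {
	for _, p := range pods {
		if !p.Running {
			return fmt.Errorf("pod %s is not running yet", p.Name)
		}
		if !p.NMAHealthy {
			return fmt.Errorf("NMA is not ready in pod %s", p.Name)
		}
	}
	return nil
}

func main() {
	pods := []pod{{Name: "v-base-upgrade-main-0", Running: true, NMAHealthy: false}}
	fmt.Println(readyForReIP(pods)) // NMA is not ready in pod v-base-upgrade-main-0
}
```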
kind: TestStep
commands:
  - command: kubectl delete pod v-base-upgrade-main-0 v-base-upgrade-main-1 v-base-upgrade-main-2
    namespaced: true
Collaborator

We should add one more step (37) after this one to verify the pods in the main cluster are restarted.

func GetOnlineUpgradeStepInx(annotations map[string]string) int {
	return lookupIntAnnotation(annotations, OnlineUpgradeStepInxAnnotation, 0)
}

// GetOnlineUpgradeReplicaARemoved returns if replica A has been removed in online upgrade.
func GetOnlineUpgradeReplicaARemoved(annotations map[string]string) string {
Collaborator

This function should be removed.

// During online upgrade, this annotation stores steps that we have
// already passed. This will allow us to skip them.
OnlineUpgradeStepInxAnnotation = "vertica.com/online-upgrade-step-index"

// During online upgrade, we store an annotation in the VerticaDB to indicate
// that we have removed old-main-cluster/replica-group-A.
OnlineUpgradeReplicaARemovedAnnotation = "vertica.com/online-upgrade-replica-A-removed"
Collaborator

Remove this annotation.

@@ -272,6 +272,10 @@ const (
SandboxPromotedTrue = "true"
Collaborator
@cchen-vertica cchen-vertica Jul 22, 2024

Remove the annotation above this line.

}

// runAddNodesReconcilerForMainCluster will run the reconciler to scale out any subclusters.
func (r *OnlineUpgradeReconciler) runAddNodesReconcilerForMainCluster(ctx context.Context) (ctrl.Result, error) {
	// If we have already promoted sandbox to main, we don't need to touch old main cluster
	if r.VDB.IsOnlineUpgradeSandboxPromoted() {
		if vmeta.GetOnlineUpgradeStepInx(r.VDB.Annotations) > addNodeInx {
Collaborator

Update the comment for this line of code.

@@ -1149,7 +1205,7 @@ func (r *OnlineUpgradeReconciler) removeReplicaGroupA(ctx context.Context) (ctrl
if verrors.IsReconcileAborted(res, err) {
Collaborator

At the beginning of this function, add the condition:
if vmeta.GetOnlineUpgradeStepInx(r.VDB.Annotations) > removeReplicaGroupA {
	return ctrl.Result{}, nil
}
This prevents the index from becoming larger than 4.

@cchen-vertica
Collaborator

Looks good to me. After the revive db issue is fixed, I will approve this one.

@CLAassistant

CLAassistant commented Jul 25, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
5 out of 6 committers have signed the CLA.

✅ jizhuoyu
✅ chinhtranvan
✅ cchen-vertica
✅ roypaulin
✅ fenic-fawkes
❌ spilchen


spilchen does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Base automatically changed from vnext to main July 26, 2024 18:57
@roypaulin
Collaborator Author

> Looks good to me. After the revive db issue is fixed, I will approve this one.

Can you approve this?

Collaborator
@cchen-vertica cchen-vertica left a comment

Why are there 86 commits in this PR?

@roypaulin roypaulin closed this Jul 29, 2024
roypaulin added a commit that referenced this pull request Jul 29, 2024
With this:
- The operator waits until all replica subclusters are sandboxed
- Makes sure there is an initiator in both main cluster and sandbox
before replication
- Waits for all the nodes in sandbox to be up before promotion



Conversation happened in
#857
Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues. None yet

6 participants