Add stateful application failover status injection feature gate #5897

Merged
2 changes: 1 addition & 1 deletion api/openapi-spec/swagger.json
@@ -18291,7 +18291,7 @@
"type": "string"
},
"statePreservation": {
- "description": "StatePreservation defines the policy for preserving and restoring state data during failover events for stateful applications.\n\nWhen an application fails over from one cluster to another, this policy enables the extraction of critical data from the original resource configuration. Upon successful migration, the extracted data is then re-injected into the new resource, ensuring that the application can resume operation with its previous state intact. This is particularly useful for stateful applications where maintaining data consistency across failover events is crucial. If not specified, means no state data will be preserved.",
+ "description": "StatePreservation defines the policy for preserving and restoring state data during failover events for stateful applications.\n\nWhen an application fails over from one cluster to another, this policy enables the extraction of critical data from the original resource configuration. Upon successful migration, the extracted data is then re-injected into the new resource, ensuring that the application can resume operation with its previous state intact. This is particularly useful for stateful applications where maintaining data consistency across failover events is crucial. If not specified, means no state data will be preserved.\n\nNote: This requires the StatefulFailoverInjection feature gate to be enabled, which is alpha.",
"$ref": "#/definitions/com.github.karmada-io.karmada.pkg.apis.policy.v1alpha1.StatePreservation"
}
}
2 changes: 1 addition & 1 deletion artifacts/deploy/karmada-controller-manager.yaml
@@ -30,7 +30,7 @@ spec:
- --cluster-status-update-frequency=10s
- --failover-eviction-timeout=30s
- --controllers=*,hpaScaleTargetMarker,deploymentReplicasSyncer
- - --feature-gates=PropagationPolicyPreemption=true,MultiClusterService=true
+ - --feature-gates=PropagationPolicyPreemption=true,MultiClusterService=true,StatefulFailoverInjection=true
- --health-probe-bind-address=0.0.0.0:10357
- --v=4
livenessProbe:
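
The `--feature-gates` value changed above is a comma-separated list of `Name=bool` pairs. As a rough illustration of that format only, here is a stdlib-only sketch; `parseFeatureGates` is a hypothetical helper written for this note and is not how karmada-controller-manager actually parses the flag.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFeatureGates splits a value such as "A=true,B=false" into a map.
// Hypothetical helper for illustration; not taken from Karmada.
func parseFeatureGates(value string) (map[string]bool, error) {
	gates := make(map[string]bool)
	for _, pair := range strings.Split(value, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue // tolerate trailing commas and empty values
		}
		name, raw, found := strings.Cut(pair, "=")
		if !found {
			return nil, fmt.Errorf("missing '=' in %q", pair)
		}
		enabled, err := strconv.ParseBool(strings.TrimSpace(raw))
		if err != nil {
			return nil, fmt.Errorf("invalid value for %q: %w", name, err)
		}
		gates[strings.TrimSpace(name)] = enabled
	}
	return gates, nil
}

func main() {
	gates, err := parseFeatureGates("PropagationPolicyPreemption=true,MultiClusterService=true,StatefulFailoverInjection=true")
	if err != nil {
		panic(err)
	}
	fmt.Println(gates["StatefulFailoverInjection"])
}
```

A gate that is absent from the flag keeps its registered default, which for `StatefulFailoverInjection` is `false` according to `pkg/features/features.go` in this PR.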
Original file line number Diff line number Diff line change
@@ -177,6 +177,9 @@ spec:
This is particularly useful for stateful applications where maintaining data
consistency across failover events is crucial.
If not specified, means no state data will be preserved.
+
+ Note: This requires the StatefulFailoverInjection feature gate to be enabled,
+ which is alpha.
properties:
rules:
description: |-
Original file line number Diff line number Diff line change
@@ -174,6 +174,9 @@ spec:
This is particularly useful for stateful applications where maintaining data
consistency across failover events is crucial.
If not specified, means no state data will be preserved.
+
+ Note: This requires the StatefulFailoverInjection feature gate to be enabled,
+ which is alpha.
properties:
rules:
description: |-
Original file line number Diff line number Diff line change
@@ -339,6 +339,9 @@ spec:
This is particularly useful for stateful applications where maintaining data
consistency across failover events is crucial.
If not specified, means no state data will be preserved.
+
+ Note: This requires the StatefulFailoverInjection feature gate to be enabled,
+ which is alpha.
properties:
rules:
description: |-
Original file line number Diff line number Diff line change
@@ -339,6 +339,9 @@ spec:
This is particularly useful for stateful applications where maintaining data
consistency across failover events is crucial.
If not specified, means no state data will be preserved.
+
+ Note: This requires the StatefulFailoverInjection feature gate to be enabled,
+ which is alpha.
properties:
rules:
description: |-
3 changes: 3 additions & 0 deletions pkg/apis/policy/v1alpha1/propagation_types.go
@@ -336,6 +336,9 @@ type ApplicationFailoverBehavior struct {
// This is particularly useful for stateful applications where maintaining data
// consistency across failover events is crucial.
// If not specified, means no state data will be preserved.
+ //
+ // Note: This requires the StatefulFailoverInjection feature gate to be enabled,
+ // which is alpha.
// +optional
StatePreservation *StatePreservation `json:"statePreservation,omitempty"`
}
26 changes: 14 additions & 12 deletions pkg/controllers/applicationfailover/common.go
@@ -192,18 +192,20 @@ func buildTaskOptions(failoverBehavior *policyv1alpha1.ApplicationFailoverBehavi
taskOpts = append(taskOpts, workv1alpha2.WithReason(workv1alpha2.EvictionReasonApplicationFailure))
taskOpts = append(taskOpts, workv1alpha2.WithPurgeMode(failoverBehavior.PurgeMode))

- if failoverBehavior.StatePreservation != nil && len(failoverBehavior.StatePreservation.Rules) != 0 {
- 	targetStatusItem, exist := findTargetStatusItemByCluster(aggregatedStatus, cluster)
- 	if !exist || targetStatusItem.Status == nil || targetStatusItem.Status.Raw == nil {
- 		return nil, fmt.Errorf("the application status has not yet been collected from Cluster(%s)", cluster)
- 	}
- 	preservedLabelState, err := buildPreservedLabelState(failoverBehavior.StatePreservation, targetStatusItem.Status.Raw)
- 	if err != nil {
- 		return nil, err
- 	}
- 	if preservedLabelState != nil {
- 		taskOpts = append(taskOpts, workv1alpha2.WithPreservedLabelState(preservedLabelState))
- 		taskOpts = append(taskOpts, workv1alpha2.WithClustersBeforeFailover(clustersBeforeFailover))
+ if features.FeatureGate.Enabled(features.StatefulFailoverInjection) {
+ 	if failoverBehavior.StatePreservation != nil && len(failoverBehavior.StatePreservation.Rules) != 0 {
+ 		targetStatusItem, exist := findTargetStatusItemByCluster(aggregatedStatus, cluster)
+ 		if !exist || targetStatusItem.Status == nil || targetStatusItem.Status.Raw == nil {
+ 			return nil, fmt.Errorf("the application status has not yet been collected from Cluster(%s)", cluster)
+ 		}
+ 		preservedLabelState, err := buildPreservedLabelState(failoverBehavior.StatePreservation, targetStatusItem.Status.Raw)
+ 		if err != nil {
+ 			return nil, err
+ 		}
+ 		if preservedLabelState != nil {
+ 			taskOpts = append(taskOpts, workv1alpha2.WithPreservedLabelState(preservedLabelState))
+ 			taskOpts = append(taskOpts, workv1alpha2.WithClustersBeforeFailover(clustersBeforeFailover))
+ 		}
  	}
  }
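
One consequence of wrapping the whole block in the gate check is that, with the gate off, the "status has not yet been collected" error path is skipped as well, so eviction options are still built. The control flow can be sketched with simplified stand-ins. Everything below (`statePreservation`, `taskOptions`, `buildOptions`, plain map keys instead of the real rules evaluated against raw status) is hypothetical and only mirrors the shape of the change.

```go
package main

import (
	"errors"
	"fmt"
)

// statePreservation and taskOptions are hypothetical stand-ins for the Karmada types.
type statePreservation struct {
	Rules []string // simplified: each rule is just a key to copy from the status
}

type taskOptions struct {
	PurgeMode       string
	PreservedLabels map[string]string
}

var errStatusNotCollected = errors.New("application status has not yet been collected")

// buildOptions always records the purge mode, but only tries to preserve state
// when the gate is enabled and at least one rule is configured.
func buildOptions(gateEnabled bool, sp *statePreservation, status map[string]string, purgeMode string) (*taskOptions, error) {
	opts := &taskOptions{PurgeMode: purgeMode}
	if !gateEnabled || sp == nil || len(sp.Rules) == 0 {
		return opts, nil
	}
	if status == nil {
		return nil, errStatusNotCollected
	}
	preserved := make(map[string]string)
	for _, rule := range sp.Rules {
		if v, ok := status[rule]; ok {
			preserved[rule] = v
		}
	}
	if len(preserved) > 0 {
		opts.PreservedLabels = preserved
	}
	return opts, nil
}

func main() {
	sp := &statePreservation{Rules: []string{"jobID"}}
	status := map[string]string{"jobID": "abc123"}
	off, _ := buildOptions(false, sp, status, "Immediately")
	on, _ := buildOptions(true, sp, status, "Immediately")
	fmt.Println(off.PreservedLabels == nil, on.PreservedLabels["jobID"])
}
```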

3 changes: 3 additions & 0 deletions pkg/controllers/applicationfailover/common_test.go
@@ -30,6 +30,7 @@ import (

policyv1alpha1 "github.com/karmada-io/karmada/pkg/apis/policy/v1alpha1"
workv1alpha2 "github.com/karmada-io/karmada/pkg/apis/work/v1alpha2"
+ "github.com/karmada-io/karmada/pkg/features"
)

func TestTimeStampProcess(t *testing.T) {
@@ -645,6 +646,8 @@ func Test_buildTaskOptions(t *testing.T) {
},
}
for _, tt := range tests {
+ err := features.FeatureGate.Set(fmt.Sprintf("%s=%t", features.StatefulFailoverInjection, true))
+ assert.NoError(t, err)
t.Run(tt.name, func(t *testing.T) {
got, err := buildTaskOptions(tt.args.failoverBehavior, tt.args.aggregatedStatus, tt.args.cluster, tt.args.producer, tt.args.clustersBeforeFailover)
if !tt.wantErr(t, err, fmt.Sprintf("buildTaskOptions(%v, %v, %v, %v, %v)", tt.args.failoverBehavior, tt.args.aggregatedStatus, tt.args.cluster, tt.args.producer, tt.args.clustersBeforeFailover)) {
9 changes: 6 additions & 3 deletions pkg/controllers/binding/common.go
@@ -31,6 +31,7 @@ import (
configv1alpha1 "github.com/karmada-io/karmada/pkg/apis/config/v1alpha1"
policyv1alpha1 "github.com/karmada-io/karmada/pkg/apis/policy/v1alpha1"
workv1alpha2 "github.com/karmada-io/karmada/pkg/apis/work/v1alpha2"
+ "github.com/karmada-io/karmada/pkg/features"
"github.com/karmada-io/karmada/pkg/resourceinterpreter"
"github.com/karmada-io/karmada/pkg/util"
"github.com/karmada-io/karmada/pkg/util/helper"
@@ -113,9 +114,11 @@ func ensureWork(
return err
}

- // we need to figure out if the targetCluster is in the cluster we are going to migrate application to.
- // If yes, we have to inject the preserved label state to clonedWorkload with the label.
- clonedWorkload = injectReservedLabelState(bindingSpec, targetCluster, clonedWorkload, len(targetClusters))
+ if features.FeatureGate.Enabled(features.StatefulFailoverInjection) {
+ 	// we need to figure out if the targetCluster is in the cluster we are going to migrate application to.
+ 	// If yes, we have to inject the preserved label state to the clonedWorkload.
+ 	clonedWorkload = injectReservedLabelState(bindingSpec, targetCluster, clonedWorkload, len(targetClusters))
+ }

workMeta := metav1.ObjectMeta{
Name: names.GenerateWorkName(clonedWorkload.GetKind(), clonedWorkload.GetName(), clonedWorkload.GetNamespace()),
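
The body of `injectReservedLabelState` is not part of this diff, so the following is only a guess at the idea described in the comment above: preserved labels are merged into the workload when the target cluster is one the application is being migrated to. The sketch assumes "migrated to" means "not among the clusters that hosted the application before failover"; `injectPreservedLabels`, its parameters, and the label key are all hypothetical.

```go
package main

import "fmt"

// injectPreservedLabels is a hypothetical helper, not Karmada's injectReservedLabelState.
// It returns a copy of labels with the preserved entries merged in, but only when
// targetCluster did not already host the application before the failover (assumption).
func injectPreservedLabels(labels, preserved map[string]string, targetCluster string, clustersBeforeFailover []string) map[string]string {
	out := make(map[string]string, len(labels)+len(preserved))
	for k, v := range labels {
		out[k] = v
	}
	for _, c := range clustersBeforeFailover {
		if c == targetCluster {
			return out // not a migration target: leave the labels untouched
		}
	}
	for k, v := range preserved {
		out[k] = v
	}
	return out
}

func main() {
	labels := map[string]string{"app": "demo"}
	preserved := map[string]string{"failover.example/job-id": "abc123"}
	before := []string{"member1"}
	fmt.Println(injectPreservedLabels(labels, preserved, "member2", before))
	fmt.Println(injectPreservedLabels(labels, preserved, "member1", before))
}
```

With the gate disabled, the call is skipped entirely and the cloned workload goes to the target cluster without any preserved labels.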
10 changes: 10 additions & 0 deletions pkg/features/features.go
@@ -43,6 +43,15 @@ const (

// ResourceQuotaEstimate indicates if enable resource quota check in estimator
ResourceQuotaEstimate featuregate.Feature = "ResourceQuotaEstimate"
+
+ // StatefulFailoverInjection controls whether Karmada collects state information
+ // from the source cluster during a failover event for stateful applications and
+ // injects this information into the application configuration when it is moved
+ // to the target cluster.
+ //
+ // owner: @mszacillo, @XiShanYongYe-Chang
+ // alpha: v1.12
+ StatefulFailoverInjection featuregate.Feature = "StatefulFailoverInjection"
)

var (
@@ -58,6 +67,7 @@ var (
PolicyPreemption: {Default: false, PreRelease: featuregate.Alpha},
MultiClusterService: {Default: false, PreRelease: featuregate.Alpha},
ResourceQuotaEstimate: {Default: false, PreRelease: featuregate.Alpha},
+ StatefulFailoverInjection: {Default: false, PreRelease: featuregate.Alpha},
}
)
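
Because the gate is registered with `Default: false`, both code paths in this PR stay inactive until an operator or a test turns the gate on. The sketch below shows that default-then-override behaviour with a deliberately small, stdlib-only registry. The `gate` and `featureSpec` types are hypothetical and are not the `featuregate` package used in the diff; `Set` only imitates the `Name=bool` string that the test in `common_test.go` passes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// featureSpec and gate are hypothetical stand-ins for a feature-gate registry.
type featureSpec struct {
	Default    bool
	PreRelease string
}

type gate struct {
	known     map[string]featureSpec
	overrides map[string]bool
}

func newGate(known map[string]featureSpec) *gate {
	return &gate{known: known, overrides: map[string]bool{}}
}

// Set accepts a single "Name=bool" pair and rejects unknown features.
func (g *gate) Set(pair string) error {
	name, raw, found := strings.Cut(pair, "=")
	if !found {
		return fmt.Errorf("missing '=' in %q", pair)
	}
	if _, ok := g.known[name]; !ok {
		return fmt.Errorf("unrecognized feature gate %q", name)
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return fmt.Errorf("invalid value for %q: %w", name, err)
	}
	g.overrides[name] = enabled
	return nil
}

// Enabled returns the explicit override if one was set, otherwise the default.
func (g *gate) Enabled(name string) bool {
	if v, ok := g.overrides[name]; ok {
		return v
	}
	return g.known[name].Default
}

func main() {
	g := newGate(map[string]featureSpec{
		"StatefulFailoverInjection": {Default: false, PreRelease: "Alpha"},
	})
	fmt.Println(g.Enabled("StatefulFailoverInjection"))
	if err := g.Set("StatefulFailoverInjection=true"); err != nil {
		panic(err)
	}
	fmt.Println(g.Enabled("StatefulFailoverInjection"))
}
```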

2 changes: 1 addition & 1 deletion pkg/generated/openapi/zz_generated.openapi.go

Some generated files are not rendered by default.