Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update stateful_set_utils.go #1136

Merged
merged 5 commits into from
Jan 3, 2023
Merged

Update stateful_set_utils.go #1136

merged 5 commits into from
Jan 3, 2023

Conversation

ivanszl
Copy link
Contributor

@ivanszl ivanszl commented Nov 29, 2022

Signed-off-by: ivan.lan ivan.lin.1985@gmail.com

Ⅰ. Describe what this PR does

设置sts的spec.podManagementPolicy = Parallel
spec.scaleStrategy.maxUnavailable = 100%
由于修改sts的模版,导致Pod启动不了,状态一直处于pending,后续重新修改会卡住
重新修改后会作用于所有的Pod,可以快速修复业务服务

@kruise-bot
Copy link

Welcome @ivanszl! It looks like this is your first PR to openkruise/kruise 🎉

@kruise-bot kruise-bot added the size/XS size/XS: 0-9 label Nov 29, 2022
@ivanszl
Copy link
Contributor Author

ivanszl commented Nov 29, 2022

/assign @FillZpp

@zmberg
Copy link
Member

zmberg commented Nov 30, 2022

@ivanszl Can you describe the process in detail? I did not reproduce your problem. In addition, I re-reviewed the code and feel that the current val <= 0 judgment logic should meet expectations.

@zmberg zmberg assigned zmberg and unassigned FillZpp Nov 30, 2022
@ivanszl
Copy link
Contributor Author

ivanszl commented Nov 30, 2022

@ivanszl Can you describe the process in detail? I did not reproduce your problem. In addition, I re-reviewed the code and feel that the current val <= 0 judgment logic should meet expectations.

hi, this is a example

kind: StatefulSet
apiVersion: apps.kruise.io/v1beta1
metadata:
  name: busybox-test
  namespace: default
  labels:
    app: busybox-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-test
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: busybox-test
    spec:
      containers:
        - name: busybox
          image: busybox:latest
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  podManagementPolicy: Parallel
  scaleStrategy:
    maxUnavailable: 100%
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
      maxUnavailable: 100%
  revisionHistoryLimit: 10
$ kubectl get pods busybox-test-0
NAME             READY   STATUS             RESTARTS      AGE
busybox-test-0   0/1     CrashLoopBackOff   5 (95s ago)   4m45s

modify yaml and apply it

kind: StatefulSet
apiVersion: apps.kruise.io/v1beta1
metadata:
  name: busybox-test
  namespace: default
  labels:
    app: busybox-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-test
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: busybox-test
        test-label: 'yes'
    spec:
      containers:
        - name: busybox
          image: busybox:latest
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  podManagementPolicy: Parallel
  scaleStrategy:
    maxUnavailable: 100%
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
      maxUnavailable: 100%
  revisionHistoryLimit: 10

and then look the pod

kubectl describe pods busybox-test-0
Name:         busybox-test-0
Namespace:    default
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Wed, 30 Nov 2022 15:16:28 +0800
Labels:       app=busybox-test
              controller-revision-hash=busybox-test-f89f864cc
              lifecycle.apps.kruise.io/state=Normal
              statefulset.kubernetes.io/pod-name=busybox-test-0
Annotations:  lifecycle.apps.kruise.io/timestamp: 2022-11-30T07:16:28Z
Status:       Running
IP:           172.17.0.6
IPs:
  IP:           172.17.0.6
Controlled By:  StatefulSet/busybox-test
Containers:
  busybox:
    Container ID:   docker://a1227f31ef27e0d4fbf61aa29f585da912ce5dbe2fcc662a1e37a2ca3b4329b0
    Image:          busybox:latest
    Image ID:       docker-pullable://busybox@sha256:fcd85228d7a25feb59f101ac3a955d27c80df4ad824d65f5757a954831450185
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 30 Nov 2022 15:22:20 +0800
      Finished:     Wed, 30 Nov 2022 15:22:20 +0800
    Ready:          False
    Restart Count:  6
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bml2r (ro)
Readiness Gates:
  Type             Status
  KruisePodReady   True
Conditions:
  Type              Status
  KruisePodReady    True
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-bml2r:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  6m25s                  default-scheduler  Successfully assigned default/busybox-test-0 to minikube
  Normal   Pulling    6m24s                  kubelet            Pulling image "busybox:latest"
  Normal   Pulled     6m17s                  kubelet            Successfully pulled image "busybox:latest" in 6.646627307s
  Normal   Created    4m41s (x5 over 6m17s)  kubelet            Created container busybox
  Normal   Started    4m41s (x5 over 6m17s)  kubelet            Started container busybox
  Normal   Pulled     4m41s (x4 over 6m16s)  kubelet            Container image "busybox:latest" already present on machine
  Warning  BackOff    83s (x26 over 6m15s)   kubelet            Back-off restarting failed container

we find the pod was't modified

@zmberg
Copy link
Member

zmberg commented Dec 2, 2022

@ivanszl I don't think it can be modified directly, because it will destroy the semantics of scaleStrategy.

The core of this problem is the corner case for 100% maxUnavailable, so i think it can be modified as follows:

if set.Spec.ScaleStrategy != nil && set.Spec.ScaleStrategy.MaxUnavailable != nil {
    ...
    scaleMaxUnavailable = &maxUnavailable
    // add the code, to solve the 100% maxUnavailable corner case. 
    if int32(*scaleMaxUnavailable) >= *set.Spec.Replicas {
	    scaleMaxUnavailable = utilpointer.IntPtr(math.MaxInt32)
    }
}

@ivanszl
Copy link
Contributor Author

ivanszl commented Dec 2, 2022

@zmberg If it was modified like you code, it can't cover same case.
If you are set maxUnavailable < set.Spec.Replicas, you can find the updated pods not equal maxUnavailable, it's always equal maxUnavailable - 1.

@zmberg
Copy link
Member

zmberg commented Dec 2, 2022

@ivanszl You can look at this document https://openkruise.io/docs/user-manuals/advancedstatefulset/#scaling-with-rate-limiting. Your modification method will break the semantics of the document

@ivanszl
Copy link
Contributor Author

ivanszl commented Dec 2, 2022

@zmberg Ok, I see.
It will break the semantics.

@zmberg
Copy link
Member

zmberg commented Dec 2, 2022

@ivanszl We also discussed it internally, and felt that scaleStrategy should only affect Create Pod, not Update Pod, so i think there are three places that need to be modified, for example:

// if the set does not allow bursting, return immediately
if monotonic {
      return &status, nil
} else if decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
     break
}

@ivanszl
Copy link
Contributor Author

ivanszl commented Dec 2, 2022

@ivanszl We also discussed it internally, and felt that scaleStrategy should only affect Create Pod, not Update Pod, so i think there are three places that need to be modified, for example:

// if the set does not allow bursting, return immediately
if monotonic {
      return &status, nil
} else if decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
     break
}

@zmberg
I think this places that need to modified may be better, modified as follow

// if we find no more Pods can be created, no more work can be done this round
if decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
     break
}
....
// if the set does not allow bursting, return immediately
if monotonic {
    return &status, nil
}

@zmberg
Copy link
Member

zmberg commented Dec 2, 2022

The monotonic parameter is still more preferred, and the scale Strategy is only considered when it is false.

@kruise-bot kruise-bot added size/S size/S 10-29 and removed size/XS size/XS: 0-9 labels Dec 2, 2022
@codecov-commenter
Copy link

codecov-commenter commented Dec 2, 2022

Codecov Report

Merging #1136 (6881e3f) into master (0cfd676) will increase coverage by 0.17%.
The diff coverage is 46.15%.

@@            Coverage Diff             @@
##           master    #1136      +/-   ##
==========================================
+ Coverage   49.43%   49.60%   +0.17%     
==========================================
  Files         137      137              
  Lines       19326    19423      +97     
==========================================
+ Hits         9553     9635      +82     
- Misses       8721     8735      +14     
- Partials     1052     1053       +1     
Flag Coverage Δ
unittests 49.60% <46.15%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/controller/statefulset/stateful_set_utils.go 81.91% <ø> (ø)
pkg/controller/statefulset/stateful_set_control.go 63.59% <46.15%> (-0.32%) ⬇️
pkg/webhook/pod/mutating/persistent_pod_state.go 64.21% <0.00%> (-1.73%) ⬇️
...sistentpodstate/persistent_pod_state_controller.go 31.18% <0.00%> (-1.68%) ⬇️
...tentpodstate/persistent_pod_state_event_handler.go 0.00% <0.00%> (ø)
pkg/controller/uniteddeployment/revision.go 67.02% <0.00%> (+1.06%) ⬆️
...ate/validating/persistent_create_update_handler.go 59.77% <0.00%> (+1.23%) ⬆️
pkg/controller/cloneset/cloneset_controller.go 55.18% <0.00%> (+2.24%) ⬆️
pkg/webhook/cloneset/validating/validation.go 67.33% <0.00%> (+7.60%) ⬆️
... and 1 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ivanszl
Copy link
Contributor Author

ivanszl commented Dec 4, 2022

@zmberg
I find other call decreaseAndCheckMaxUnavailable function places, it was decrease and check before to do, but in the create block was decrease and check after created, so i think this modified may be preferred.

@zmberg
Copy link
Member

zmberg commented Dec 5, 2022

@zmberg I find other call decreaseAndCheckMaxUnavailable function places, it was decrease and check before to do, but in the create block was decrease and check after created, so i think this modified may be preferred.

@ivanszl Relevant places need to be modified. The above is just an example, not all of them are listed.

@@ -687,5 +687,5 @@ func decreaseAndCheckMaxUnavailable(maxUnavailable *int) bool {
}
val := *maxUnavailable - 1
*maxUnavailable = val
return val <= 0
return val < 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be no further adjustments here, and the original writing method should be maintained.

@@ -542,7 +547,7 @@ func (ssc *defaultStatefulSetControl) updateStatefulSet(
status.UpdatedReplicas++
}
// if the set does not allow bursting, return immediately
if monotonic || decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
if monotonic {
return &status, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if monotonic {
    return &status, nil
} else if decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
    break
}

if isTerminating(replicas[i]) && monotonic {
	klog.V(4).Infof(
		"StatefulSet %s/%s is waiting for Pod %s to Terminate",
		set.Namespace,
		set.Name,
		replicas[i].Name)
	return &status, nil
} else if isTerminating(replicas[i]) && decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
     break
}

if monotonic || scaleMaxUnavailable != nil {
	isAvailable, waitTime := isRunningAndAvailable(replicas[i], minReadySeconds)
	if !isAvailable && monotonic {
		if waitTime > 0 {
			// make sure we check later
			durationStore.Push(getStatefulSetKey(set), waitTime)
			klog.V(4).Infof(
				"StatefulSet %s/%s needs to wait %s for the Pod %s to be Running and Available after being"+
					" Ready for %d seconds",
				set.Namespace,
				set.Name,
				waitTime,
				replicas[i].Name,
				minReadySeconds)
		} else {
			klog.V(4).Infof(
				"StatefulSet %s/%s is waiting for Pod %s to be Running and Ready",
				set.Namespace,
				set.Name,
				replicas[i].Name)
		}
		return &status, nil
	} else if !isAvailable && decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
		break
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 545 line, the original writing method should be maintained. It will be not destroy the semantics of scaleStrategy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is convenient for you, you can communicate on the zoom of the community meeting on 12.8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

return &status, nil
} else if decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

klog.V(4).Infof(
"StatefulSet %s/%s Pod %s is Creating, and break pods scale",
set.Namespace,
set.Name,
replicas[i].Name)

klog.V(4).Infof(
"StatefulSet %s/%s is waiting for Pod %s to Terminate",
set.Namespace,
set.Name,
replicas[i].Name)
return &status, nil
} else if isTerminating(replicas[i]) && decreaseAndCheckMaxUnavailable(scaleMaxUnavailable) {
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

klog.V(4).Infof(
"StatefulSet %s/%s Pod %s is Terminating, and break pods scale",
set.Namespace,
set.Name,
replicas[i].Name)

// make sure we check later
durationStore.Push(getStatefulSetKey(set), waitTime)
}
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

klog.V(4).Infof(
"StatefulSet %s/%s Pod %s is unavailable, and break pods scale",
set.Namespace,
set.Name,
replicas[i].Name)

@zmberg
Copy link
Member

zmberg commented Dec 13, 2022

please git commit -s, and git push again.

@kruise-bot kruise-bot added size/L size/L: 100-499 and removed size/M size/M: 30-99 labels Dec 14, 2022
@zmberg
Copy link
Member

zmberg commented Dec 14, 2022

please git commit -s, and git push again.

@ivanszl you need resolve the issue https://github.com/openkruise/kruise/pull/1136/checks?check_run_id=10079268512, otherwise there is no way to merge the pr.

linsongzheng added 5 commits December 14, 2022 20:58
Fix all unavailable pod update new version fail

Signed-off-by: linsongzheng <linsongzheng@onething.net>
Signed-off-by: linsongzheng <linsongzheng@onething.net>
Signed-off-by: linsongzheng <linsongzheng@onething.net>
Signed-off-by: linsongzheng <linsongzheng@onething.net>
Signed-off-by: linsongzheng <linsongzheng@onething.net>
@zmberg
Copy link
Member

zmberg commented Dec 15, 2022

/lgtm

@furykerry
Copy link
Member

/approve

@kruise-bot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: furykerry

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kruise-bot kruise-bot merged commit 20b74ba into openkruise:master Jan 3, 2023
@zmberg zmberg added this to the v1.4 milestone Feb 14, 2023
@zmberg zmberg added the kind/bug Something isn't working label Feb 14, 2023
ppbits pushed a commit to ppbits/kruise that referenced this pull request Apr 4, 2024
* Update stateful_set_utils.go

Fix all unavailable pod update new version fail


* Fix will be create one more pod


* add scaleStrategy e2e test


* add log


* add log


Co-authored-by: linsongzheng <linsongzheng@onething.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved kind/bug Something isn't working lgtm size/L size/L: 100-499
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants