Skip to content

Commit

Permalink
Merge branch 'master' into pod-index-label
Browse files Browse the repository at this point in the history
  • Loading branch information
cr7258 authored Jul 22, 2024
2 parents 330a7dd + c5c6df7 commit c9158e6
Show file tree
Hide file tree
Showing 12 changed files with 469 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![Go Report Card](https://goreportcard.com/badge/github.com/openkruise/kruise)](https://goreportcard.com/report/github.com/openkruise/kruise)
[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/2908/badge)](https://bestpractices.coreinfrastructure.org/en/projects/2908)
[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/openkruise/kruise/badge)](https://api.securityscorecards.dev/projects/github.com/openkruise/kruise)
[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/openkruise/kruise/badge)](https://scorecard.dev/viewer/?uri=github.com/openkruise/kruise)
[![CircleCI](https://circleci.com/gh/openkruise/kruise.svg?style=svg)](https://circleci.com/gh/openkruise/kruise)
[![codecov](https://codecov.io/gh/openkruise/kruise/branch/master/graph/badge.svg)](https://codecov.io/gh/openkruise/kruise)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](./CODE_OF_CONDUCT.md)
Expand Down
3 changes: 2 additions & 1 deletion apis/apps/v1alpha1/daemonset_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,12 @@ limitations under the License.
package v1alpha1

import (
appspub "github.com/openkruise/kruise/apis/apps/pub"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"

appspub "github.com/openkruise/kruise/apis/apps/pub"
)

// DaemonSetUpdateStrategy is a struct used to control the update strategy for a DaemonSet.
Expand Down
2 changes: 1 addition & 1 deletion config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ spec:
- --enable-leader-election
- --logtostderr=true
- --v=5
- --feature-gates=AllAlpha=true
- --feature-gates=AllAlpha=true,EnableExternalCerts=false
image: controller:latest
imagePullPolicy: Always
securityContext:
Expand Down
161 changes: 161 additions & 0 deletions docs/proposals/20240309-cloneset-support-progressDeadlineSeconds.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,161 @@
---
title: CloneSet
authors:
- "@hantmac"
reviewers:
- "@Fei-Guo"
- "@furykerry"
- "@FillZpp"
creation-date: 2024-03-10
last-updated: 2024-03-10
status: implementable
---

# Support progressDeadlineSeconds in CloneSet
Table of Contents
=================

- [Support progressDeadlineSeconds in CloneSet](#Support progressDeadlineSeconds in CloneSet)
- [Table of Contents](#table-of-contents)
- [Motivation](#motivation)
- [Proposal](#proposal)
- [1. add .spec.progressDeadlineSeconds field](#1add-.spec.progressDeadlineSeconds-field)
- [2. The behavior of progressDeadlineSeconds](#2the-behavior-of-progressDeadlineSeconds)
- [3. handle the logic](#2handle-the-logic)

## Motivation

`.spec.progressDeadlineSeconds` is an optional field in Deployment that specifies the number of seconds one wants to wait for their Deployment to progress before the system reports back that the Deployment has failed progressing.
Once the deadline has been exceeded, the Deployment controller adds a DeploymentCondition with the following attributes to the Deployment's `.status.conditions`:
```
type: Progressing
status: "False"
reason: ProgressDeadlineExceeded
```

This is useful for users to control the progress of the deployment.
So we should add support for `progressDeadlineSeconds` in CloneSet as well.

## Proposal
Firstly, add the `progressDeadlineSeconds` field to the CloneSetSpec.
Then add the handle logic in cloneSet controller to handle the `progressDeadlineSeconds` field.

### 1. add .spec.progressDeadlineSeconds field
The default value of `progressDeadlineSeconds` is 600 seconds according to the [official document](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#progress-deadline-seconds).
```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
name: cloneset-example
spec:
replicas: 3
progressDeadlineSeconds: 600
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
```
### 2. The behavior of progressDeadlineSeconds
However, the behavior of `progressDeadlineSeconds` in CloneSet might differ from its behavior in Deployment due to the support of partition in CloneSet. If a CloneSet is paused due to partition, it's debatable whether the paused time should be included in the progress deadline.
Here are two possible interpretations of `progressDeadlineSeconds` in the context of CloneSet:
1. `progressDeadlineSeconds` could be redefined as the time taken for the CloneSet to reach completion or the "paused" state due to partition. In this case, the time during which the CloneSet is paused would NOT be included in the progress deadline.
2. Secondly, `progressDeadlineSeconds` could only be supported if the partition is not set. This means that if a partition is set, the `progressDeadlineSeconds` field would not be applicable or has no effect.

After the discussion of the community, we choose the first interpretation.So we should re-set the progressDeadlineSeconds when the CloneSet reach completion OR "the paused state".

### 3. handle the logic
In cloneset controller, we should add the logic to handle the `progressDeadlineSeconds` field. We firstly add a timer to check the progress of the CloneSet.
If the progress exceeds the `progressDeadlineSeconds`, we should add a CloneSetCondition to the CloneSet's `.status.conditions`:
```go
// add a timer to check the progress of the CloneSet
if cloneSet.Spec.ProgressDeadlineSeconds != nil {
// handle the logic
starttime := time.Now()
...
if time.Now().After(starttime.Add(time.Duration(*cloneSet.Spec.ProgressDeadlineSeconds) * time.Second)) {
newStatus.Conditions = append(newStatus.Conditions, appsv1alpha1.CloneSetCondition{
Type: appsv1alpha1.CloneSetProgressing,
Status: corev1.ConditionFalse,
Reason: appsv1alpha1.CloneSetProgressDeadlineExceeded,
})
}
}
```

When the CloneSet reaches the "paused" state, we should reset the timer to avoid the progress deadline being exceeded.
And we check the progress of the CloneSet in the `syncCloneSetStatus` function. If the progress exceeds the `progressDeadlineSeconds`, we should add a CloneSetCondition to the CloneSet's `.status.conditions`:

```go
const (
CloneSetProgressDeadlineExceeded CloneSetConditionReason = "ProgressDeadlineExceeded"
CloneSetConditionTypeProgressing CloneSetConditionType = "Progressing"
)
```

```go
func (c *CloneSetController) syncCloneSetStatus(cloneSet *appsv1alpha1.CloneSet, newStatus *appsv1alpha1.CloneSetStatus) error {
...
if cloneSet.Spec.ProgressDeadlineSeconds != nil {
// handle the logic
if time.Now().After(starttime.Add(time.Duration(*cloneSet.Spec.ProgressDeadlineSeconds) * time.Second)) {
newStatus.Conditions = append(newStatus.Conditions, appsv1alpha1.CloneSetCondition{
Type: appsv1alpha1.CloneSetProgressing,
Status: corev1.ConditionFalse,
Reason: appsv1alpha1.CloneSetProgressDeadlineExceeded,
})
}
}
...
}
```

When the CloneSet reaches the "paused" state, we should reset the timer to avoid the progress deadline being exceeded.
```go
func (c *CloneSetController) syncCloneSetStatus(cloneSet *appsv1alpha1.CloneSet, newStatus *appsv1alpha1.CloneSetStatus) error {
...
// reset the starttime when the CloneSet reaches the "paused" state or complete state
if cloneSet.Status.UpdatedReadyReplicas == cloneSet.Status.Replicas || replicas - updatedReplicas = partition {
starttime = time.Now()
}
if cloneSet.Spec.Paused {
starttime = time.Now()
}
...
}
...
}
```

And we can save the starttime in the `LastUpdateTime` in the CloneSet's `.status.conditions`:
```
status:
conditions:
- lastTransitionTime: "2021-11-26T20:52:12Z"
lastUpdateTime: "2021-11-26T20:52:12Z"
message: CloneSet has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2021-11-26T20:52:12Z"
lastUpdateTime: "2021-11-26T20:52:12Z"
message: 'progress deadline exceeded'
reason: ProgressDeadlineExceeded
status: "False"
type: Progressing
```

## Implementation History

- [ ] 06/07/2024: Proposal submission


4 changes: 4 additions & 0 deletions pkg/features/kruise_features.go
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,9 @@ const (

// Set pod completion index as a pod label for Indexed Jobs.
PodIndexLabel featuregate.Feature = "PodIndexLabel"

// Use certs generated externally
EnableExternalCerts featuregate.Feature = "EnableExternalCerts"
)

var defaultFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
Expand Down Expand Up @@ -158,6 +161,7 @@ var defaultFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
RecreatePodWhenChangeVCTInCloneSetGate: {Default: false, PreRelease: featuregate.Alpha},
StatefulSetStartOrdinal: {Default: false, PreRelease: featuregate.Alpha},
PodIndexLabel: {Default: true, PreRelease: featuregate.Beta},
EnableExternalCerts: {Default: false, PreRelease: featuregate.Alpha},
}

func init() {
Expand Down
20 changes: 20 additions & 0 deletions pkg/webhook/util/configuration/configuration.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ import (
clientset "k8s.io/client-go/kubernetes"
"k8s.io/klog/v2"

"github.com/openkruise/kruise/pkg/features"
utilfeature "github.com/openkruise/kruise/pkg/util/feature"
"github.com/openkruise/kruise/pkg/webhook/types"
webhookutil "github.com/openkruise/kruise/pkg/webhook/util"
)
Expand All @@ -46,6 +48,24 @@ func Ensure(kubeClient clientset.Interface, handlers map[string]types.HandlerGet
if err != nil {
return fmt.Errorf("not found ValidatingWebhookConfiguration %s", validatingWebhookConfigurationName)
}

if utilfeature.DefaultFeatureGate.Enabled(features.EnableExternalCerts) {
// if using external certs, only check the caBundle of webhook
for _, wh := range mutatingConfig.Webhooks {
if wh.ClientConfig.CABundle == nil {
return fmt.Errorf("caBundle of MutatingWebhookConfiguration %s is nil", mutatingWebhookConfigurationName)

}
}

for _, wh := range validatingConfig.Webhooks {
if wh.ClientConfig.CABundle == nil {
return fmt.Errorf("caBundle of ValidatingWebhookConfiguration %s is nil", mutatingWebhookConfigurationName)
}
}
return nil
}
// if using certs generated by kruise, update webhook configurations
oldMutatingConfig := mutatingConfig.DeepCopy()
oldValidatingConfig := validatingConfig.DeepCopy()

Expand Down
9 changes: 7 additions & 2 deletions pkg/webhook/util/controller/webhook_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,8 @@ import (
"k8s.io/klog/v2"

extclient "github.com/openkruise/kruise/pkg/client"
"github.com/openkruise/kruise/pkg/features"
utilfeature "github.com/openkruise/kruise/pkg/util/feature"
webhooktypes "github.com/openkruise/kruise/pkg/webhook/types"
webhookutil "github.com/openkruise/kruise/pkg/webhook/util"
"github.com/openkruise/kruise/pkg/webhook/util/configuration"
Expand Down Expand Up @@ -233,7 +235,11 @@ func (c *Controller) sync() error {
var err error

certWriterType := webhookutil.GetCertWriter()
if certWriterType == writer.FsCertWriter || (len(certWriterType) == 0 && len(webhookutil.GetHost()) != 0) {
if utilfeature.DefaultFeatureGate.Enabled(features.EnableExternalCerts) {
certWriter, err = writer.NewExternalCertWriter(writer.ExternalCertWriterOptions{
Clientset: c.kubeClient,
})
} else if certWriterType == writer.FsCertWriter || (len(certWriterType) == 0 && len(webhookutil.GetHost()) != 0) {
certWriter, err = writer.NewFSCertWriter(writer.FSCertWriterOptions{
Path: webhookutil.GetCertDir(),
})
Expand All @@ -254,7 +260,6 @@ func (c *Controller) sync() error {
if err := writer.WriteCertsToDir(webhookutil.GetCertDir(), certs); err != nil {
return fmt.Errorf("failed to write certs to dir: %v", err)
}

if err := configuration.Ensure(c.kubeClient, c.handlers, certs.CACert); err != nil {
return fmt.Errorf("failed to ensure configuration: %v", err)
}
Expand Down
18 changes: 18 additions & 0 deletions pkg/webhook/util/crd/crd.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ import (
"k8s.io/apimachinery/pkg/runtime/schema"

"github.com/openkruise/kruise/apis"
"github.com/openkruise/kruise/pkg/features"
utilfeature "github.com/openkruise/kruise/pkg/util/feature"
webhookutil "github.com/openkruise/kruise/pkg/webhook/util"
)

Expand All @@ -49,6 +51,21 @@ func Ensure(client apiextensionsclientset.Interface, lister apiextensionslisters
return fmt.Errorf("failed to list crds: %v", err)
}

if utilfeature.DefaultFeatureGate.Enabled(features.EnableExternalCerts) {
for _, crd := range crdList {
if len(crd.Spec.Versions) == 0 || crd.Spec.Conversion == nil || crd.Spec.Conversion.Strategy != apiextensionsv1.WebhookConverter {
continue
}
if !kruiseScheme.Recognizes(schema.GroupVersionKind{Group: crd.Spec.Group, Version: crd.Spec.Versions[0].Name, Kind: crd.Spec.Names.Kind}) {
continue
}

if crd.Spec.Conversion.Webhook == nil || crd.Spec.Conversion.Webhook.ClientConfig == nil || crd.Spec.Conversion.Webhook.ClientConfig.CABundle == nil {
return fmt.Errorf("bad conversion configuration of CRD %s", crd.Name)
}
}
return nil
}
webhookConfig := apiextensionsv1.WebhookClientConfig{
CABundle: caBundle,
}
Expand Down Expand Up @@ -85,5 +102,6 @@ func Ensure(client apiextensionsclientset.Interface, lister apiextensionslisters
}
}
}

return nil
}
34 changes: 34 additions & 0 deletions pkg/webhook/util/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ package util
import (
"os"
"strconv"
"time"

"k8s.io/klog/v2"

Expand Down Expand Up @@ -69,3 +70,36 @@ func GetCertDir() string {
func GetCertWriter() string {
return os.Getenv("WEBHOOK_CERT_WRITER")
}

var (
renewBefore time.Duration
)

func GetRenewBeforeTime() time.Duration {
if renewBefore != 0 {
return renewBefore
}
renewBefore = 6 * 30 * 24 * time.Hour
if s := os.Getenv("CERTS_RENEW_BEFORE"); len(s) > 0 {
t, err := strconv.Atoi(s[0 : len(s)-1])
if err != nil {
klog.Errorf("failed to parese time %s: %v", s[0:len(s)-1], err)
return renewBefore
}
suffix := s[len(s)-1]
if suffix == 'd' {
renewBefore = time.Duration(t) * 7 * time.Hour
} else if suffix == 'm' {
renewBefore = time.Duration(t) * 30 * time.Hour
} else if suffix == 'y' {
renewBefore = time.Duration(t) * 365 * time.Hour
} else {
klog.Errorf("unknown date suffix %c", suffix)
}
}
if renewBefore <= 0 {
klog.Error("renewBefore time can not be less or equal than 0")
renewBefore = 6 * 30 * 24 * time.Hour
}
return renewBefore
}
7 changes: 4 additions & 3 deletions pkg/webhook/util/writer/certwriter.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (

"k8s.io/klog/v2"

"github.com/openkruise/kruise/pkg/webhook/util"
"github.com/openkruise/kruise/pkg/webhook/util/generator"
)

Expand Down Expand Up @@ -61,7 +62,8 @@ func handleCommon(dnsName string, ch certReadWriter) (*generator.Artifacts, bool
}

// Recreate the cert if it's invalid.
valid := validCert(certs, dnsName)
renewBefore := util.GetRenewBeforeTime()
valid := validCert(certs, dnsName, time.Now().Add(renewBefore))
if !valid {
klog.Info("cert is invalid or expired, regenerating a new one")
certs, err = ch.overwrite(certs.ResourceVersion)
Expand Down Expand Up @@ -98,10 +100,9 @@ type certReadWriter interface {
overwrite(resourceVersion string) (*generator.Artifacts, error)
}

func validCert(certs *generator.Artifacts, dnsName string) bool {
func validCert(certs *generator.Artifacts, dnsName string, expired time.Time) bool {
if certs == nil {
return false
}
expired := time.Now().AddDate(0, 6, 0)
return generator.ValidCACert(certs.Key, certs.Cert, certs.CACert, dnsName, expired)
}
Loading

0 comments on commit c9158e6

Please sign in to comment.