Change log since v1.2.0
Kubernetes provides three kinds of probes for Pod lifecycle management:
- Readiness Probe: used to determine whether the business container is ready to respond to user requests. If the probe fails, the Pod is removed from Service Endpoints.
- Liveness Probe: used to determine the health status of the container. If the probe fails, the kubelet restarts the container.
- Startup Probe: used to know when a container application has started. If such a probe is configured, liveness and readiness checks are disabled until it succeeds.
The probes provided by Kubernetes therefore have fixed semantics and behaviors. In practice, there is also a need to customize probe semantics and behaviors, for example:
- A GameServer defines an Idle Probe to determine whether the Pod currently hosts a game match. If not, the Pod can be scaled down to save cost.
- A K8S Operator defines a main-secondary probe to determine the role of the current Pod (main or secondary). During an upgrade, the secondaries can be upgraded first, so that the main is re-selected only once, which reduces service interruption time.
So we provide the ability to customize the Probe and return the result to the Pod yaml.
For more detail, please refer to its documentation and proposal.
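As a rough illustration of the Idle Probe use case, a custom probe might be declared along the following lines. This is a sketch that assumes the PodProbeMarker resource described in the Kruise documentation. The workload label, the container name, the script path /home/game/idle.sh, the gameserver-idle label and the game.io/idle condition type are all hypothetical.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: PodProbeMarker
metadata:
  name: game-server-probe          # hypothetical name
  namespace: default
spec:
  # Pods that the custom probe applies to (hypothetical label).
  selector:
    matchLabels:
      app: game-server
  probes:
  - name: Idle
    containerName: game-server     # hypothetical container name
    probe:
      exec:
        command:
        - /home/game/idle.sh       # hypothetical script, exits 0 when no match is running
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 3
      successThreshold: 1
      failureThreshold: 3
    # The probe result is written back to the Pod as labels.
    markerPolicy:
    - state: Succeeded
      labels:
        gameserver-idle: 'true'
    - state: Failed
      labels:
        gameserver-idle: 'false'
    # The probe result is also reported as a Pod condition of this type.
    podConditionType: game.io/idle
```

A scaling component could then prefer Pods labeled gameserver-idle=true when scaling down.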
- SidecarSet supports injecting Pods in the kube-system and kube-public namespaces. (#1084, @zmberg)
- SidecarSet supports injecting a specific historical revision of a sidecar container into Pods. (#1021, @veophi)
- SidecarSet supports injecting pod annotations. (#992, @zmberg)
- CloneSet supports calculating the scale number excluding Pods in PreparingDelete. (#1024, @FillZpp)
- Optimize CloneSet queuing when cache has just synced. (#1026, @FillZpp)
- Allow optional field maxUnavailable in Advanced DaemonSet, and set its default value to 1. (#1007, @ABNER-1)
- Fix DaemonSet surging with minReadySeconds. (#1014, @FillZpp)
- Optimize Advanced DaemonSet internal new pod for imitating scheduling. (#1011, @FillZpp)
- Advanced DaemonSet supports image pre-download. (#1057, @ABNER-1)
- Optimize performance of LabelSelector conversion. (#1068, @FillZpp)
- Reduce kruise-manager memory allocation. (#1015, @FillZpp)
- All Pod state transitions from Updating to Normal should be hooked. (#1022, @shiyan2016)
- Fix go get in Makefile with go 1.18. (#1036, @astraw99)
- Fix EphemeralJob spec.replicas nil panic bug. (#1016, @hellolijj)
- Fix bug where UnitedDeployment reconcile does not return the error. (#991, @huiwq1990)
Change log since v1.1.0
With the development of cloud native, more and more companies are deploying stateful services (e.g., Etcd, MQ) on Kubernetes. K8S StatefulSet is a workload for managing stateful services, and it considers the deployment characteristics of stateful services in many aspects. However, StatefulSet persists only limited Pod state, such as ordered and stable Pod names and PVC persistence, and cannot cover other state, e.g. Pod IP retention or preferential scheduling to previously used Nodes.
So we provide the PersistentPodState CRD to persist other Pod state, such as "IP Retention".
For more detail, please refer to its documentation and proposal.
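A minimal sketch of such a resource is shown below, assuming the field layout from the Kruise documentation for PersistentPodState. The workload name echoserver and the chosen topology keys are illustrative assumptions.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: PersistentPodState
metadata:
  name: echoserver                 # hypothetical name
  namespace: echoserver
spec:
  # The stateful workload whose Pod state should be remembered.
  targetRef:
    apiVersion: apps.kruise.io/v1beta1
    kind: StatefulSet
    name: echoserver
  # A rebuilt Pod must land in the same zone as before.
  requiredPersistentTopology:
    nodeTopologyKeys:
    - topology.kubernetes.io/zone
  # A rebuilt Pod should preferably land on the same Node as before.
  preferredPersistentTopology:
  - preference:
      nodeTopologyKeys:
      - kubernetes.io/hostname
    weight: 100
```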
- Ensure at least one pod is upgraded if CloneSet has partition < 100% (Behavior Change). (#954, @veophi)
- Add expectedUpdatedReplicas field into CloneSet status. (#954 & #963, @veophi)
- Add markPodNotReady field into lifecycle hook to support marking Pod as NotReady during preparingDelete or preparingUpdate. (#979, @veophi)
- Support to protect any custom workloads with scale subresource. (#982, @zmberg)
- Optimize performance in large-scale clusters by avoiding DeepCopy list. (#955, @zmberg)
- Remove some commented code and simplify some. (#983, @hezhizhen)
- SidecarSet forbids updating sidecar container names. (#937, @adairxie)
- Optimize the logic of listNamespacesForDistributor func. (#952, @hantmac)
Change log since v1.0.1
- Bump Kubernetes dependencies to 1.22 and controller-runtime to v0.10.2. (#915, @FillZpp)
- Disable DeepCopy for some specific cache list. (#916, @FillZpp)
- Support in-place update containers with launch priority, for workloads that supported in-place update, e.g., CloneSet, Advanced StatefulSet. (#909, @FillZpp)
- Add pod-template-hash label into Pods, which will always be the short hash. (#931, @FillZpp)
- Support pre-download image after a number of updated pods has been ready. (#904, @shiyan2016)
- Make maxUnavailable also limited to pods in new revision. (#899, @FillZpp)
- Refactor daemonset controller and fetch upstream codebase. (#883, @FillZpp)
- Support preDelete lifecycle for both scale down and recreate update. (#923, @FillZpp)
- Fix node event handler that should compare update selector matching changed. (#920, @LastNight1997)
- Optimize dedupCurHistories func in ReconcileDaemonSet. (#912, @LastNight1997)
- Support shared volumes in init containers. (#929, @outgnaY)
- Support transferEnv in init containers. (#897, @pigletfly)
- Optimize the injection for pod webhook that checks container exists. (#927, @zmberg)
- Fix validateSidecarConflict so that the same sidecar container cannot exist in multiple SidecarSets. (#884, @pigletfly)
- Support CRI-O and any other common CRI types. (#930, @diannaowa) & (#936, @FillZpp)
Change log since v1.0.0
- Fix panic when SidecarSet manages Pods with sidecar containers that have different update type. (#850, @veophi)
- Fix error log when extracting container from fieldpath fails. (#860, @pigletfly)
- Optimize the logic for determining whether the pod state is consistent. (#854, @dafu-wu)
- Replace reflect with generation in event handler. (#885, @zouyee)
- Store history revisions for sidecarset. (#715, @veophi)
- Allow updating RevisionHistoryLimit of Advanced StatefulSet. (#864, @shiyan2016)
- StatefulSet considers non-available pods when deleting pods. (#880, @hzyfox)
- Break the loop when it finds the current revision. (#887, @shiyan2016)
- Remove duplicate registration of field indexes in cloneset controller. (#888 & #889, @shiyan2016)
- CloneSet refreshes pod states before skipping update when paused (Behavior Change). (#893, @FillZpp)
Change log since v0.10.1
- Add SourceContainerNameFrom and EnvNames in sidecarset transferenv.
- Fix update expectation to be increased when a pod updated.
- Fix bug: read conditions from nil old subset status.
- Do not set timeout for webhook ready.
Change log since v0.10.1
- Bump CustomResourceDefinition(CRD) from v1beta1 to v1
- Bump ValidatingWebhookConfiguration/MutatingWebhookConfiguration from v1beta1 to v1
- Bump dependencies: k8s v1.18 -> v1.20, controller-runtime v0.6.5 -> v0.8.3
- Generate CRDs with original controller-tools and markers
As a result, Kruise can be installed on Kubernetes 1.22 and no longer supports Kubernetes < 1.16.
When spec.template.metadata.labels/annotations in a CloneSet or Advanced StatefulSet are updated and a container env is sourced from the changed labels/annotations, Kruise will in-place update the Pods to renew the env value in containers.
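The following sketch shows the kind of template this applies to: a container env that reads its value from a Pod annotation through the downward API. The annotation key app-config, the env name APP_CONFIG and the workload name are hypothetical, and the in-place update strategy is assumed to be enabled on the CloneSet.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
  name: sample                     # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample
  updateStrategy:
    type: InPlaceIfPossible        # assumed: in-place update is enabled
  template:
    metadata:
      labels:
        app: sample
      annotations:
        app-config: 'v1'           # changing only this value is the update in question
    spec:
      containers:
      - name: app
        image: nginx:latest
        env:
        - name: APP_CONFIG
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['app-config']
```

Editing only the app-config annotation in the template is the case described here: the container env is renewed without recreating the Pod.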
Container Launch Priority provides a way to help users control the start order of containers in a Pod.
It works on the Pod itself, no matter what kind of owner the Pod belongs to, which means Deployment, CloneSet and any other workloads are all supported.
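A minimal sketch, assuming the apps.kruise.io/container-launch-priority annotation from the Kruise documentation with the value Ordered, which starts containers in the order they are listed. The Pod and container names are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: launch-order-demo          # hypothetical name
  annotations:
    # Assumed from the Kruise docs: start containers in list order.
    apps.kruise.io/container-launch-priority: Ordered
spec:
  containers:
  - name: sidecar                  # listed first, so started first
    image: busybox:latest
    command: ["sleep", "3600"]
  - name: main                     # started after the sidecar
    image: nginx:latest
```

In a workload such as a Deployment or CloneSet, the same annotation would go into the Pod template metadata.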
When namespace-scoped resources such as Secret and ConfigMap need to be distributed or synchronized to different namespaces, native k8s currently only supports manual distribution and synchronization by users one by one, which is very inconvenient.
Therefore, for scenarios that require resource distribution and continuous synchronization across namespaces, we provide a tool, namely ResourceDistribution, to do this automatically.
Currently, ResourceDistribution supports two kinds of resources: Secret and ConfigMap.
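For illustration, a ResourceDistribution that copies one ConfigMap into several namespaces might look as follows. The field layout is assumed from the Kruise documentation. The ConfigMap content, the namespace names and the group label are hypothetical.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: ResourceDistribution
metadata:
  name: sample                     # hypothetical name
spec:
  # The resource to distribute, written as a complete manifest.
  resource:
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-demo
    data:
      player_initial_lives: "3"
  targets:
    # Explicitly listed target namespaces.
    includedNamespaces:
      list:
      - name: ns-1
      - name: ns-2
    # Namespaces matched by label are targets as well.
    namespaceLabelSelector:
      matchLabels:
        group: test
    # Exclusions take effect even if a namespace matches the rules above.
    excludedNamespaces:
      list:
      - name: ns-3
```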
- Add maxUnavailable field in scaleStrategy to support rate limiting of scaling up.
- Mark revision stable as currentRevision when all pods updated to it, won't wait all pods to be ready (Behavior Change).
- Manage the pods that were created before WorkloadSpread.
- Optimize webhook update and retry during injection.
- Add pod no pub-protection annotation.
- PUB controller watch workload replicas changed.
- Support in-place update daemon pod.
- Support progressive annotation to control if pods creation should be limited by partition.
- Fix SidecarSet filter active pods.
- Fix pod NodeSelectorTerms length 0 when UnitedDeployment NodeSelectorTerms is nil.
- Add --nodeimage-creation-delay flag to delay NodeImage creation after Node ready.
- Kruise-daemon watch pods using protobuf.
- Export resync seconds args.
- Fix http checker reload ca.cert.
- Fix E2E for WorkloadSpread, ImagePulling, ContainerLaunchPriority.
Change log since v1.0.0-alpha.2
When namespace-scoped resources such as Secret and ConfigMap need to be distributed or synchronized to different namespaces, native k8s currently only supports manual distribution and synchronization by users one by one, which is very inconvenient.
Therefore, for scenarios that require resource distribution and continuous synchronization across namespaces, we provide a tool, namely ResourceDistribution, to do this automatically.
Currently, ResourceDistribution supports two kinds of resources: Secret and ConfigMap.
- Add maxUnavailable field in scaleStrategy to support rate limiting of scaling up.
- Mark revision stable when all pods updated to it, won't wait all pods to be ready.
- Support progressive annotation to control if pods creation should be limited by partition.
- Fix pod NodeSelectorTerms length 0 when UnitedDeployment NodeSelectorTerms is nil.
Change log since v1.0.0-alpha.1
- Generate CRDs with original controller-tools and markers
- Add discoveryGVK for WorkloadSpread
- Add --nodeimage-creation-delay flag to delay NodeImage creation after Node ready
- Fix E2E for WorkloadSpread, ImagePulling, ContainerLaunchPriority
Change log since v0.10.0
- Add discoveryGVK for WorkloadSpread
- Optimize webhook injection
- Setup generic kubeClient with Protobuf
- Fix E2E for WorkloadSpread, ImagePulling
Change log since v0.10.0
- Bump CustomResourceDefinition(CRD) from v1beta1 to v1
- Bump ValidatingWebhookConfiguration/MutatingWebhookConfiguration from v1beta1 to v1
- Bump dependencies: k8s v1.18 -> v1.20, controller-runtime v0.6.5 -> v0.8.3
As a result, Kruise can be installed on Kubernetes 1.22 and no longer supports Kubernetes < 1.16.
When spec.template.metadata.labels/annotations in a CloneSet or Advanced StatefulSet are updated and a container env is sourced from the changed labels/annotations, Kruise will in-place update the Pods to renew the env value in containers.
Container Launch Priority provides a way to help users control the start order of containers in a Pod.
It works on the Pod itself, no matter what kind of owner the Pod belongs to, which means Deployment, CloneSet and any other workloads are all supported.
- Manage the pods that were created before WorkloadSpread.
- Optimize webhook update and retry during injection.
- Add pod no pub-protection annotation.
- PUB controller watch workload replicas changed.
- Support in-place update daemon pod.
- Fix SidecarSet filter active pods.
- Kruise-daemon watch pods using protobuf.
- Export resync seconds args.
- Fix http checker reload ca.cert.
Kubernetes offers Pod Disruption Budget (PDB) to help you run highly available applications even when you introduce frequent voluntary disruptions. PDB limits the number of Pods of a replicated application that are down simultaneously from voluntary disruptions. However, it can only constrain the voluntary disruption triggered by the Eviction API. For example, when you run kubectl drain, the tool tries to evict all of the Pods on the Node you're taking out of service.
PodUnavailableBudget can achieve the effect of preventing ALL application disruption or SLA degradation, including pod eviction, deletion, inplace-update, ...
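A minimal sketch of a PodUnavailableBudget protecting one workload, assuming the policy.kruise.io/v1alpha1 API from the Kruise documentation. The workload name web-server, the namespace and the 60% threshold are hypothetical.

```yaml
apiVersion: policy.kruise.io/v1alpha1
kind: PodUnavailableBudget
metadata:
  name: web-server-pub             # hypothetical name
  namespace: web
spec:
  # The workload whose Pods are protected.
  targetRef:
    apiVersion: apps.kruise.io/v1alpha1
    kind: CloneSet
    name: web-server
  # At most 60% of the Pods may be unavailable at the same time,
  # whatever operation causes the unavailability.
  maxUnavailable: 60%
```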
WorkloadSpread constrains how the Pods of a stateless workload are spread, which gives a single workload the ability to do multi-domain and elastic deployment.
It can be used with those stateless workloads, such as CloneSet, Deployment, ReplicaSet and even Job.
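As an illustration, the sketch below places up to 3 Pods of a Deployment in one zone and lets the rest fall into a second zone. The field layout is assumed from the Kruise documentation. The workload name, subset names and zone values are hypothetical.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: WorkloadSpread
metadata:
  name: workloadspread-demo        # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload-demo            # hypothetical workload
  subsets:
  - name: subset-a
    requiredNodeSelectorTerm:
      matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - zone-a
    maxReplicas: 3                 # at most 3 Pods in zone-a
  - name: subset-b                 # no maxReplicas: takes the remaining Pods
    requiredNodeSelectorTerm:
      matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - zone-b
```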
- Scale-down supports topology spread constraints. doc
- Fix in-place update pods in Updated state.
- Add imagePullSecrets field to support pull secrets for the sidecar images. doc
- Add injectionStrategy.paused to stop injection temporarily. doc
- Support image pre-download for in-place update, which can accelerate the progress of applications upgrade. doc
- Support scaling with rate limit. doc
- Fix rolling update stuck caused by deleting terminating pods.
- Bump Kubernetes dependency to 1.18
- Add pod informer for kruise-daemon
- More kubectl ... -o wide fields for kruise resources
[doc]
ContainerRecreateRequest provides a way to let users restart/recreate one or more containers in an existing Pod.
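A request to restart a single container could be sketched like this, assuming the ContainerRecreateRequest fields from the Kruise documentation. The Pod name, container name and timing values are hypothetical.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: ContainerRecreateRequest
metadata:
  name: restart-app                # hypothetical name
  namespace: default               # must be the namespace of the target Pod
spec:
  podName: sample-pod              # hypothetical Pod
  containers:
  - name: app                      # container(s) to recreate
  strategy:
    failurePolicy: Fail            # stop if one container fails to recreate
  activeDeadlineSeconds: 300       # give up if not finished in time
  ttlSecondsAfterFinished: 1800    # clean up the request object afterwards
```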
[doc]
This feature provides a safety policy which could help users protect Kubernetes resources and applications' availability from the cascading deletion mechanism.
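A minimal sketch, assuming the policy.kruise.io/delete-protection label described in the Kruise documentation, where the value Cascading is understood to reject deletion while the object still owns active resources and Always rejects it unconditionally. The namespace name is hypothetical, and the feature may need to be enabled through a feature-gate depending on the Kruise version.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production                 # hypothetical namespace
  labels:
    # Assumed label from the Kruise docs; use Always to forbid
    # deletion until the label is removed.
    policy.kruise.io/delete-protection: Cascading
```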
- Support pod-deletion-cost to let users set the priority of pods deletion. [doc]
- Support image pre-download for in-place update, which can accelerate the progress of applications upgrade. [doc]
- Add CloneSetShortHash feature-gate, which solves the length limit of CloneSet name. [doc]
- Make maxUnavailable and maxSurge effective for specified deletion. [doc]
- Support efficient update and rollback using partition. [doc]
- Support sidecar container hot upgrade. [doc]
- Add podSelector to pull image on nodes of the specific pods.
- Optimize cri-runtime for kruise-daemon
- Fix broadcastjob expectation observed when node assigned by scheduler
- The flags for kruise-manager must start with -- instead of -. If you install Kruise with helm chart, ignore this.
- SidecarSet has been refactored. Make sure no SidecarSet is being upgraded when you upgrade Kruise, and read the latest doc for SidecarSet.
- A new component named kruise-daemon comes in. It is deployed in kruise-system using DaemonSet, by default on every Node.
Now Kruise includes two components:
- kruise-controller-manager: contains multiple controllers and webhooks, deployed using Deployment.
- kruise-daemon: contains bypass features like image pre-download and container restart in the future, deployed using DaemonSet.
Kruise will create a NodeImage for each Node, and its spec contains the images that should be downloaded on this Node.
Also, users can create an ImagePullJob CR to declare which nodes an image should be downloaded on.
apiVersion: apps.kruise.io/v1alpha1
kind: ImagePullJob
metadata:
  name: test-imagepulljob
spec:
  image: nginx:latest
  completionPolicy:
    type: Always
  parallelism: 10
  pullPolicy:
    backoffLimit: 3
    timeoutSeconds: 300
  selector:
    matchLabels:
      node-label: xxx
- Refactor the controller and webhook for SidecarSet:
  - For spec:
    - Add namespace: indicates this SidecarSet will only inject for Pods in this namespace.
  - For spec.containers:
    - Add podInjectPolicy: indicates this sidecar container should be injected in the front or end of containers list.
    - Add upgradeStrategy: indicates the upgrade strategy of this sidecar container (currently it only supports ColdUpgrade)
    - Add shareVolumePolicy: indicates whether to share other containers' VolumeMounts in the Pod.
    - Add transferEnv: can transfer the names of env shared from other containers.
  - For spec.updateStrategy:
    - Add type: contains NotUpdate or RollingUpdate.
    - Add selector: indicates only update Pods that matched this selector.
    - Add partition: indicates the desired number of Pods in old revisions.
    - Add scatterStrategy: defines the scatter rules to make pods been scattered during updating.
- Add currentRevision field in status.
- Optimize CloneSet scale sequence.
- Fix condition for pod lifecycle state from Updated to Normal.
- Change annotations inplace-update-state => apps.kruise.io/inplace-update-state, inplace-update-grace => apps.kruise.io/inplace-update-grace.
- Fix maxSurge calculation when partition > replicas.
- Support Deployment as template in UnitedDeployment.
- Support lifecycle hook for in-place update and pre-delete.
- Add PodFitsResources predicates.
- Add --assign-bcj-pods-by-scheduler flag to control whether to use scheduler to assign BroadcastJob's Pods.
- Add feature-gate to replace the CUSTOM_RESOURCE_ENABLE env.
- Add GetScale/UpdateScale into clientsets for scalable resources.
- Support multi-platform build in Makefile.
- Set different user-agent for controllers.
Since v0.7.0:
- OpenKruise requires Kubernetes 1.13+ because of CRD conversion. Note that for Kubernetes 1.13 and 1.14, users must enable the CustomResourceWebhookConversion feature-gate in kube-apiserver before installing or upgrading Kruise.
- OpenKruise official image supports multi-arch, by default including linux/amd64, linux/arm64, and linux/arm platforms.
Thanks to @rishi-anand for contributing!
An enhanced version of CronJob that supports multiple kinds of templates:
apiVersion: apps.kruise.io/v1alpha1
kind: AdvancedCronJob
spec:
  template:
    # Option 1: use jobTemplate, which is equivalent to original CronJob
    jobTemplate:
      # ...
    # Option 2: use broadcastJobTemplate, which will create a BroadcastJob object when cron schedule triggers
    broadcastJobTemplate:
      # ...
    # Options 3(future): ...
- Partition support intOrStr format
- Warning log for expectation timeout
- Remove ownerRef when pod's labels not matched CloneSet's selector
- Allow updating revisionHistoryLimit in validation
- Fix resourceVersionExpectation race condition
- Fix overwrite gracePeriod update
- Fix webhook checking podsToDelete
- Promote Advanced StatefulSet to v1beta1
  - A conversion webhook will help users to transfer existing and new v1alpha1 advanced statefulsets to v1beta1 automatically
  - Even all advanced statefulsets have been converted to v1beta1, users can still get them through v1alpha1 client and api
- Support reserveOrdinal for Advanced StatefulSet
- Add validation webhook for DaemonSet
- Fix pending pods created by controller
- Optimize the way to calculate parallelism
- Check ownerReference for filtered pods
- Add pod label validation
- Add ScaleExpectation for BroadcastJob
- Initializing capabilities if allowPrivileged is true
- Support secret cert for webhook with vip
- Add rate limiter config
- Fix in-place rollback when spec image no latest tag
- Support lifecycle hooks for pre-delete and in-place update
- Fix map concurrent write
- Fix current revision during rollback
- Fix update expectation for pod deletion
- Support initContainers definition and injection
- Support to define CloneSet as UnitedDeployment's subset
- Support minReadySeconds strategy
- Add webhook controller to optimize certs and configurations generation
- Add pprof server and flag
- Optimize discovery logic in custom resource gate
- Update dependencies: k8s v1.13 -> v1.16, controller-runtime v0.1.10 -> v0.5.7
- Support multiple active webhooks
- Fix CRDs using openkruise/controller-tools
An enhanced version of default DaemonSet with extra functionalities such as:
- inplace update and surging update
- node selector for update
- partial update
- Not create excessive pods when updating with maxSurge
- Round down maxUnavailable when maxSurge > 0
- Skip recreate when inplace update failed
- Fix scale panic when replicas < partition
- Fix CloneSet blocked by terminating PVC
- Support maxSurge strategy which could work well with maxUnavailable and partition
- Add CloneSet core interface to support multiple implementations
- Fix in-place update for metadata in template
- Make sure maxUnavailable should not be less than 1
- Fix in-place update for metadata in template
- Merge volumes during injecting sidecars into Pod
- Expose CUSTOM_RESOURCE_ENABLE env by chart set option
- Add labelSelector to optimize scale subresource for HPA
- Add minReadySeconds, availableReplicas fields for CloneSet
- Add gracePeriodSeconds for graceful in-place update
- Support label selector in scale for HPA
- Add gracePeriodSeconds for graceful in-place update
- Fix StatefulSet default update sequence
- Fix ControllerRevision adoption
- Fix check_for_installation.sh script for k8s 1.11 to 1.13
Mainly focuses on managing stateless applications. (Concept for CloneSet)
It provides full features for more efficient, deterministic and controlled deployment, such as:
- inplace update
- specified pod deletion
- configurable priority/scatter update
- preUpdate/postUpdate hooks
- UnitedDeployment supports both StatefulSet and AdvancedStatefulSet.
- UnitedDeployment supports toleration config in subset.
- Fix statefulset inplace update fields in pod metadata such as labels/annotations.
- Simplify installation with helm charts, one simple command to install kruise charts, instead of downloading and executing scripts.
- Support priority update, which allows users to configure the sequence for Pods updating.
- Fix maxUnavailable calculation, which should not be less than 1.
- Fix BroadcastJob cleaning up after TTL.
- Provide a script to check if the K8s cluster has enabled MutatingAdmissionWebhook and ValidatingAdmissionWebhook admission plugins before installing Kruise.
- Users can now install specific controllers if they only need some of the Kruise CRD/controllers.
- Fix a jsonpatch bug by updating the vendor code.
- Add condition report in status to indicate the scaling or rollout results.
- Define a set of APIs for UnitedDeployment workload which manages multiple workloads spread over multiple domains in one cluster.
- Create one workload for each Subset in Topology.
- Manage Pod replica distribution across subset workloads.
- Rollout all subset workloads by specifying a new workload template.
- Manually manage the rollout of subset workloads by specifying the Partition of each workload.
- Three blog posts are added in Kruise website, titled:
- Kruise Controller Classification Guidance.
- Learning Concurrent Reconciling.
- UnitedDeploymemt - Supporting Multi-domain Workload Management.
- New documents are added for UnitedDeployment, including a tutorial.
- Revise main README.md.
- Provide a script to generate helm charts for Kruise. User can specify the release version.
- Automatically install kubebuilder if it does not exist in the machine.
- Add Kruise uninstall script.
- Fix a potential controller crash problem when APIServer disables MutatingAdmissionWebhook and ValidatingAdmissionWebhook admission plugins.
- Change the type of Parallelism field in BroadcastJob from *int32 to intOrString.
- Support Pause in BroadcastJob.
- Add FailurePolicy in BroadcastJob, supporting Continue, FastFailed, and Pause policies.
- Add Phase in BroadcastJob status.
- Allow parallelly upgrading SidecarSet Pods by specifying MaxUnavailable.
- Support sidecar volumes so that user can specify volume mount in sidecar containers.
- Support to run kruise-controller-manager locally
- Allow selectively installing required CRDs for kruise controllers
- Remove sideEffects in kruise-manager all-in-one YAML file to avoid start failure
- Add MaxUnavailable rolling upgrade strategy
- Add In-Place pod update strategy
- Add paused functionality during rolling upgrade
- Add BroadcastJob that runs pods on all nodes to completion
- Add Never termination policy to have job running after it finishes all pods
- Add ttlSecondsAfterFinished to delete the job after it finishes in x seconds.
- Make broadcastjob honor node unschedulable condition
- Add SidecarSet that automatically injects sidecar container into selected pods
- Support sidecar update functionality for SidecarSet