feat: support specified instance scale down #6958

Merged
free6om merged 27 commits into main from support/specified-pod-scale-in
Apr 8, 2024

@free6om free6om commented Apr 2, 2024

Use Cases

Node Failure

When a physical fault occurs on a specific node, it is necessary to rebuild a replica and subsequently take the affected pod on that node offline.

Data Corruption

When the data of a particular pod is corrupted, it is necessary to rebuild a replica and subsequently take the affected pod offline.

Instance Unavailability

When a pod experiences availability issues such as slow or unresponsive behavior, the best practice is to create a new replica and subsequently take the affected pod offline.

Cluster API

Add the offlineInstances field to each entry of spec.componentSpecs in the Cluster API to list the instances to be taken offline.

apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
# ...
spec:
  componentSpecs:
  - name: "foo"
    offlineInstances: [ "foo-2", "foo-3"]
# ...
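
As a rough illustration of the field shape, the following Go sketch models one componentSpecs entry and decodes a JSON rendering of the example above. The ComponentSpec type and the parseComponentSpec helper are hypothetical, trimmed-down stand-ins written for this example; they are not the actual definitions in apis/apps/v1alpha1/cluster_types.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ComponentSpec is a hypothetical, trimmed-down stand-in for one entry of
// spec.componentSpecs. Only the two fields used in the example are modeled.
type ComponentSpec struct {
	Name             string   `json:"name"`
	OfflineInstances []string `json:"offlineInstances,omitempty"`
}

// parseComponentSpec decodes a JSON rendering of a single component spec.
func parseComponentSpec(data []byte) (ComponentSpec, error) {
	var c ComponentSpec
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// JSON equivalent of the YAML fragment shown above.
	raw := []byte(`{"name":"foo","offlineInstances":["foo-2","foo-3"]}`)
	c, err := parseComponentSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Name, c.OfflineInstances)
}
```

The omitempty tag is an assumption here; it would keep the field out of serialized specs when no instance is offline.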

OpsRequest API

Add the offlineInstances field to the horizontalScaling section of the OpsRequest API; it overrides the corresponding field in the Cluster.

apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
# ...
spec:
  # ...
  horizontalScaling:
  - componentName: "foo"
    replicas: 2
    offlineInstances:
    - "foo-2"
    - "foo-3"
# ...
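
One way to read "override" is that the ops entry replaces, rather than merges with, the list on the matching component. The sketch below shows that interpretation only; the types and the applyHorizontalScaling helper are hypothetical and do not reproduce the logic in controllers/apps/operations/horizontal_scaling.go.

```go
package main

import "fmt"

// ComponentSpec and HorizontalScaling are hypothetical, simplified stand-ins
// for the Cluster and OpsRequest types discussed in this PR.
type ComponentSpec struct {
	Name             string
	Replicas         int32
	OfflineInstances []string
}

type HorizontalScaling struct {
	ComponentName    string
	Replicas         int32
	OfflineInstances []string
}

// applyHorizontalScaling copies replicas and offlineInstances from the ops
// entry onto the component with the same name. The offline list is replaced,
// not merged (an assumption based on the word "override").
// It reports whether a matching component was found.
func applyHorizontalScaling(comps []ComponentSpec, hs HorizontalScaling) bool {
	for i := range comps {
		if comps[i].Name != hs.ComponentName {
			continue
		}
		comps[i].Replicas = hs.Replicas
		// Copy the slice so later changes to the ops object do not leak in.
		comps[i].OfflineInstances = append([]string(nil), hs.OfflineInstances...)
		return true
	}
	return false
}

func main() {
	comps := []ComponentSpec{{Name: "foo", Replicas: 4}}
	hs := HorizontalScaling{
		ComponentName:    "foo",
		Replicas:         2,
		OfflineInstances: []string{"foo-2", "foo-3"},
	}
	fmt.Println(applyHorizontalScaling(comps, hs), comps[0])
}
```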

Test

Case 1: Specify Instance Offline

Create a 3-instance cluster and use an OpsRequest to take the instance with ordinal 1 offline.
Expected result:

  1. The Cluster spec should include offlineInstances:
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
# ...
spec:
  componentSpecs:
  - name: "foo"
    offlineInstances: ["foo-1"]
# ...
  2. The RSM should generate exactly 2 instances, with ordinals 0 and 2.
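
The expected result of Case 1 can be sketched as a small name generator that walks ordinals upward and skips any name listed as offline. This is a simplified model, assuming names of the form component-ordinal and a replica count of 2 after the operation; the instanceNames function is hypothetical and is not the code in pkg/controller/rsm2/instance_util.go.

```go
package main

import "fmt"

// instanceNames returns `replicas` instance names for a component, assigning
// ordinals from 0 upward and skipping every name listed in `offline`.
// Hypothetical helper: a simplified model of the behavior the test expects.
func instanceNames(component string, replicas int, offline []string) []string {
	skip := make(map[string]bool, len(offline))
	for _, name := range offline {
		skip[name] = true
	}
	names := make([]string, 0, replicas)
	for ordinal := 0; len(names) < replicas; ordinal++ {
		name := fmt.Sprintf("%s-%d", component, ordinal)
		if skip[name] {
			continue // this ordinal has been taken offline
		}
		names = append(names, name)
	}
	return names
}

func main() {
	// Case 1: two remaining replicas with foo-1 taken offline.
	fmt.Println(instanceNames("foo", 2, []string{"foo-1"})) // prints [foo-0 foo-2]
}
```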

@free6om free6om added this to the Release 0.9.0 milestone Apr 2, 2024
@free6om free6om self-assigned this Apr 2, 2024
@github-actions github-actions bot added the size/XXL label (denotes a PR that changes 1000+ lines) Apr 2, 2024

codecov bot commented Apr 7, 2024

Codecov Report

Attention: Patch coverage is 65.01767%, with 99 lines in your changes missing coverage. Please review.

Project coverage is 65.77%. Comparing base (1b6ef23) to head (1d29e88).
Report is 4 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| pkg/controller/rsm2/instance_util.go | 78.76% | 20 Missing and 11 partials ⚠️ |
| pkg/controller/component/rsm_convertor.go | 0.00% | 29 Missing ⚠️ |
| controllers/apps/operations/horizontal_scaling.go | 60.00% | 14 Missing and 4 partials ⚠️ |
| pkg/controller/rsm2/reconciler_revision_update.go | 56.52% | 6 Missing and 4 partials ⚠️ |
| ...g/controller/rsm2/reconciler_instance_alignment.go | 80.00% | 2 Missing and 2 partials ⚠️ |
| pkg/controller/rsm2/reconciler_update.go | 50.00% | 3 Missing and 1 partial ⚠️ |
| pkg/controller/builder/builder_component.go | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6958      +/-   ##
==========================================
- Coverage   65.96%   65.77%   -0.19%     
==========================================
  Files         340      340              
  Lines       41356    41391      +35     
==========================================
- Hits        27279    27225      -54     
- Misses      11754    11835      +81     
- Partials     2323     2331       +8     

| Flag | Coverage Δ |
|---|---|
| unittests | 65.77% <65.01%> (-0.19%) ⬇️ |

Flags with carried forward coverage won't be shown.

//
// The sum of replicas across all InstanceTemplates should not exceed the total number of Replicas specified for the Component.
// Any remaining replicas will be generated using the default template and will follow the default naming rules.
//
// +optional
Instances []InstanceTemplate `json:"instances,omitempty"`
@wangyelei wangyelei Apr 8, 2024

Could you add patchStrategy:"merge,retainKeys" patchMergeKey:"name" to this field, so that the template name is treated as the unique key?

@free6om (Contributor Author)

Fixed in 1d29e88.
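
For readers unfamiliar with these struct tags, the sketch below shows what the suggested annotation could look like on the Instances field and reads the tags back via reflection. The componentSpecSketch and InstanceTemplate types and the patchTags helper are hypothetical stand-ins for illustration; the fix actually applied is in commit 1d29e88 and may differ in detail.

```go
package main

import (
	"fmt"
	"reflect"
)

// InstanceTemplate is a hypothetical, minimal stand-in for the real type.
type InstanceTemplate struct {
	Name string `json:"name"`
}

// componentSpecSketch carries only the field under review, with the tags
// suggested in the review comment. patchMergeKey names the field used to
// match list items when a strategic merge patch is applied.
type componentSpecSketch struct {
	Instances []InstanceTemplate `json:"instances,omitempty" patchStrategy:"merge,retainKeys" patchMergeKey:"name"`
}

// patchTags returns the patchStrategy and patchMergeKey tag values declared
// on the Instances field.
func patchTags() (strategy, key string) {
	f, ok := reflect.TypeOf(componentSpecSketch{}).FieldByName("Instances")
	if !ok {
		return "", ""
	}
	return f.Tag.Get("patchStrategy"), f.Tag.Get("patchMergeKey")
}

func main() {
	strategy, key := patchTags()
	fmt.Println(strategy, key)
}
```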

@free6om free6om merged commit f88b651 into main Apr 8, 2024
57 checks passed
@free6om free6om deleted the support/specified-pod-scale-in branch April 8, 2024 07:09
free6om commented Apr 8, 2024

/cherry-pick release-0.9

github-actions bot commented Apr 8, 2024

🤖 says: cherry pick action finished successfully 🎉!
See: https://github.com/apecloud/kubeblocks/actions/runs/8595993678

Labels: ci, feature, size/XXL (denotes a PR that changes 1000+ lines)