This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Cassandra scale in action #289

Open · wants to merge 9 commits into master

Conversation

@kragniz (Contributor) commented Mar 16, 2018

What this PR does / why we need it: add an action for scaling in Cassandra clusters

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #285

Special notes for your reviewer: this is based on top of #256, will be rebased on master when that gets merged

Allow scaling in Cassandra clusters

@jetstack-bot (Collaborator):

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: munnerz

Assign the PR to them by writing /assign @munnerz in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kragniz changed the title from "WIP: Cassandra scale in action" to "Cassandra scale in action" on May 9, 2018
@wallrj (Member) left a comment


Thanks @kragniz

return fmt.Errorf(
"Not enough pilots to scale down: %d",
len(pilots),
)

Maybe log an error here and return nil.
The controller interprets an error to mean that the action should be retried.
But retrying will never succeed in this situation... I think.
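
For illustration, the body of that branch might become something like the following (a sketch only, assuming a glog-style logger is available in this file):

// Sketch: inside the existing "not enough pilots" check, log and return nil
// so the controller does not requeue an action that can never succeed.
glog.Errorf("not enough pilots to scale down: %d", len(pilots))
return nil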

corev1.EventTypeNormal,
a.Name(),
"Marked cassandra pilot %s for decommission", p.Name,
)

Log the namespace and name of the decommissioned pilot here.
And maybe the event should be recorded against the cluster object, so that an administrator can see all the events for a particular cluster.
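
A rough sketch of that suggestion, assuming the action has access to an events recorder and the cluster object (`recorder` and `a.Cluster` are assumed names, not necessarily what this PR uses):

// Sketch: record the event on the cluster rather than the pilot, and include
// the pilot's namespace and name in the message.
recorder.Eventf(
	a.Cluster,
	corev1.EventTypeNormal,
	a.Name(),
	"Marked cassandra pilot %s/%s for decommission",
	p.Namespace, p.Name,
)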

"is greater than the existing StatefulSet.Replicas value (%d)",
a.NodePool.Replicas, *ss.Spec.Replicas,
)
}

  • Might be neater to start the function with checks for all the bad inputs and exit early. Then end with the happy path.
  • And I think this error should be logged rather than returned, to prevent the controller retrying this doomed action.
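
Sketched out, that guard-clause shape might look like this (names are taken from the snippet above, types are assumptions, and the explicit int64 conversions are only there because the real field types are not shown here):

// Sketch: validate inputs first and bail out early; log (rather than return)
// errors that a retry can never fix, then end with the happy path.
if ss.Spec.Replicas == nil {
	glog.Errorf("StatefulSet %s/%s has no replicas value set", ss.Namespace, ss.Name)
	return nil
}
if int64(a.NodePool.Replicas) > int64(*ss.Spec.Replicas) {
	glog.Errorf(
		"desired replicas (%d) is greater than the existing StatefulSet.Replicas value (%d)",
		a.NodePool.Replicas, *ss.Spec.Replicas,
	)
	return nil
}
// ...happy path: scale the StatefulSet down to a.NodePool.Replicas...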

}
}
errs = append(errs, fmt.Errorf("pilot %q not found", pilotName))
}

I couldn't figure out why we need this inner loop.
I imagined we'd get a list of all pilots for the nodepool and decommission all the ones with an index higher than the desired replica count.
Maybe this can be simplified? Or else add a function doc explaining the algorithm.
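
For discussion, a hedged sketch of that flattened approach (PilotIndex and markPilotForDecommission are placeholders, not helpers from this PR):

// Sketch: decommission every pilot in the node pool whose ordinal is at or
// above the desired replica count, rather than matching names in a nested loop.
for _, p := range pilots {
	idx, err := PilotIndex(p) // placeholder: ordinal parsed from the pilot name
	if err != nil {
		errs = append(errs, err)
		continue
	}
	if int64(idx) >= int64(a.NodePool.Replicas) {
		if err := markPilotForDecommission(p); err != nil {
			errs = append(errs, err)
		}
	}
}
return utilerror.NewAggregate(errs)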

Message: utilerror.NewAggregate(errs).Error(),
Reason: metav1.StatusReasonNotFound,
},
}

Never seen k8sErrors.StatusError used before. What's the purpose? I looked to see if we check for this error elsewhere, but couldn't see any such checks.
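
For context (this may or may not be the intent here): a k8sErrors.StatusError carries a metav1.Status, so a caller can match on the reason with the helpers in k8s.io/apimachinery/pkg/api/errors, e.g.:

// Sketch of the kind of check a StatusError enables; `err` is whatever the
// action returned, and no such check currently exists in this PR.
if k8sErrors.IsNotFound(err) {
	// the missing-pilot case could be handled specially here
}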

return &actions.ScaleIn{
Cluster: c,
NodePool: &np,
}, nil

I'm keen to keep this function unchanged :-)
I think it should be possible to instead check nps.ReadyReplicas here...if not, let's discuss.
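
A hedged sketch of what that could look like while keeping the existing NextAction signature (Status.NodePools and ReadyReplicas are assumed field names):

// Sketch: decide to scale in from the node pool status already carried on the
// cluster object, instead of threading a StatefulSet lister through NextAction.
nps, ok := c.Status.NodePools[np.Name]
if ok && nps.ReadyReplicas > np.Replicas {
	return &actions.ScaleIn{
		Cluster:  c,
		NodePool: &np,
	}, nil
}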

a, err := cassandra.NextAction(c, state.StatefulSetLister)
if err != nil {
t.Errorf("error calculating next action: %v", err)
}

And if we could keep the NextAction signature unchanged, we wouldn't need to add this extra stuff to the test.

func run(args ...string) error {
cmd := exec.Command(args[0], args[1:]...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr

I think we should capture the stdout / stderr here and also add it to the error that we return.
Otherwise I think it'll be difficult to diagnose decommission failures.
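
A sketch of that change, using CombinedOutput so the command's output ends up in the returned error (glog, fmt, and strings imports assumed):

// Sketch: capture stdout/stderr and fold them into the error so that
// decommission failures can be diagnosed from the pilot logs.
func run(args ...string) error {
	cmd := exec.Command(args[0], args[1:]...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("%q failed: %v, output: %s", strings.Join(args, " "), err, out)
	}
	glog.V(4).Infof("%q succeeded, output: %s", strings.Join(args, " "), out)
	return nil
}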

return run("nodetool", "decommission")
}

return nil

I think we should return an error if the node state != NodeStateNormal.
Otherwise the unhealthy node will be marked as decommissioned in the pilot status, right?
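
For example, something along these lines (NodeStateNormal comes from the comment above; nodeState() is a placeholder, not the PR's exact code):

// Sketch: only run the decommission when the node is in the normal state;
// otherwise return an error instead of nil, so an unhealthy node is not
// recorded as decommissioned in the pilot status.
state, err := p.nodeState()
if err != nil {
	return err
}
if state != NodeStateNormal {
	return fmt.Errorf("refusing to decommission node in state %q", state)
}
return run("nodetool", "decommission")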

if p.decommissionInProgress || true {
glog.Info("decommission in progress, reporting success for liveness")
return nil
}

Yikes, sorry about this.
I remember you saying that jolokia nodetool requests fail if the node has been decommissioned.
I wonder if we need to rethink this check.
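
One possible shape for a rethought check, gating only on actual decommission state rather than the unconditional short-circuit above (statusDecommissioned is a placeholder for however the pilot status records this):

// Sketch: treat a node that is decommissioning, or already recorded as
// decommissioned, as "live", since jolokia/nodetool requests fail once the
// node has left the ring; fall through to the normal nodetool check otherwise.
if p.decommissionInProgress || p.statusDecommissioned() {
	glog.Info("decommission in progress, reporting success for liveness")
	return nil
}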

@jetstack-bot (Collaborator):

@kragniz: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
