-
Notifications
You must be signed in to change notification settings - Fork 31
Allow configuring number of seed nodes per nodepool #264
base: master
Are you sure you want to change the base?
Conversation
I'll add an e2e test for this once #258 is merged |
5d712ff
to
179ba9c
Compare
/retest |
@kragniz PR needs rebase |
hack/e2e.sh
Outdated
<(envsubst \ | ||
'$NAVIGATOR_IMAGE_REPOSITORY:$NAVIGATOR_IMAGE_TAG:$NAVIGATOR_IMAGE_PULLPOLICY:$CASS_NAME:$CASS_REPLICAS:$CASS_CQL_PORT' \ | ||
< "${SCRIPT_DIR}/testdata/cass-cluster-test.template.yaml") | ||
apply_cassandracluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail the test if this fails? (like we do above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
hack/e2e.sh
Outdated
<(envsubst \ | ||
'$NAVIGATOR_IMAGE_REPOSITORY:$NAVIGATOR_IMAGE_TAG:$NAVIGATOR_IMAGE_PULLPOLICY:$CASS_NAME:$CASS_REPLICAS:$CASS_CQL_PORT' \ | ||
< "${SCRIPT_DIR}/testdata/cass-cluster-test.template.yaml") | ||
apply_cassandracluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail the test if this fails? (like we do above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
hack/e2e.sh
Outdated
@@ -325,6 +320,13 @@ function test_cassandracluster() { | |||
fail_test "Second cassandra node did not become ready" | |||
fi | |||
|
|||
seed_label=$(kubectl get pods --namespace "${namespace}" \ | |||
cass-${CASS_NAME}-ringnodes-1 \ | |||
-o jsonpath='{.metadata.labels.seed}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(may not be something that's true because of this PR, but we should always be using namespaced labels e.g. navigator.jetstack.io/cassandra-seed=true
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 this needs rebasing when #270 lands, which makes the label navigator.jetstack.io/cassandra-seed
@@ -7,3 +7,13 @@ import ( | |||
func addDefaultingFuncs(scheme *runtime.Scheme) error { | |||
return RegisterDefaults(scheme) | |||
} | |||
|
|||
func SetDefaults_CassandraClusterSpec(spec *CassandraClusterSpec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why default the cassandraclusterspec if we're only going to then iterate over each node pool in it anyway?
Surely this is better to be a SetDefaults_CassandraClusterNodePool
function?
This leads onto my next question.. what if a user wants to ensure there are 0 seeds in a given node pool (as they have seeds in other node pools). Is this something we want to disallow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it'd be cleaner with SetDefaults_CassandraClusterNodePool
.
We need to ensure that there's more than 0 seeds across all nodepools. We could reduce the minimum to 0 in a particular nodepool if we find a use-case for it (I can't find a reason to do it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - I've no issue requiring minimum one seed per node pool for now. We can revisit it in future if needed like you say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #272 so we remember
|
||
if np.Seeds < 0 { | ||
allErrs = append(allErrs, | ||
field.Invalid(fldPath.Child("seeds"), np.Seeds, "number of seeds must be 1 or greater"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... must be greater than or equal to 1
} | ||
|
||
// only label if the current label is incorrect | ||
if pod.Labels["seed"] != desiredLabel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, not something created by this PR, but seed
isn't an acceptable label imo. difficult for a user to understand why it is there, and could be removed by accident.
2302e30
to
f3210eb
Compare
d2e1867
to
10b6cc6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kragniz
Looks good.
A few questions:
- I wonder if administrators will know how many seeds they need.
- Did you consider just hard coding it to 3 seeds per nodepool?
- Or perhaps making it a
seedsPerNodes
attribute...which might allow the administrator to ask for e.g. 1 seed per 10 C* nodes? - And in that case, perhaps it could be a cluster wide setting rather than a per-nodepool setting?
- Is there any problem with having different numbers of seeds per rack / DC?
In addition I left a few comments and questions below. Please answer or address those before merging.
pkg/apis/navigator/v1alpha1/types.go
Outdated
@@ -64,6 +64,11 @@ type CassandraClusterNodePool struct { | |||
// in this nodepool. If this is not set, a default will be selected. | |||
// +optional | |||
Datacenter string `json:"datacenter"` | |||
|
|||
// Seeds specifies the number of seed nodes to alocate in this nodepool. By |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo "alocate"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -187,6 +187,15 @@ if [[ "test_elasticsearchcluster" = "${TEST_PREFIX}"* ]]; then | |||
kube_delete_namespace_and_wait "${ES_TEST_NS}" | |||
fi | |||
|
|||
function apply_cassandracluster() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❔ The other bash functions have parameters (of sorts) so it might be nice to be consistent. On the other hand it looks like we're moving away from bash based E2E tests to happy for you to leave this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it'd be nice-to-have a check that all the environment variables that are about to be substituted are actually set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's fix these e2e things when we move to ginkgo
if ! apply_cassandracluster | ||
then | ||
fail_test "Failed to apply cassandracluster" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❔ personally, I wouldn't bother with the fail_test
part here. I don't regard "applying the desired configuration" as an E2E test....just an implementation detail of the test....so I wouldn't bother adding a specific test error message.
Also, if this command fails, then all bets are off and we should probably just exit the test so that it's easy to debug the problem.
But if you prefer to leave as-is, until we land the gingko test branch, I'm happy with that.
// default to 1 seed if not specified | ||
if np.Seeds == 0 { | ||
np.Seeds = 1 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the validation prevent seeds: 0
?
allErrs = append(allErrs, | ||
field.Invalid(fldPath.Child("seeds"), np.Seeds, "number of seeds must be greater than or equal to 1"), | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message says "must be greater than or equal to 1" but the test is np.Seeds < 0
pod, err := c.pods.Pods(cluster.Namespace).Get(fmt.Sprintf("%s-%d", set.Name, i)) | ||
if err != nil { | ||
glog.Warningf("Couldn't get stateful set pod: %v", err) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❔ I'm not sure about this. In other controllers we would return this error, but I can see why we'd want to continue and attempt to label other pods if one of the Get
s fails.
Could we collect the errors and return an aggregate error at the end?
See k8s.io/apimachinery/pkg/util/errors
NewAggregate
set, err := c.statefulSetLister.StatefulSets(cluster.Namespace).Get(setName) | ||
if err != nil { | ||
glog.Warningf("Couldn't get stateful set: %v", err) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an aggregate error here too.
navObjects: []runtime.Object{cluster}, | ||
cluster: cluster, | ||
assertions: func(t *testing.T, state *controllers.State) { | ||
CheckSeedLabel(pod2.Name, "", pod2.Namespace, t, state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Add a test for deleting label when the seeds value is reduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
10b6cc6
to
92fbfe0
Compare
I've changed |
That 1.7 E2E test failure is a bit weird.
The command |
@kragniz PR needs rebase |
pkg/util/util.go
Outdated
@@ -11,3 +11,7 @@ func CalculateQuorum(num int32) int32 { | |||
} | |||
return (num / 2) + 1 | |||
} | |||
|
|||
func Int64Ptr(i int64) *int64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be removed since #296
8988673
to
3541a6d
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@kragniz: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Should we merge this now, or wait until the actions stuff has merged and redesign this for that new structure? (targeting 0.2) |
@kragniz: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This adds a new field to cassandra nodepools,
seeds
, which controls the number of seed nodes in that nodepool. This defaults to 1, and cannot be greater than the number of replicas.