Create Cassandra Pilot resources #153
Force-pushed from 93abcb6 to 8d2c57d
Force-pushed from 8d2c57d to fd4494e
Force-pushed from fd4494e to faf367c
/test e2e
Force-pushed from faf367c to b56499f

/test e2e
Looks good - only a few changes to make!
```go
}
}
}
return nil
```
So I'm not too sure if we should be doing this right now (deleting 'unused' pilot resources). It's difficult to define unused, and in fact, the lack of a pod does not immediately imply that the Pilot resource is no longer in use. I'd rather do something along the lines of waiting N seconds since the last heartbeat (provided no corresponding pod exists) before deleting.
Right now the Elasticsearch controller doesn't actually clean up old Pilots, it just cleans up old StatefulSets (i.e. ones no longer specified as a nodepool on the ElasticsearchCluster resource). Should we make these the same? Consistency in this sort of thing probably makes it easier to reason about. At some point I think we can refactor the Pilot control loop to make it mostly generic between DB types.
> So I'm not too sure if we should be doing this right now (deleting 'unused' pilot resources). It's difficult to define unused, and in fact, the lack of a pod does not immediately imply that the Pilot resource is no longer in use. I'd rather do something along the lines of waiting N seconds since the last heartbeat (provided no corresponding pod exists) before deleting.
But what's the bug that the heartbeat / delay would fix?
If there is a bug, then I can add the heartbeat / delay in a followup branch.
But for now, removing unused Pilot resources seems like good housekeeping.
The heartbeat is a suggestion for how it can be handled, but as you say, should be in a follow up.
I think that we can't say the lack of a pod = the pilot doesn't exist. In the ES controller, for example, the 'master' Pilot uses Pilot resources to determine whether a cluster is in a state ready to scale down. This scenario, as far as I can tell, will cause failures if we auto-delete Pilots when no Pod exists:
- User requests scale down of cluster
- Navigator updates a Pilot resource to 'decommission' it
- Because the user is a sadist, or because something happened, the pod that is being decommissioned gets deleted.
- This should cause the StatefulSet controller to re-create the pod, really, as the Pilot is not finished decommissioning so it needs to come back in order to finish
- However, because we immediately deleted the Pilot resource upon the Pod disappearing, the 'knowledge' of those remaining documents/indices left on that node (i.e. the `pilot.status.elasticsearch.documentCount` field) has been lost.
- So now our master Pilot is not aware of those extra documents, nor the fact that a Pilot is being decommissioned (as that knowledge has also been lost).
- At this point we've reached a weird state, especially if we consider that the only record of the state of the world is stored in the k8s API, and anything in-memory should not be relied upon.

I do agree we need to do housekeeping, but I think we should consider carefully how this is done, as 'unused' is a tricky term. Additionally, given that the Pilot resource in the API is one of the user's primary points for debugging issues with their clusters, by deleting Pilots so frequently we are also destroying a lot of useful debugging information. I'd much prefer the behaviour between all DB types to be consistent: the first steps in debugging an ES cluster should be the same as for Cassandra, and vice versa.
^ my only outstanding/current issue with this PR before I'm happy to merge 😄 .
Let me know if you disagree, but any change we make here needs to be consistent with ES too given this touches a user-facing resource (the Pilot resource)
It seems to me that in the scenario above,
- Accidentally deleted pod gets restarted by StatefulSet Controller.
- Pilot process waits for its Pilot resource to appear
- Pilot starts ES sub-process
- Pilot re-updates the Pilot status with document count 0
- navigator controller waits for document count 0 and then decrements the Statefulset replicas count
- and removes the drained pod and corresponding pilot.
But I agree that we should keep ES and Cass in sync, so I'll remove the Pilot cleanup code for now.
```go
}
desiredPilot = existingPilot.DeepCopy()
updatePilotForCluster(cluster, pod, desiredPilot)
_, err = client.Update(desiredPilot)
```
So from what I can see here, this will cause Navigator to go into a sync loop, continuously updating the Pilot resource. We don't ever check if the Pilot is already reconciled (e.g. its spec is up to date), so we end up updating it to contain the same spec as is already there. This will still increment the ResourceVersion, causing this loop to be triggered once again (and so on).
There are a couple of ways (that I can think of) to deal with this:
- Perform a `reflect.DeepEqual` on the `pilot.Spec` before calling Update. This has the downside of being inefficient (DeepEqual is an expensive call), however it is accurate and easy to do.
- Write a hash of the spec into the annotations of the Pilot, so that we can quickly compare the old and new hashes.
- Manually compare each field, but this is tedious and error prone.
Can you think of anything else here?
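The second option (a hash annotation) can be sketched like this. The annotation key, the `PilotSpec` fields, and the helper names are illustrative assumptions, not the real navigator code; the real implementation would hash `v1alpha1.PilotSpec` plus the controller-managed labels.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// PilotSpec and Pilot are simplified stand-ins for the real API types.
type PilotSpec struct {
	Phase string
}

type Pilot struct {
	Annotations map[string]string
	Labels      map[string]string
	Spec        PilotSpec
}

// Assumed annotation key, for illustration only.
const hashAnnotation = "navigator.jetstack.io/pilot-hash"

// computeHash hashes the fields the controller manages (spec and labels)
// so a cheap string comparison can replace reflect.DeepEqual.
func computeHash(p *Pilot) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%#v/%#v", p.Spec, p.Labels)
	return fmt.Sprintf("%x", h.Sum32())
}

// needsUpdate reports whether the desired state differs from what the
// stored hash annotation says was last written, without a deep compare.
func needsUpdate(existing, desired *Pilot) bool {
	return existing.Annotations[hashAnnotation] != computeHash(desired)
}

func main() {
	desired := &Pilot{Spec: PilotSpec{Phase: "Started"}, Labels: map[string]string{"app": "cassandra"}}
	existing := &Pilot{Annotations: map[string]string{hashAnnotation: computeHash(desired)}}
	fmt.Println(needsUpdate(existing, desired)) // false: hashes match, skip Update
}
```

When the hashes match, the controller skips the Update call entirely, so the ResourceVersion is not bumped and the sync loop quiesces.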
Additionally, we need to update the strategy.go file for Cassandra resources to not perform updates to the `cassandracluster.status` block when plain Update is called (instead, `UpdateStatus` should be used if the Status block needs updating).

Right now, each time Update is called we will wipe out all fields in Status, causing Pilots and Navigator to fight with each other indefinitely (causing more loops)
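The strategy.go fix amounts to a PrepareForUpdate hook along these lines. The struct and function here are trivial stand-ins for the real CassandraCluster types and the apiserver strategy interface, just to show the status-preservation step:

```go
package main

import "fmt"

// CassandraCluster is a simplified stand-in for the real resource.
type CassandraCluster struct {
	Spec   string
	Status string
}

// prepareForUpdate mirrors what a strategy's PrepareForUpdate hook should
// do on a plain Update: copy the stored Status over the incoming object,
// so that only UpdateStatus (served by a separate status strategy) can
// ever change the status block.
func prepareForUpdate(newObj, oldObj *CassandraCluster) {
	newObj.Status = oldObj.Status
}

func main() {
	stored := &CassandraCluster{Spec: "replicas: 3", Status: "3 pods ready"}
	incoming := &CassandraCluster{Spec: "replicas: 5"} // client sent an empty status
	prepareForUpdate(incoming, stored)
	fmt.Println(incoming.Status) // status preserved across the plain Update
}
```

With this in place, a controller calling plain Update can no longer wipe out status fields written by the Pilots, which removes one source of the update fight described above.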
> Write a hash of the spec into the annotations of the Pilot, so that we can quickly compare the old and new hashes.
I've gone with that option. I found the ES hashing code and adapted that. Also added labels to the hash since the controller updates those too.
> Right now, each time Update is called we will wipe out all fields in Status, causing Pilots and Navigator to fight with each other indefinitely (causing more loops)

I don't think that's the case. I'm taking the latest Pilot from the lister and then updating the labels (and, in future, the spec), not replacing `Pilot.Status`.
```go
	appslisters "k8s.io/client-go/listers/apps/v1beta1"
)

func PodControlledByCluster(
```
Can we also update the Elasticsearch controller to use this function too? This version doesn't depend on any types like `*ElasticsearchCluster`, unlike the one the controller currently uses.
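The type-agnostic check boils down to comparing controller owner-reference UIDs. This sketch uses simplified local structs in place of `metav1.OwnerReference` and the real cluster/pod types, and only shows direct ownership (the real helper also walks through the owning StatefulSet via a lister, as the `c.statefulSets` argument further down suggests):

```go
package main

import "fmt"

// OwnerReference and Object are simplified stand-ins for the
// corresponding metav1 types.
type OwnerReference struct {
	UID        string
	Controller bool
}

type Object struct {
	UID             string
	OwnerReferences []OwnerReference
}

// podControlledByCluster reports whether the pod's controlling owner
// reference points at the cluster's UID, without depending on the
// cluster's concrete type (*ElasticsearchCluster, *CassandraCluster, ...).
func podControlledByCluster(cluster, pod *Object) bool {
	for _, ref := range pod.OwnerReferences {
		if ref.Controller && ref.UID == cluster.UID {
			return true
		}
	}
	return false
}

func main() {
	cluster := &Object{UID: "cluster-uid"}
	pod := &Object{OwnerReferences: []OwnerReference{{UID: "cluster-uid", Controller: true}}}
	fmt.Println(podControlledByCluster(cluster, pod)) // true
}
```

Keying on UIDs rather than concrete Go types is what lets both the Elasticsearch and Cassandra controllers share one implementation.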
Done.
```go
pilot := &v1alpha1.Pilot{}
ownerRefs := pilot.GetOwnerReferences()
ownerRefs = append(ownerRefs, util.NewControllerRef(cluster))
pilot.SetOwnerReferences(ownerRefs)
```
Don't need so many steps to do this bit - we aren't mutating an existing Pilot, but creating a new one instead (so don't need to use function accessors)
Done.
```go
) *v1alpha1.Pilot {
	pilot.SetName(pod.GetName())
	pilot.SetNamespace(cluster.GetNamespace())
	pilot.SetLabels(util.ClusterLabels(cluster))
```
We should be careful not to override any user provided labels, or labels that may have been added by the pilot (we don't do this right now, but we might do at some point)
Done.
Force-pushed from b56499f to cefc69a
Force-pushed from cefc69a to a619d42
* For every pod in the nodepool, create a corresponding Pilot resource.
* Ignore Pods that are not owned by the cluster.
* Stop and error if we encounter Pilots with the expected name, but not owned by the cluster.

Fixes: jetstack#152
Force-pushed from a619d42 to a4c6d7a
/test e2e
/retest
/retest
/retest
I restarted the build infra workers
/retest
/retest
```go
	p.ObjectMeta,
	p.Labels,
}
hasher := fnv.New32()
```
Can this be pulled out to a global var (to save calling New32 each time)?

Happy for this to be a follow-up, as I'll be making similar changes in ES.
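One caveat with a plain global: fnv hashers are stateful (`hash.Hash32` accumulates writes), so a single shared instance would not be goroutine-safe in a controller with multiple workers. A `sync.Pool` is one stdlib way to avoid the per-call allocation while staying safe; this is a sketch of the idea, not the project's actual code:

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
	"sync"
)

// hasherPool reuses fnv hashers across calls instead of allocating a
// fresh one each time; Pool handles concurrent Get/Put safely.
var hasherPool = sync.Pool{
	New: func() interface{} { return fnv.New32() },
}

// hashBytes hashes b with a pooled FNV-1 32-bit hasher, resetting the
// hasher before use since it may carry state from a previous caller.
func hashBytes(b []byte) uint32 {
	h := hasherPool.Get().(hash.Hash32)
	defer hasherPool.Put(h)
	h.Reset()
	h.Write(b)
	return h.Sum32()
}

func main() {
	fmt.Println(hashBytes([]byte("pilot-spec")) == hashBytes([]byte("pilot-spec"))) // true: deterministic
}
```

A plain package-level `hash.Hash32` would only be safe if every use site serialised access, so the pool (or simply keeping the `New32` call) is the less error-prone choice.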
```go
	c.statefulSets,
)
if err != nil {
	return clusterPods, err
```
Should probably return `nil, err` here instead of a partial list of pods.
```go
}
err = util.OwnerCheck(existingPilot, cluster)
if err != nil {
	return err
```
TODO: should there be some way to detect that this function has failed because the existing pilot is owned by another cluster? (not important for merge)
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS files.

You can indicate your approval by writing

/test all [submit-queue is verifying that this PR is safe to merge]

Automatic merge from submit-queue.
Fixes: #152
Release note: