This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Create Cassandra Pilot resources #153

Merged

Conversation

wallrj
Member

@wallrj wallrj commented Nov 27, 2017

  • For every pod in the nodepool, create a corresponding Pilot resource.
  • Delete Pilot resources for which there is no corresponding Pod.
  • Ignore Pods that are not owned by the cluster.
  • Stop and report an error if we encounter Pilots that have the expected name but are not owned by the cluster.

Fixes: #152

Release note:

NONE

@wallrj wallrj changed the title Create Cassandra Pilot resources WIP: Create Cassandra Pilot resources Nov 28, 2017
@wallrj
Member Author

wallrj commented Nov 28, 2017

This isn't working yet.
I forgot that the ownerReferences point to the StatefulSet, not to the CassandraCluster.
I'm looking at whether it's possible to set additional ownerReferences from the StatefulSet template.

@wallrj wallrj force-pushed the 152-cassandra-pilot-resource branch 3 times, most recently from 93abcb6 to 8d2c57d Compare November 28, 2017 22:55
@wallrj wallrj changed the title WIP: Create Cassandra Pilot resources Create Cassandra Pilot resources Nov 28, 2017
@wallrj wallrj force-pushed the 152-cassandra-pilot-resource branch from 8d2c57d to fd4494e Compare November 30, 2017 15:53
@wallrj wallrj force-pushed the 152-cassandra-pilot-resource branch from fd4494e to faf367c Compare December 13, 2017 17:44
@wallrj wallrj changed the base branch from 23-cassandra to master December 13, 2017 17:44
@wallrj
Member Author

wallrj commented Dec 14, 2017

/test e2e

@wallrj
Member Author

wallrj commented Dec 14, 2017

I've updated this PR against master.
There are two other PRs stacked on top of this (#142, #162), but I've marked those as WIP to avoid confusion.
Ready for review.
/cc @munnerz

@wallrj
Member Author

wallrj commented Dec 14, 2017

/test e2e

Contributor

@munnerz munnerz left a comment

Looks good - only a few changes to make!

}
}
}
return nil
Contributor

So I'm not too sure if we should be doing this right now (deleting 'unused' pilot resources). It's difficult to define unused, and in fact, the lack of a pod does not immediately imply that the Pilot resource is no longer in use. I'd rather do something along the lines of waiting N seconds since the last heartbeat (provided no corresponding pod exists) before deleting.

Right now the Elasticsearch controller doesn't actually clean up old Pilots; it just cleans up old StatefulSets (i.e. ones no longer specified as a nodepool on the ElasticsearchCluster resource). Should we make these the same? Consistency in this sort of thing probably makes it easier to reason about. At some point I think we can refactor the Pilot control loop to make it mostly generic between DB types.
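
A minimal sketch of the heartbeat-and-grace-period idea above; the grace period, field names and helper are illustrative only, nothing here exists in this PR:

```go
package main

import (
	"fmt"
	"time"
)

// pilotHeartbeatGracePeriod is an illustrative value; a real implementation
// would make this configurable.
const pilotHeartbeatGracePeriod = 5 * time.Minute

// pilotSafeToDelete sketches the suggested rule: only delete a Pilot whose
// Pod no longer exists AND whose (hypothetical) last heartbeat is older than
// the grace period.
func pilotSafeToDelete(podExists bool, lastHeartbeat time.Time) bool {
	if podExists {
		return false
	}
	if lastHeartbeat.IsZero() {
		// Never heartbeated: be conservative and keep the Pilot.
		return false
	}
	return time.Since(lastHeartbeat) > pilotHeartbeatGracePeriod
}

func main() {
	fmt.Println(pilotSafeToDelete(false, time.Now().Add(-10*time.Minute))) // true: safe to delete
	fmt.Println(pilotSafeToDelete(false, time.Now()))                      // false: heartbeat too recent
}
```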

Member Author

> So I'm not too sure if we should be doing this right now (deleting 'unused' pilot resources). It's difficult to define unused, and in fact, the lack of a pod does not immediately imply that the Pilot resource is no longer in use. I'd rather do something along the lines of waiting N seconds since the last heartbeat (provided no corresponding pod exists) before deleting.

But what's the bug that the heartbeat / delay would fix?
If there is a bug, then I can add the heartbeat / delay in a follow-up branch.
For now, though, removing unused Pilot resources seems like good housekeeping.

Contributor

The heartbeat is a suggestion for how it could be handled, but as you say, it should be in a follow-up.

I think we can't say that the lack of a Pod means the Pilot no longer exists. In the ES controller, for example, the 'master' Pilot uses Pilot resources to determine whether the cluster is in a state ready to scale down. As far as I can tell, the following scenario will cause failures if we auto-delete Pilots when no Pod exists:

  • User requests scale down of the cluster.
  • Navigator updates a Pilot resource to 'decommission' it.
  • Because the user is a sadist, or because something happened, the pod that is being decommissioned gets deleted.
  • This should cause the StatefulSet controller to re-create the pod, as the Pilot has not finished decommissioning and needs to come back in order to finish.
  • However, because we immediately deleted the Pilot resource when the Pod disappeared, the 'knowledge' of the remaining documents/indices left on that node (i.e. the pilot.status.elasticsearch.documentCount field) has been lost.
  • So now our master Pilot is not aware of those extra documents, nor of the fact that a Pilot is being decommissioned (as that knowledge has also been lost).
  • At this point we've reached a weird state, especially if we consider that the only record of the state of the world is what is stored in the k8s API, and anything in-memory should not be relied upon.

I do agree we need to do housekeeping, but I think we should consider carefully how this is done, as 'unused' is a tricky term. Additionally, given that the Pilot resource in the API is one of the user's primary points for debugging issues with their clusters, by deleting Pilots so frequently we are also destroying a lot of useful debugging information. I'd much prefer the behaviour between all DB types to be consistent: the first steps in debugging an ES cluster should be the same as for Cassandra, and vice versa.

Contributor

^ My only outstanding issue with this PR before I'm happy to merge 😄.

Let me know if you disagree, but any change we make here needs to be consistent with ES too, given that this touches a user-facing resource (the Pilot resource).

Member Author

It seems to me that in the scenario above:

  • The accidentally deleted pod gets restarted by the StatefulSet controller.
  • The pilot process waits for its Pilot resource to appear.
  • The pilot starts the ES sub-process.
  • The pilot re-updates the Pilot status with a document count of 0.
  • The navigator controller waits for the document count to reach 0, then decrements the StatefulSet replicas count
  • and removes the drained pod and the corresponding Pilot.

But I agree that we should keep ES and Cass in sync, so I'll remove the Pilot cleanup code for now.

}
desiredPilot = existingPilot.DeepCopy()
updatePilotForCluster(cluster, pod, desiredPilot)
_, err = client.Update(desiredPilot)
Contributor

So from what I can see here, this will cause Navigator to go into a sync loop, continuously updating the Pilot resource. We never check whether the Pilot is already reconciled (e.g. its spec is up to date), so we end up updating it to contain the same spec that is already there. This still increments the ResourceVersion, causing this loop to be triggered once again (and so on).

There are a couple of ways (that I can think of) to deal with this:

  • Perform a reflect.DeepEqual on the pilot.Spec before calling Update. This has the downside of being inefficient (DeepEqual is an expensive call), but it is accurate and easy to do.

  • Write a hash of the spec into the annotations of the Pilot, so that we can quickly compare the old and new hashes.

  • Manually compare each field, but this is tedious and error-prone.

Can you think of anything else here?
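
For reference, a minimal standalone sketch of the first option, using stand-in types rather than the real navigator v1alpha1 API:

```go
package main

import (
	"fmt"
	"reflect"
)

// PilotSpec and Pilot stand in for the real navigator v1alpha1 types; the
// real update path goes through the generated clientset.
type PilotSpec struct {
	Decommissioned bool
}

type Pilot struct {
	Labels map[string]string
	Spec   PilotSpec
}

// needsUpdate reports whether an Update call is required at all. Calling
// Update unconditionally bumps ResourceVersion and re-triggers the sync loop.
func needsUpdate(existing, desired *Pilot) bool {
	return !reflect.DeepEqual(existing.Spec, desired.Spec) ||
		!reflect.DeepEqual(existing.Labels, desired.Labels)
}

func main() {
	existing := &Pilot{Labels: map[string]string{"app": "cassandra"}}
	desired := &Pilot{Labels: map[string]string{"app": "cassandra"}}
	fmt.Println(needsUpdate(existing, desired)) // false: skip the Update, no hot loop
}
```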

Contributor

Additionally, we need to update the strategy.go file for Cassandra resources to not perform updates to the cassandracluster.status block when plain Update is called (and instead, UpdateStatus should be used if the Status block needs updating).

Right now, each time Update is called we will wipe out all fields in Status, causing Pilots and Navigator to fight with each other indefinitely (causing more loops)
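
A sketch of the intended split, using simplified stand-in types rather than the actual strategy.go (which implements the REST update strategies from k8s.io/apiserver):

```go
package main

import "fmt"

// Simplified stand-ins for the CassandraCluster types.
type CassandraClusterSpec struct{ Replicas int32 }
type CassandraClusterStatus struct{ NodePools map[string]int32 }
type CassandraCluster struct {
	Spec   CassandraClusterSpec
	Status CassandraClusterStatus
}

// prepareForUpdate mirrors what the main resource strategy should do:
// discard any status changes submitted through plain Update, so that only
// UpdateStatus (the /status subresource) can modify Status.
func prepareForUpdate(newObj, oldObj *CassandraCluster) {
	newObj.Status = oldObj.Status
}

// prepareStatusForUpdate is the converse, for the status strategy: keep the
// old Spec and accept only the Status change.
func prepareStatusForUpdate(newObj, oldObj *CassandraCluster) {
	newObj.Spec = oldObj.Spec
}

func main() {
	oldObj := &CassandraCluster{Status: CassandraClusterStatus{NodePools: map[string]int32{"np1": 3}}}
	newObj := &CassandraCluster{Spec: CassandraClusterSpec{Replicas: 5}} // Update submitted with an empty Status
	prepareForUpdate(newObj, oldObj)
	fmt.Println(newObj.Status.NodePools["np1"]) // 3: the status survived the plain Update
}
```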

Member Author

> Write a hash of the spec into the annotations of the Pilot, so that we can quickly compare the old and new hashes.

I've gone with that option. I found the ES hashing code and adapted it. I also added the labels to the hash, since the controller updates those too.
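
Roughly, the approach looks like this; the annotation key, types and helper names below are illustrative stand-ins, not the actual adapted ES code:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashAnnotationKey is an illustrative name, not the key used by the adapted
// Elasticsearch hashing code.
const hashAnnotationKey = "navigator.jetstack.io/pilot-hash"

type PilotSpec struct{ Decommissioned bool }

type Pilot struct {
	Labels      map[string]string
	Annotations map[string]string
	Spec        PilotSpec
}

// computeHash renders the fields the controller manages (labels and spec)
// and hashes the result with FNV-32. fmt prints map keys in sorted order, so
// the rendering is deterministic.
func computeHash(p *Pilot) string {
	hasher := fnv.New32()
	fmt.Fprintf(hasher, "%#v/%#v", p.Labels, p.Spec)
	return fmt.Sprintf("%x", hasher.Sum32())
}

// needsUpdate compares the hash stored in the existing Pilot's annotations
// with the hash of the desired state, so no-op Update calls can be skipped.
func needsUpdate(existing, desired *Pilot) bool {
	return existing.Annotations[hashAnnotationKey] != computeHash(desired)
}

func main() {
	desired := &Pilot{Labels: map[string]string{"app": "cassandra"}}
	existing := &Pilot{Annotations: map[string]string{hashAnnotationKey: computeHash(desired)}}
	fmt.Println(needsUpdate(existing, desired)) // false: hashes match, skip the Update
}
```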

Member Author

> Right now, each time Update is called we will wipe out all fields in Status, causing Pilots and Navigator to fight with each other indefinitely (causing more loops)

I don't think that's the case. I'm taking the latest Pilot from the lister and then updating the labels (and, in future, the spec), not replacing Pilot.Status.

appslisters "k8s.io/client-go/listers/apps/v1beta1"
)

func PodControlledByCluster(
Contributor

Can we also update the Elasticsearch controller to use this function too? This version doesn't depend on any types like *ElasticsearchCluster, unlike the one the controller currently uses.
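
For reference, a hedged sketch of the ownership walk such a shared helper performs; the exact signature in the PR may differ:

```go
package util

import (
	core "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	appslisters "k8s.io/client-go/listers/apps/v1beta1"
)

// PodControlledByCluster reports whether a Pod is (indirectly) owned by the
// cluster: its controller must be a StatefulSet whose controller reference,
// in turn, carries the cluster's UID. Taking the cluster as a plain
// metav1.Object is what keeps the helper independent of *ElasticsearchCluster
// or *CassandraCluster. This illustrates the idea, not the exact code in the PR.
func PodControlledByCluster(
	cluster metav1.Object,
	pod *core.Pod,
	setLister appslisters.StatefulSetLister,
) (bool, error) {
	podRef := metav1.GetControllerOf(pod)
	if podRef == nil || podRef.Kind != "StatefulSet" {
		// Not controlled by a StatefulSet, so not part of a nodepool.
		return false, nil
	}
	set, err := setLister.StatefulSets(pod.Namespace).Get(podRef.Name)
	if err != nil {
		return false, err
	}
	setRef := metav1.GetControllerOf(set)
	if setRef == nil {
		return false, nil
	}
	return setRef.UID == cluster.GetUID(), nil
}
```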

Member Author

Done.

pilot := &v1alpha1.Pilot{}
ownerRefs := pilot.GetOwnerReferences()
ownerRefs = append(ownerRefs, util.NewControllerRef(cluster))
pilot.SetOwnerReferences(ownerRefs)
Contributor

We don't need so many steps to do this bit: we aren't mutating an existing Pilot, we're creating a new one, so we don't need to use the function accessors.
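
Something like the following literal covers the same ground in one step (a sketch reusing the v1alpha1, util.ClusterLabels and util.NewControllerRef identifiers from the diff above; the wrapper function name is hypothetical):

```go
// newPilotForPod is an illustrative wrapper; the identifiers it uses are the
// ones already present in the surrounding diff.
func newPilotForPod(cluster *v1alpha1.CassandraCluster, pod *core.Pod) *v1alpha1.Pilot {
	return &v1alpha1.Pilot{
		ObjectMeta: metav1.ObjectMeta{
			Name:            pod.Name,
			Namespace:       cluster.Namespace,
			Labels:          util.ClusterLabels(cluster),
			OwnerReferences: []metav1.OwnerReference{util.NewControllerRef(cluster)},
		},
	}
}
```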

Member Author

Done.

) *v1alpha1.Pilot {
pilot.SetName(pod.GetName())
pilot.SetNamespace(cluster.GetNamespace())
pilot.SetLabels(util.ClusterLabels(cluster))
Contributor

We should be careful not to override any user-provided labels, or labels that may have been added by the pilot (we don't do this right now, but we might at some point).
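
A small standalone sketch of the non-destructive alternative, merging controller-owned labels into whatever is already present:

```go
package main

import "fmt"

// mergeLabels sketches the non-destructive approach: start from whatever
// labels the Pilot already carries (user-provided or pilot-added) and only
// set the keys the controller owns.
func mergeLabels(existing, controllerOwned map[string]string) map[string]string {
	merged := map[string]string{}
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range controllerOwned {
		merged[k] = v
	}
	return merged
}

func main() {
	existing := map[string]string{"team": "payments"}               // user-provided label
	owned := map[string]string{"app": "cassandra", "cluster": "c1"} // e.g. from util.ClusterLabels
	fmt.Println(mergeLabels(existing, owned)) // keeps "team" and adds the cluster labels
}
```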

Member Author

Done.

* For every pod in the nodepool, create a corresponding Pilot resource.
* Ignore Pods that are not owned by the cluster.
* Stop and report an error if we encounter Pilots that have the expected name but are not owned by the cluster.

Fixes: jetstack#152
@wallrj wallrj force-pushed the 152-cassandra-pilot-resource branch from a619d42 to a4c6d7a Compare January 9, 2018 10:54
@wallrj
Member Author

wallrj commented Jan 9, 2018

/test e2e

@munnerz
Contributor

munnerz commented Jan 9, 2018

/retest

@wallrj
Member Author

wallrj commented Jan 9, 2018

/retest

@munnerz
Contributor

munnerz commented Jan 9, 2018

/retest

@munnerz
Contributor

munnerz commented Jan 9, 2018

I restarted the build infra workers

/retest

@munnerz
Contributor

munnerz commented Jan 9, 2018

/retest

p.ObjectMeta,
p.Labels,
}
hasher := fnv.New32()
Contributor

Can this be pulled out to be a global var (to save calling New32 each time)?

I'm happy for this to be a follow-up, as I'll be making similar changes in ES.

c.statefulSets,
)
if err != nil {
return clusterPods, err
Contributor

Should probably return nil, err here instead of a partial list of pods.

}
err = util.OwnerCheck(existingPilot, cluster)
if err != nil {
return err
Contributor

TODO: should there be some way to detect that this function has failed because the existing pilot is owned by another cluster? (not important for merge)

@munnerz
Contributor

munnerz commented Jan 9, 2018

/lgtm
/approve

@jetstack-ci-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@jetstack-ci-bot
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@jetstack-ci-bot
Contributor

Automatic merge from submit-queue.

@jetstack-ci-bot jetstack-ci-bot merged commit 0e6f38c into jetstack:master Jan 9, 2018