
Delete pods if StatefulSet object is updated #307

Merged
merged 2 commits into master from indradhanush/delete-pods-on-sts-change
Jul 9, 2018

Conversation

indradhanush
Contributor

A supervisor bug prevents pods from successfully completing a leader
election when an update is made to the StatefulSet object. This arises
from the RollingUpdate behaviour of StatefulSet.

However, deleting them all at once fixes the problem. In this commit
we compare the older StatefulSet object with the updated one and, if
they differ, delete all pods that belong to that object. This check is
necessary because the StatefulSet keeps getting updated, even though
nothing has changed, as a result of the ConfigMap cache being
refreshed frequently.

More about the bug here: habitat-sh/habitat#5264
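The comparison described above can be sketched roughly as follows. This is a minimal, self-contained illustration using stand-in types; the actual controller compares the real v1beta2.StatefulSet objects:

```go
package main

import (
	"fmt"
	"reflect"
)

// Stand-in for the relevant parts of a StatefulSet spec; the real
// controller works with the full v1beta2.StatefulSet object.
type StatefulSetSpec struct {
	Replicas int32
	Image    string
}

// specChanged mirrors the check described in the commit message: only
// when the object actually differs before and after the update do we
// delete the pods, so the frequent no-op updates caused by the
// ConfigMap cache refresh are ignored.
func specChanged(oldSpec, newSpec StatefulSetSpec) bool {
	return !reflect.DeepEqual(oldSpec, newSpec)
}

func main() {
	old := StatefulSetSpec{Replicas: 3, Image: "core/redis:1"}
	same := old
	updated := StatefulSetSpec{Replicas: 3, Image: "core/redis:2"}

	fmt.Println(specChanged(old, same))    // false: no pod deletion
	fmt.Println(specChanged(old, updated)) // true: delete all pods at once
}
```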

Contributor

@krnowak krnowak left a comment


Questions mostly.

@@ -748,3 +766,18 @@ func (hc *HabitatController) findConfigMapInCache(cm *apiv1.ConfigMap) (*apiv1.C

return obj.(*apiv1.ConfigMap), nil
}

func (hc *HabitatController) deleteStatefulSetPods(hab *habv1beta1.Habitat, sts *v1beta2.StatefulSet) error {
Contributor


The sts parameter seems to be unused here.

Contributor


Also, should we set the update strategy back to OnDelete to avoid interfering with the StatefulSet controller while it is busy deleting the pods?
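For reference, the OnDelete strategy mentioned here lives in the StatefulSet spec. A sketch with stand-in types (the real field is apps/v1beta2's UpdateStrategy; the constant names below are simplified stand-ins):

```go
package main

import "fmt"

// Stand-ins for the update-strategy types in apps/v1beta2.
type UpdateStrategy struct {
	Type string
}

type STSSpec struct {
	UpdateStrategy UpdateStrategy
}

const (
	RollingUpdateStrategyType = "RollingUpdate"
	OnDeleteStrategyType      = "OnDelete"
)

func main() {
	// With OnDelete, the statefulset controller does not replace pods
	// when the spec changes; pods are only recreated after being
	// deleted, so the operator's bulk deletion does not race with a
	// concurrent rolling update.
	spec := STSSpec{
		UpdateStrategy: UpdateStrategy{Type: OnDeleteStrategyType},
	}
	fmt.Println(spec.UpdateStrategy.Type) // OnDelete
}
```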

func (hc *HabitatController) deleteStatefulSetPods(hab *habv1beta1.Habitat, sts *v1beta2.StatefulSet) error {
fs := fields.SelectorFromSet(fields.Set(map[string]string{
habv1beta1.HabitatLabel: "true",
habv1beta1.HabitatNameLabel: hab.Name,
Contributor


Hm, don't pods have any StatefulSet-specific label to use for matching? I'm wondering if this selector is going to be too wide. For example, suppose you have three separate manifests that each specify redis to be run, and you update just one of them: this selector will likely pick up the pods from all three manifests, right?

Contributor


Also, could you test whether just updating the size of the persistent storage causes all the pods to be deleted at once? I'm not sure a persistent storage size change will be reflected in the pod template…

Contributor Author

@indradhanush indradhanush Jul 6, 2018


Hm, don't pods have any statefulset specific label to use in matching?

No labels at the moment. The pods do have a reference to the StatefulSet in their OwnerReferences field, but I was unable to use that as part of the FieldSelector query; it appears OwnerReferences are not supported there at the moment. We could add labels while creating the StatefulSet, though.

I could use the OwnerReferences, but that would mean manually iterating over the list of pods and checking that field. In that case DeleteCollection won't work either, and I'd have to call Delete in a for loop.
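The OwnerReference-based alternative described here would look roughly like this. This is a self-contained sketch with stand-in types; the real code would list pods via the Kubernetes client and then call Delete on each match, since DeleteCollection cannot filter on OwnerReferences:

```go
package main

import "fmt"

// Stand-ins for the relevant Kubernetes API fields.
type OwnerReference struct {
	Kind string
	Name string
}

type Pod struct {
	Name            string
	OwnerReferences []OwnerReference
}

// podsOwnedBy filters pods by the StatefulSet recorded in their
// OwnerReferences. Because OwnerReferences cannot be used in a
// FieldSelector, this filtering has to happen client-side, and each
// matching pod would then be deleted individually.
func podsOwnedBy(pods []Pod, stsName string) []Pod {
	var owned []Pod
	for _, p := range pods {
		for _, ref := range p.OwnerReferences {
			if ref.Kind == "StatefulSet" && ref.Name == stsName {
				owned = append(owned, p)
				break
			}
		}
	}
	return owned
}

func main() {
	pods := []Pod{
		{Name: "redis-a-0", OwnerReferences: []OwnerReference{{Kind: "StatefulSet", Name: "redis-a"}}},
		{Name: "redis-b-0", OwnerReferences: []OwnerReference{{Kind: "StatefulSet", Name: "redis-b"}}},
	}
	for _, p := range podsOwnedBy(pods, "redis-a") {
		fmt.Println(p.Name) // only redis-a-0
	}
}
```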

Contributor Author


Also, could you test whether just updating the size of the persistent storage causes all the pods to be deleted at once? I'm not sure a persistent storage size change will be reflected in the pod template…

Already discussed on Slack, but commenting for posterity's sake: updating the size of persistent storage is prohibited for StatefulSet objects and would require deleting and recreating the StatefulSet instead.

Contributor


So, I had a deeper look at the habv1beta1.HabitatNameLabel: hab.Name selector you are using. I thought that hab.Name was something like redis or nginx, that is, a name used to construct the path to the user.toml file (/hab/user/redis/config/user.toml), so certainly something that is not unique. I was wrong: hab.Spec.V1beta2.Service.Name is used for that purpose, while hab.Name needs to be a unique name of a Habitat resource. hab.Name also becomes the unique name of the generated StatefulSet and is put into the pod template as habv1beta1.HabitatNameLabel, so it is not possible for two pods with the same value of the habv1beta1.HabitatNameLabel label to belong to two differently named StatefulSets. So the selector you used is correct.

I also had a look at StatefulSets, because I was wondering how a StatefulSet knows which pods belong to it if there is no StatefulSet-specific label on the pod to describe that relationship. For that, a StatefulSet has a pod selector, available under .Spec.Selector, which it can use to find all its pods. It is almost the same as the one you used. So I'd suggest passing the StatefulSet to the deleteStatefulSetPods function and using sts.Spec.Selector in the LabelSelector field of the listOptions variable. sts.Spec.Selector is a pointer, but there is no need to check for nil: the field must be specified in k8s >= 1.8 or validation would fail. See: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-selector
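The suggestion above, in sketch form. This is a self-contained illustration with stand-in types and hypothetical label values ("habitat", "habitat-name"); the real implementation would convert sts.Spec.Selector with metav1.LabelSelectorAsSelector and put the result in ListOptions.LabelSelector before calling DeleteCollection:

```go
package main

import "fmt"

// Stand-in for the MatchLabels part of a StatefulSet's .Spec.Selector.
type LabelSelector struct {
	MatchLabels map[string]string
}

// matches reports whether a pod's labels satisfy the selector, i.e.
// every selector key is present in the pod's labels with the same value.
func matches(sel LabelSelector, podLabels map[string]string) bool {
	for k, v := range sel.MatchLabels {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	// The selector the statefulset controller itself uses to find its
	// pods; the habitat-name value is hab.Name, unique per Habitat
	// resource, so it cannot match pods of another StatefulSet.
	sel := LabelSelector{MatchLabels: map[string]string{
		"habitat":      "true",
		"habitat-name": "redis-a",
	}}

	fmt.Println(matches(sel, map[string]string{"habitat": "true", "habitat-name": "redis-a"})) // true
	fmt.Println(matches(sel, map[string]string{"habitat": "true", "habitat-name": "redis-b"})) // false
}
```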

@indradhanush indradhanush force-pushed the indradhanush/delete-pods-on-sts-change branch from 96b88b1 to bb6631a Compare July 9, 2018 12:16
@indradhanush indradhanush force-pushed the indradhanush/delete-pods-on-sts-change branch from bb6631a to a202b1b Compare July 9, 2018 12:25
krnowak commented Jul 9, 2018

LFAD.

@krnowak krnowak merged commit d23a110 into master Jul 9, 2018
@indradhanush indradhanush deleted the indradhanush/delete-pods-on-sts-change branch July 9, 2018 14:40