Kubernetes v1.12.x doesn't restore pod-checkpointer #1001

dghubble · 2018-10-10T08:39:04Z

Kubernetes v1.12.x doesn't currently work with the pod-checkpointer. In my exploration so far, bootstrapping a v1.12.1 cluster succeeds (after working around one known issue), and the pod-checkpointer checkpoints itself to /etc/kubernetes/manifests and moves the apiserver checkpoint to inactive. Normal so far.

For sanity's sake, the following work fine as well:

When I delete the checkpointed manifest, the running pod-checkpointer restores it
When I delete the checkpoint pod, it gets recreated (I believe from the manifest)

In the past, the "checkpoint" meant there was a 2nd pod running in a typical cluster.

kube-system   pod-checkpointer-2kflw                             (from DaemonSet)
kube-system   pod-checkpointer-2kflw-some-controller-node-name   (Pod with "checkpoint-of")

Starting in v1.12, only pod-checkpointer-2kflw exists. With verbosity turned up, the Kubelet on the controller continuously reports that:

Static pod "1b54a6b84faeeb51d981ca0b8930e18d" (pod-checkpointer-2kflw-some-controller-node-name/kube-system) does not have a corresponding mirror pod; skipping

This becomes a serious issue when power cycling the cluster. The Kubelet starts, reads static manifests from /etc/kubernetes/manifests (containing the checkpointed pod-checkpointer), and logs that it's skipping creation of the pod-checkpointer. As a result, the cluster does not return.

Static pod "1b54a6b84faeeb51d981ca0b8930e18d" (pod-checkpointer-2kflw-some-controller-node-name/kube-system) does not have a corresponding mirror pod; skipping

I'm still hunting for the upstream commit that may have altered handling for static/mirror pods.

rphillips · 2018-10-10T15:55:07Z

Hi Dalton! The following issue and PR might be relevant to this issue: kubernetes/kubernetes#69346 kubernetes/kubernetes#69566.

Are you using a kubelet < 1.11?

dghubble · 2018-10-10T16:18:56Z

The Kubelet matches the control plane version in my clusters.

dghubble · 2018-10-10T16:22:35Z

Iterating through the v1.12 pre-releases, it seems this started happening between v1.12.0-beta.2 and v1.12.0-rc.1 (comparison). v1.12.0-beta.2 doesn't bootstrap (due to various Kubernetes bugs), but it gets far enough to show the pod-checkpointer's checkpoint pod gets created (i.e. there are two pods). In v1.12.0-rc.1, the 2nd pod is not created and the Kubelet shows the error message I posted above, about "does not have a corresponding mirror pod".

rel:
https://github.com/poseidon/terraform-render-bootkube/branches
https://github.com/poseidon/typhoon/branches

That's as far as I've gotten. It might be beneficial to post a PR to bootkube attempting to bump to v1.12.1 to confirm others can repro the original issue. Beyond that, I suspect something within those 88 upstream commits.

dghubble · 2018-10-11T14:05:29Z

#1003 may be a better way to repro. No need to involve CoreDNS changes and re-vendoring (like #1002) when we want to discover the breakage.

rphillips · 2018-10-11T14:16:27Z

Agreed... I am getting:

predicate.go:133] Predicate failed on Pod: pod-checkpointer-l9cgg-172.17.4.101_kube-system(231cd4bc9c8f63b3131c1ec25716fe91), for reason: Predicate MatchNodeSelector failed

which looks like this kubernetes/kubernetes#65153 upstream issue.
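For readers unfamiliar with the predicate, here is a rough model of a nodeSelector match, written in Python for brevity. It is a simplification and not the Kubelet's actual implementation; the function name and the example label are hypothetical.

```python
def node_selector_matches(node_selector, node_labels):
    """Return True when every key/value pair in the pod's nodeSelector
    is present, with the same value, in the node's labels.

    Simplified model only: the real predicate also evaluates node affinity.
    """
    selector = node_selector or {}
    labels = node_labels or {}
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical label, used here purely for illustration.
selector = {"node-role.kubernetes.io/master": ""}

# A node whose labels are not (yet) known fails the match.
print(node_selector_matches(selector, {}))
# A node carrying the label passes.
print(node_selector_matches(selector, {"node-role.kubernetes.io/master": ""}))
```

Under this model, a pod with any nodeSelector cannot pass on a node that presents no labels, while a pod with no nodeSelector always passes.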

rphillips · 2018-10-11T14:29:13Z

Removing the nodeselector statements from both the checkpointer and apiserver checkpoint files restores the pods correctly.

dghubble · 2018-10-11T14:32:37Z

I see that as well if I delete the DaemonSet pod-checkpointer. The checkpointed pod can't schedule. It's a great tip; it's easier to see what's going on from a running cluster (rather than after power cycling). Comparing actual checkpointed pod manifests between a v1.11.3 cluster and a v1.12.1 cluster, I see a difference.

# Kubernetes v1.11.3
$ cat kube-system-pod-checkpointer-2kflw.json | jq . | grep affinity
Nothing here

# Kubernetes v1.12.1
$ cat kube-system-pod-checkpointer-bxk2m.json | jq . | grep affinity
There is an affinity block

"affinity": {
      "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
          "nodeSelectorTerms": [
            {
              "matchExpressions": null
            }
          ]
        }
      }
    },

Maybe related to kubernetes/kubernetes#68173 which was not in v1.12.0-beta.2 and first in v1.12.0-rc.1. Although I'm not sure how affinity applies during early bootstrapping.
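To spot this condition across several checkpoint files, a small script along these lines may help. It is only a sketch: the function name is made up, and it assumes the affinity block sits under spec.affinity as in a standard Pod manifest. As far as I understand the Kubernetes API documentation, a node selector term with no requirements matches no nodes, which would be consistent with the scheduling failure above.

```python
import json


def empty_affinity_terms(manifest):
    """Return the required nodeSelectorTerms that carry no requirements.

    A term counts as empty when both matchExpressions and matchFields
    are missing, null, or an empty list.
    """
    spec = manifest.get("spec") or {}
    node_affinity = (spec.get("affinity") or {}).get("nodeAffinity") or {}
    required = node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    terms = required.get("nodeSelectorTerms") or []
    return [
        term for term in terms
        if not (term or {}).get("matchExpressions")
        and not (term or {}).get("matchFields")
    ]


# Same shape as the affinity block found in the v1.12.1 checkpoint.
checkpoint = json.loads("""
{
  "spec": {
    "affinity": {
      "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
          "nodeSelectorTerms": [
            {"matchExpressions": null}
          ]
        }
      }
    }
  }
}
""")

print(len(empty_affinity_terms(checkpoint)), "empty term(s) found")
```

To check a real file, replace the inline JSON with json.load() on the checkpoint manifest path.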

dghubble · 2018-10-11T15:50:53Z

I suppose it is unusual for checkpoints to have a node selector or affinity at all, since they're pods on disk and should always run on that node. But looking at the checkpoint manifest in v1.11.3, those also had a nodeSelector, and cluster power cycles worked without issue. So I don't understand the report in kubernetes/kubernetes#65153.

I tried a similar experiment to yours: launching a v1.12.1 cluster, power cycling it, and then modifying the pod-checkpointer and apiserver checkpoint files to remove the affinity section. The cluster recovered. And the affinity section wasn't in checkpoint files previously.

Of course, as soon as the cluster recovers, the pod-checkpointer overwrites the checkpoint file to include an affinity again. So only one of the two pods is running, and I'd expect the same issue on the next power cycle.

Perhaps pod-checkpointer should strip the affinity from the manifest before writing to disk?
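As a rough illustration of that idea, sketched in Python rather than in the checkpointer's own code: the helper below returns a copy of a pod manifest with spec.affinity removed, leaving the input untouched. The function name and sample manifest are hypothetical, and this is not the actual pod-checkpointer change.

```python
import copy


def strip_affinity(manifest):
    """Return a deep copy of a pod manifest with spec.affinity removed."""
    stripped = copy.deepcopy(manifest)
    # pop() with a default is a no-op when the key is absent.
    (stripped.get("spec") or {}).pop("affinity", None)
    return stripped


# Hypothetical manifest, trimmed to the fields that matter here.
pod = {
    "metadata": {"name": "pod-checkpointer-example"},
    "spec": {
        "affinity": {"nodeAffinity": {}},
        "containers": [],
    },
}

to_write = strip_affinity(pod)
print(sorted(to_write["spec"].keys()))
```

Working on a copy means the in-memory pod object still reflects what the API server returned, and only the on-disk checkpoint differs.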

rphillips · 2018-10-11T15:54:43Z

I am thinking the affinity should be set to nil if matchExpressions == nil.
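Roughly what I have in mind, as a Python sketch with made-up function names (not the checkpointer's code): clear the affinity only when it carries no real requirement, and leave it alone otherwise.

```python
def has_real_requirements(affinity):
    """Return True if the affinity block expresses any actual constraint."""
    if not affinity:
        return False
    if affinity.get("podAffinity") or affinity.get("podAntiAffinity"):
        return True
    node = affinity.get("nodeAffinity") or {}
    if node.get("preferredDuringSchedulingIgnoredDuringExecution"):
        return True
    required = node.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    terms = required.get("nodeSelectorTerms") or []
    return any(
        (term or {}).get("matchExpressions") or (term or {}).get("matchFields")
        for term in terms
    )


def nil_empty_affinity(spec):
    """Set spec['affinity'] to None in place when it has no real requirements."""
    if "affinity" in spec and not has_real_requirements(spec["affinity"]):
        spec["affinity"] = None
    return spec


empty = {"affinity": {"nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [{"matchExpressions": None}]}}}}
print(nil_empty_affinity(empty))
```

Compared with stripping unconditionally, this keeps any affinity a user set on purpose, at the cost of a slightly more involved check.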

dghubble · 2018-10-11T17:27:28Z

Sounds reasonable to me.

I wonder if pod-checkpointer even supports checkpointing pods that have an affinity at all (pod-checkpointer and apiserver don't have one). Maybe we should also document that, to use pod-checkpointer, a pod manifest needs the checkpointer.alpha.coreos.com/checkpoint=true annotation and should not have any affinities.
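If that rule ends up in the docs, a pre-flight check could look something like the sketch below. The function name is hypothetical, and the "no affinity" condition reflects the proposal in this comment, not verified pod-checkpointer behavior.

```python
CHECKPOINT_ANNOTATION = "checkpointer.alpha.coreos.com/checkpoint"


def checkpoint_problems(manifest):
    """Return a list of reasons a pod manifest may not checkpoint cleanly."""
    problems = []
    annotations = (manifest.get("metadata") or {}).get("annotations") or {}
    if annotations.get(CHECKPOINT_ANNOTATION) != "true":
        problems.append("missing annotation " + CHECKPOINT_ANNOTATION + "=true")
    # Proposed rule from this thread: checkpointed pods should carry no affinity.
    if (manifest.get("spec") or {}).get("affinity"):
        problems.append("spec.affinity is set")
    return problems


# Hypothetical manifest for illustration.
pod = {
    "metadata": {"annotations": {CHECKPOINT_ANNOTATION: "true"}},
    "spec": {"affinity": {"nodeAffinity": {}}},
}
print(checkpoint_problems(pod))
```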

* Mount an empty dir for the controller-manager to work around kubernetes/kubernetes#68973
* Use a patched pod-checkpointer that strips affinity from checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced a default affinity that appears on checkpointed manifests; but it prevented scheduling and checkpointed pods should not have an affinity, they're run directly by the Kubelet on the local node
* kubernetes-retired/bootkube#1001
* kubernetes/kubernetes#68173

* Mount an empty dir for the controller-manager to work around kubernetes/kubernetes#68973
* Update coreos/pod-checkpointer to strip affinity from checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced a default affinity that appears on checkpointed manifests; but it prevented scheduling and checkpointed pods should not have an affinity, they're run directly by the Kubelet on the local node
* kubernetes-retired/bootkube#1001
* kubernetes/kubernetes#68173

dghubble · 2018-10-17T04:12:03Z

The issue with the pod-checkpointer was closed by #1009. Thanks @rphillips! The new image is quay.io/coreos/pod-checkpointer:018007e77ccd61e8e59b7e15d7fc5e318a5a2682.

It can be used with v1.12 or prior versions too; it isn't tied to v1.12. I'm closing this since actually upgrading to v1.12 is separate and is continuing in #1003.

dghubble mentioned this issue Oct 10, 2018

Update Kubernetes from v1.11.3 to v1.12.x poseidon/typhoon#301

Merged

rphillips mentioned this issue Oct 11, 2018

WIP: Bump v1.12.1 #1002

Closed

rphillips mentioned this issue Oct 11, 2018

Fix PodAntiAffinity issues in case of multiple affinityTerms kubernetes/kubernetes#68173

Merged

rphillips mentioned this issue Oct 11, 2018

checkpointer: ignore Affinity within podspec #1004

Closed

dghubble mentioned this issue Oct 14, 2018

Update Kubernetes from v1.11.3 to v1.12.x poseidon/terraform-render-bootstrap#77

Merged

dghubble mentioned this issue Oct 15, 2018

Kubelet does not launch static pods while waiting for bootstrapping kubernetes/kubernetes#68686

Closed

dghubble closed this as completed Oct 17, 2018
