containerd is not upgraded because it is not restarted #9019

Closed
snowmansora opened this issue Jun 21, 2022 · 5 comments · Fixed by #9322
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@snowmansora

Environment:

  • Version of Ansible (ansible --version):
    ansible==4.10.0
    ansible-core==2.11.12
  • Version of Python (python --version):
    2.7.5

Kubespray version (commit) (git rev-parse --short HEAD):
v2.19.0

Command used to invoke ansible:
ansible-playbook upgrade-cluster.yml

Anything else we need to know:
On a cluster deployed using kubespray-v2.18.0, after running ansible-playbook upgrade-cluster.yml using kubespray-v2.19.0, containerd is still running the old version, as shown by kubectl get nodes -o wide. After restarting the containerd service, the new version is used.
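
For reference, the version check and the manual workaround mentioned above (standard commands; node access assumed):

# Runtime version reported per node (CONTAINER-RUNTIME column):
kubectl get nodes -o wide

# Manual workaround, run on each affected node:
sudo systemctl restart containerd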

The problem is that the notify: restart containerd handler is never run, despite being notified multiple times in https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/containerd/tasks/main.yml.

After some troubleshooting, the problem appears to be caused by the import_role in https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/validate-container-engine/tasks/main.yml#L90: after I changed it to include_role, the problem went away.
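
For illustration, here is the shape of that one-keyword change (simplified; the real task and its when conditions are at the URL above):

# roles/container-engine/validate-container-engine/tasks/main.yml (simplified)

# Before: static import. Any when: conditions on this task are copied
# onto the imported role's tasks and handlers at parse time.
- import_role:
    name: container-engine/containerd

# After: dynamic include. Conditions only gate the include itself, so
# the role's handlers are loaded without them.
- include_role:
    name: container-engine/containerd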

My GUESS is that the when conditions are added to the containerd handlers when the role is imported via import_role in https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/validate-container-engine/tasks/main.yml#L90, and since my upgrade doesn't match those conditions (e.g. container_manager != "containerd"), the handler is never run... What confusing Ansible behaviour...
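
A toy reproduction of that guess (the demo role and the echo are made up; only the structure mirrors kubespray): the role is first imported under a false condition, the condition sticks to the handler definition, and later notifications are then skipped.

# roles/demo/handlers/main.yml
- name: restart containerd
  command: echo restarted          # stand-in for the real restart

# roles/demo/tasks/main.yml
- name: change containerd config
  command: /bin/true
  notify: restart containerd

# playbook.yml
- hosts: all
  vars:
    container_manager: containerd
  tasks:
    # First load: static import under a condition that is false here;
    # the condition is copied onto the handler definition as well.
    - import_role:
        name: demo
      when: container_manager != "containerd"
    # Second load: the role runs for real and notifies the handler, but
    # the handler loaded first still carries the false condition, so the
    # restart is skipped.
    - import_role:
        name: demo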

Also, if you just run ansible-playbook cluster.yml on a fresh node, you will see that the handler is never run either, but containerd will still be running because of https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/containerd/tasks/main.yml#L100. I tried that with ansible==5.7.1, ansible-core==2.12.6, and Python 3.10, and the problem happens there as well, so this doesn't look like old Ansible behaviour.
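
The task behind that last link is, roughly, a start-and-enable service task; a sketch (the exact task name and fields may differ from the repo):

# roles/container-engine/containerd/tasks/main.yml (approximate)
- name: containerd | Ensure containerd is started and enabled
  systemd:
    name: containerd
    enabled: true
    state: started    # no-op if the old binary is already running,
                      # which is why a fresh install looks fine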

@snowmansora added the kind/bug label on Jun 21, 2022
@cloud-66
Contributor

cloud-66 commented Aug 1, 2022

I also had this problem today. Even when I change containerd preferences and run cluster.yml --tags=containerd, the config changes on the nodes, but containerd is not restarted. I have to restart it manually for the new options in the config to take effect.

@fungusakafungus
Contributor

Can confirm. Using debugger: always on the handler shows that the handler is skipped because it was loaded earlier with a false condition:

RUNNING HANDLER [container-engine/containerd : restart containerd] *************************************************************************************************************************************************************************************************************
Friday 23 September 2022  16:42:37 +0200 (0:00:00.181)       0:01:18.666 ******
skipping: [node-2.company.com]
[node-2.company.com] HANDLER: container-engine/containerd : restart containerd (debug)> p task.when
['not (is_ostree or (ansible_distribution == "Flatcar Container Linux by '
 'Kinvolk") or (ansible_distribution == "Flatcar"))',
 'container_manager != "containerd"',
 'docker_installed.matched == 0',
 'containerd_installed.matched > 0',
 "ansible_facts.services[service_name]['state'] == 'running'"]
[node-2.company.com] HANDLER: container-engine/containerd : restart containerd (debug)> User interrupted execution
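
To reproduce this inspection, the debugger keyword can be set on the handler itself; at the prompt, p task.when prints the conditions attached to it (temporary local edit, sketched here):

# Temporarily added to the containerd role's handlers file:
- name: restart containerd
  command: /bin/true
  debugger: always    # enter the Ansible task debugger whenever this
                      # handler is reached, including when it skips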

@rptaylor
Contributor

rptaylor commented Sep 23, 2022

My GUESS is that the when conditions are added to the containerd handlers when import_role

That is exactly how import_* works. (See also #9279, https://serverfault.com/questions/875247/whats-the-difference-between-include-tasks-and-import-tasks)

since my upgrade doesn't match those conditions (e.g. container_manager != "containerd"

That part seems less clear: why is container_manager != "containerd"? Since #8439 was added, it looks like this would cause containerd to be uninstalled if container_manager != "containerd".

In any case, the last handler of a given name should override earlier ones, so the handlers notified normally in https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/containerd/tasks/main.yml should override the one from https://github.com/kubernetes-sigs/kubespray/blob/v2.19.0/roles/container-engine/validate-container-engine/tasks/main.yml#L90,
unless there is an Ansible bug...

@rptaylor
Contributor

It's also slightly odd that the restart containerd handler is a no-op that just notifies other handlers:
https://github.com/kubernetes-sigs/kubespray/blob/master/roles/container-engine/containerd/handlers/main.yml
Probably the other handlers should use listen instead.
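
Simplified, the current chain and the suggested listen-based alternative look like this (handler bodies abbreviated; see the linked file for the real ones):

# Current pattern (simplified): a no-op "topic" handler that fans out
# to the handlers doing the actual work.
- name: restart containerd
  command: /bin/true
  notify:
    - Containerd | restart containerd

- name: Containerd | restart containerd
  systemd:
    name: containerd
    state: restarted

# Suggested alternative: drop the no-op and have the real handler
# subscribe to the topic via listen.
- name: Containerd | restart containerd
  systemd:
    name: containerd
    state: restarted
  listen: restart containerd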

It should still work, though, but this indirection might be part of the corner case that explains why this is hitting some obscure bug (I looked through Ansible issues but didn't find one that seemed pertinent to this).

@rptaylor
Contributor

"the location where it is inserted doesn't affect when the handlers are added"
ansible/ansible#78871 (comment)
