Fix Calico install-cni crashloop on Pod restarts #724

Merged: dghubble merged 1 commit into master from calico-install-cni-selinux on May 9, 2020

@dghubble commented May 9, 2020

@dghubble changed the title from "Fix Calico node crash loop on Pod restart" to "Fix Calico install-cni crashloop on Pod restarts" on May 9, 2020
* Set a consistent MCS level/range for Calico install-cni
* Note: Rebooting a node was a workaround, because Kubelet
relabels /etc/kubernetes(/cni/net.d)

Background:

* On SELinux enforcing systems, the Calico CNI install-cni
container ran with the default SELinux context and a random
MCS category pair. install-cni places CNI configs by first
creating a temporary file and then moving it into place, so
the file's MCS categories depend on the container's SELinux
context.
* A calico-node Pod restart creates a new install-cni container
with a different MCS pair that cannot access the previously
written file (install-cni places configs on every start), so
the init container errors and calico-node crash loops.
* https://github.com/projectcalico/cni-plugin/issues/874

```
mv: inter-device move failed: '/calico.conf.tmp' to '/host/etc/cni/net.d/10-calico.conflist'; unable to remove target: Permission denied
Failed to mv files. This may be caused by selinux configuration on the host, or something else.
```

Note, this isn't a host SELinux configuration issue.
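
The MCS mismatch described above can be illustrated with a simplified model. The sketch below is not how SELinux evaluates policy. It assumes only the basic MCS rule that a process may access a file when the process holds every category on the file's label, and that the container runtime picks two random categories from c0 to c1023 per container. The function names and the fixed level `s0:c0,c1` are hypothetical, chosen for illustration rather than taken from this change.

```python
import random


def random_mcs_pair(rng):
    """Pick two distinct categories from c0..c1023 (assumed runtime behavior)."""
    return frozenset(rng.sample(range(1024), 2))


def label(categories):
    """Render a category set as an MCS level string such as 's0:c12,c345'."""
    return "s0:" + ",".join(f"c{c}" for c in sorted(categories))


def can_access(process_categories, file_categories):
    """Simplified MCS check: the process must hold every category on the file."""
    return file_categories <= process_categories


rng = random.Random()

# First install-cni run: the config file inherits the container's categories.
first_run = random_mcs_pair(rng)
file_label = first_run

# Pod restart: a new install-cni container gets a different random pair.
second_run = random_mcs_pair(rng)
while second_run == first_run:
    second_run = random_mcs_pair(rng)

print(label(first_run), "->", label(file_label), can_access(first_run, file_label))
print(label(second_run), "->", label(file_label), can_access(second_run, file_label))

# With a consistent (hypothetical) level, every run can replace the earlier file.
fixed = frozenset({0, 1})
fixed_file_label = fixed
print(label(fixed), "->", label(fixed_file_label), can_access(fixed, fixed_file_label))
```

In this model the second check is False for any two distinct pairs, which mirrors the `Permission denied` error above. Pinning one level for install-cni makes the check succeed across restarts.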

Related:

* poseidon/terraform-render-bootstrap#186
@dghubble force-pushed the calico-install-cni-selinux branch from 0fcddbb to 358854e on May 9, 2020 23:02
@dghubble merged commit 358854e into master on May 9, 2020
@dghubble deleted the calico-install-cni-selinux branch on May 10, 2020 00:38
dghubble added a commit that referenced this pull request Nov 25, 2020
* Enable Calico MTU auto-detection
* Remove [workaround](#724) to
Calico cni-plugin [issue](https://github.com/projectcalico/cni-plugin/issues/874)

Rel: poseidon/terraform-render-bootstrap#230
dghubble-robot pushed a commit to poseidon/terraform-aws-kubernetes that referenced this pull request Nov 30, 2020
dghubble-robot pushed a commit to poseidon/terraform-digitalocean-kubernetes that referenced this pull request Nov 30, 2020
dghubble-robot pushed a commit to poseidon/terraform-azure-kubernetes that referenced this pull request Nov 30, 2020
dghubble-robot pushed a commit to poseidon/terraform-onprem-kubernetes that referenced this pull request Nov 30, 2020
dghubble-robot pushed a commit to poseidon/terraform-google-kubernetes that referenced this pull request Nov 30, 2020
Snaipe pushed a commit to aristanetworks/monsoon that referenced this pull request Apr 13, 2023