Issue with the creation of the calico.vxlan interface #8228

Closed
erickuiper opened this issue Nov 23, 2021 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@erickuiper

Environment:

  • Cloud provider or hardware configuration:
    On Prem - KVM

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 3.10.0-1160.45.1.el7.x86_64 x86_64
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"
    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Version of Ansible (ansible --version):
    ansible 2.9.20

  • Version of Python (python --version):
    python version = 3.8.10

Kubespray version (commit) (git rev-parse --short HEAD):
v2.17.0

Anything else do we need to know:
We have run into an intermittent issue with Calico and Kubespray: the Ansible run to deploy a cluster on new machines succeeds, but workloads cannot communicate because nodelocaldns is unable to reach coredns. I traced this connectivity issue to a routing problem: traffic to 10.233/18 was routed through my default gateway. Comparing my routing table with other environments, I noticed that the calico.vxlan interface had not been created, so traffic could not be routed over it.
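
To illustrate why the missing interface sends this traffic to the default gateway, here is a minimal sketch of longest-prefix route selection. The routing tables, the `10.233.12.0/26` block, and the device names `eth0` and `vxlan.calico` are hypothetical values chosen for illustration, not taken from the affected cluster, and the function `select_route` is my own helper. Real kernel routing also considers metrics and policy rules, which are ignored here.

```python
import ipaddress


def select_route(routes, destination):
    """Return the (cidr, device) entry with the longest prefix matching destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), device)
        for cidr, device in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    network, device = max(candidates, key=lambda entry: entry[0].prefixlen)
    return str(network), device


# Hypothetical tables: without the VXLAN route only the default route matches.
broken = [("0.0.0.0/0", "eth0")]
healthy = broken + [("10.233.12.0/26", "vxlan.calico")]

print(select_route(broken, "10.233.12.5"))   # ('0.0.0.0/0', 'eth0')
print(select_route(healthy, "10.233.12.5"))  # ('10.233.12.0/26', 'vxlan.calico')
```

On a real node, the equivalent check would be to compare the output of `ip route` between a working and a failing environment.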

With the help of the Calico community, I identified that Kubespray doesn't set up the Calico daemonset with the environment variable FELIX_VXLANENABLED=true. This setting enables the creation of this interface. After I manually updated the daemonset with this environment variable, the interface was created and all problems were resolved.
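
The manual workaround amounts to adding one entry to the container's `env` list in the daemonset. The sketch below shows that change on a trimmed, hypothetical manifest held as a Python dict. The helper `with_container_env`, the daemonset and container name `calico-node`, the `kube-system` namespace, and `EXAMPLE_VAR` are assumptions for illustration; check the names in your own cluster before relying on them.

```python
import copy


def with_container_env(daemonset, container_name, name, value):
    """Return a copy of a DaemonSet manifest with an env var set on one container."""
    patched = copy.deepcopy(daemonset)
    containers = patched["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] != container_name:
            continue
        env = container.setdefault("env", [])
        for entry in env:
            if entry["name"] == name:
                # Overwrite an existing setting instead of adding a duplicate.
                entry.pop("valueFrom", None)
                entry["value"] = value
                break
        else:
            env.append({"name": name, "value": value})
        return patched
    raise KeyError(f"container {container_name!r} not found")


# Trimmed, hypothetical manifest; a real calico-node DaemonSet has many more fields.
daemonset = {
    "kind": "DaemonSet",
    "metadata": {"name": "calico-node", "namespace": "kube-system"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "calico-node", "env": [{"name": "EXAMPLE_VAR", "value": "x"}]},
    ]}}},
}

patched = with_container_env(daemonset, "calico-node", "FELIX_VXLANENABLED", "true")
print(patched["spec"]["template"]["spec"]["containers"][0]["env"])
```

The function copies the manifest rather than mutating it, so the original can still be compared against the patched version.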

The problem is that this issue occurs only once every few deployments. The Calico documentation states that this value is set automatically, but if the IPPool CRD is updated by Kubespray, that automatic enablement is bypassed. Either way, the automatic enablement does not work in some cases. I would therefore like to ask that this setting be included by default when VXLAN is enabled in Calico.

@cristicalin
Contributor

@ekuiper-sbp we set Calico's vxlanEnabled via FelixConfiguration here: https://github.com/kubernetes-sigs/kubespray/blob/master/roles/network_plugin/calico/tasks/install.yml#L163. Unfortunately, this was added after the 2.17 release.
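
For readers unfamiliar with the resource, here is a rough sketch of what a FelixConfiguration with `vxlanEnabled` looks like, built as a Python dict and printed as JSON. This is not the Kubespray task itself: the helper name `felix_vxlan_config` is hypothetical, and the use of `projectcalico.org/v3` with the `default` resource name follows the calicoctl resource format as I understand it. How Kubespray actually applies the setting is defined in the linked task and is not reproduced here.

```python
import json


def felix_vxlan_config(enabled=True):
    """Build a minimal FelixConfiguration resource that only sets vxlanEnabled."""
    return {
        "apiVersion": "projectcalico.org/v3",
        "kind": "FelixConfiguration",
        "metadata": {"name": "default"},
        "spec": {"vxlanEnabled": enabled},
    }


print(json.dumps(felix_vxlan_config(), indent=2))
```

Note that applying a full resource like this could overwrite other fields already set on the existing configuration, so patching only the `vxlanEnabled` field is likely the safer route.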

You can cherry-pick the specific change from the master branch (see #8167).

/cc @devinjeon, since you made the original patch, would you mind doing a backport to the release-2.17 branch?

/cc @floryut, one more candidate for a new minor tag on 2.17

@devinjeon
Contributor

devinjeon commented Nov 28, 2021

@cristicalin I opened a new PR, #8240, cherry-picking #8167 to the release-2.17 branch.

@oomichi
Contributor

oomichi commented Dec 1, 2021

With the above effort, I think we have fixed this issue on both the master and release-2.17 branches.
@ekuiper-sbp Please close this issue once you have confirmed that it has been solved.

@erickuiper
Author

Thanks for resolving this and merging it into 2.17.
