Nodes unable to connect to services whose pods are scheduled on other nodes #1266
Comments
This seems to be related to an issue in VXLAN
Friendly bump. Has anyone figured out a workaround yet?
Don't use vxlan until the upstream issues are resolved?
Definitely, I switched to
@malikbenkirane by any chance, could you share the manifests/steps for setting up Calico?
After recent patching I found my cluster to be very unstable (pods unable to resolve DNS, etc.). Hopefully my 2+ days of troubleshooting can help someone here. If anyone has an idea for diagnostics I could run, please let me know.
I wrote a script that runs an nslookup (for google.com) and a curl against another pod and a service (both by IP), using kubectl exec against the pods of a daemonset. Here are my findings.
Environment: v1.17.7+k3s1 on AWS AMI-2, flannel/cni, no firewalld.
Initial script run (all fine: DNS resolves, service IP reachable, pod IP reachable):
Rebooted worker8245d and immediately ran my ping script:
Yep, the whole cluster is not happy. After a few minutes:
No matter how long I waited, worker8245d did not recover until I restarted k3s on worker8245d. I waited a few seconds and ran the script again: back to normal.
Edit: This seems to be a problem on AWS. I tried Red Hat 8 and also v1.18.4+k3s1 and got the same behaviour: if a node reboots, I have to log in after the reboot and restart k3s-agent. I don't see the same on Hetzner Cloud using CentOS 8.
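For anyone who wants to run a similar check, here is a minimal sketch of the kind of script described in this comment. It is not the original script, which was not posted. It assumes the daemonset image ships `nslookup` and `curl`, and the pod name, namespace, IPs, and port are hypothetical placeholders you would replace with values from your own cluster.

```python
import subprocess
from typing import Callable, List, Sequence


def build_checks(pod: str, namespace: str, dns_name: str,
                 pod_ip: str, service_ip: str,
                 port: int = 80, timeout_s: int = 3) -> List[List[str]]:
    """Build the three `kubectl exec` commands to run inside one pod:
    a DNS lookup, a curl to another pod's IP, and a curl to a service IP."""
    base = ["kubectl", "exec", "-n", namespace, pod, "--"]
    curl = ["curl", "-sS", "-o", "/dev/null", "--max-time", str(timeout_s)]
    return [
        base + ["nslookup", dns_name],
        base + curl + [f"http://{pod_ip}:{port}/"],
        base + curl + [f"http://{service_ip}:{port}/"],
    ]


def run_checks(commands: Sequence[Sequence[str]],
               runner: Callable = subprocess.run) -> List[bool]:
    """Run each command and return True for every one that exits with 0.
    A timeout or a missing kubectl binary counts as a failure."""
    results = []
    for cmd in commands:
        try:
            proc = runner(list(cmd), capture_output=True, text=True, timeout=30)
            results.append(proc.returncode == 0)
        except (subprocess.TimeoutExpired, OSError):
            results.append(False)
    return results


def report(pod: str, results: Sequence[bool]) -> str:
    """Format one line per pod, e.g. 'netcheck-abc dns=ok pod=FAIL svc=FAIL'."""
    labels = ("dns", "pod", "svc")
    parts = [f"{name}={'ok' if ok else 'FAIL'}" for name, ok in zip(labels, results)]
    return " ".join([pod] + parts)
```

To use it, call `build_checks` once for each pod of the daemonset (one pod per node), pass the result to `run_checks`, and print `report(pod, results)`. Running it before and after rebooting a worker should show which node loses DNS, pod-to-pod, or service connectivity.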
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
Version:
k3s version v0.9.0 (65d8764) to k3s version v1.17.0+k3s.1 (0f64465)
Describe the bug
Since v0.9.0, nodes and pods with hostNetwork: true have been unable to connect to services whose selected pods are scheduled on other nodes. v0.8.1 is unaffected.
To Reproduce
From a node, or from a pod with hostNetwork: true, attempt to connect to a service IP (e.g. kube-dns).
Expected behavior
Actual behavior
Additional context
kubectl get nodes
kubectl get svc -A
kubectl get pods -A -o wide