
problem encountered while using HostNetworkDNSPolicy #92276

Closed
david-enli opened this issue Jun 18, 2020 · 11 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
sig/network: Categorizes an issue or PR as relevant to SIG Network.
triage/needs-information: Indicates an issue needs more information in order to work on it.

Comments

@david-enli

Hi all, first time posting and apologies if I'm doing it wrong. Since the release of k8s 1.16.10, our Pure Storage CSI node plugin pods for iSCSI, deployed by a DaemonSet, have had trouble talking to Kubernetes services.

We deploy a CSI node plugin pod on each master and worker node. However, it looks like network routing on the worker nodes is problematic. The masters resolve our internal k8s service correctly:

# nslookup pso-db-public.nstk
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name:      pso-db-public.nstk
Address 1: 10.102.98.125 pso-db-public.nstk.svc.cluster.local

but the worker nodes show the problem; the lookup never gets past the server line:

/ # nslookup pso-db-public.nstk
Server:    10.96.0.10

Here are our DaemonSet pod network and DNS settings:

hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
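
For illustration, here is a minimal sketch of how those two fields sit in the DaemonSet manifest; the name, image, and labels below are placeholders, not our actual plugin spec:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-node-plugin   # placeholder name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: csi-node-plugin
  template:
    metadata:
      labels:
        app: csi-node-plugin
    spec:
      hostNetwork: true                    # pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet   # but still resolve cluster services via kube-dns
      tolerations:
        - operator: Exists                 # run on master nodes despite taints
      containers:
        - name: plugin
          image: example.com/csi-node-plugin:v1.0   # placeholder image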

Since we first discovered this issue, we've been hitting it in 1.17.1, 1.17.5, and 1.17.6. We're still trying out more versions, but it looks like this hasn't been fixed since 1.16.10. Thank you very much for your help.

@david-enli david-enli added the kind/bug Categorizes issue or PR as related to a bug. label Jun 18, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 18, 2020
@david-enli
Author

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 18, 2020
@athenabot

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added the triage/unresolved Indicates an issue that can not or will not be resolved. label Jun 18, 2020
@wangyira

/assign @rikatz

@rikatz
Contributor

rikatz commented Jun 29, 2020

Hey @david-enli let's take a look into that.

I didn't understand the first part of the issue: when you run nslookup from the nodes, do they fail to resolve the address correctly?

Also, can you please provide some further information:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin (CNI) and version (if this is a network-related bug):
  • Others:
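
If it helps, most of this can be gathered with the following (run where you have kubectl access and on an affected node, respectively):

# cluster and client versions
kubectl version
# OS and kernel details from an affected node
cat /etc/os-release
uname -a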

Thank you

@rikatz
Contributor

rikatz commented Jun 29, 2020

/triage needs-information

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Jun 29, 2020
@rikatz
Contributor

rikatz commented Jul 1, 2020

@david-enli friendly ping

@athenabot

@rikatz
If this issue has been triaged, please comment /remove-triage unresolved.

If you aren't able to handle this issue, consider unassigning yourself and/or adding the help-wanted label.

🤖 I am a bot run by vllry. 👩‍🔬

@rikatz
Contributor

rikatz commented Jul 6, 2020

/remove-triage unresolved

@k8s-ci-robot k8s-ci-robot removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Jul 6, 2020
@david-enli
Author

david-enli commented Jul 6, 2020

hi @rikatz I sincerely apologize for the delay; this was logged using a different git account and I missed your previous pings. In the meantime, a team member came across the open issue flannel-io/flannel#1243, and we have adopted one of the solutions provided in its comments: logging into the node and turning off checksum offload on the flannel device by running ethtool -K flannel.1 tx-checksum-ip-generic off.
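
For anyone else hitting this, a rough sketch of how we check and apply the workaround on an affected node (this assumes flannel in VXLAN mode, so the flannel.1 interface exists):

# inspect the current offload setting on the flannel VXLAN interface
ethtool -k flannel.1 | grep tx-checksum-ip-generic

# disable checksum offload (workaround from flannel-io/flannel#1243);
# note the setting does not persist across reboots or interface re-creation
ethtool -K flannel.1 tx-checksum-ip-generic off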

This issue has been present for us since 1.16.10, in both Ubuntu and CentOS environments. The network plugin is flannel. Again, apologies if this is not the right place for the issue.

@rikatz
Contributor

rikatz commented Jul 6, 2020

@david-enli no problem :D

I was trying to figure out whether this was also related to #88986, and it seems to be. There's a PR for this already merged (#92035), along with a good explanation of why it happens.

As this is a dup and already solved, I'll close this issue, but please feel free to reopen if you think there's anything else to deal with.

Tks

/close

@k8s-ci-robot
Contributor

@rikatz: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
