In the same cluster of Submariner, my non-gateway node vxlan interface cannot ping the gateway node's vxlan interface. #3134

Closed
JacobLi11 opened this issue Aug 20, 2024 · 7 comments
Labels: bug, need-info, support

@JacobLi11

What happened:
In the same cluster of Submariner, my non-gateway node vxlan interface cannot ping the gateway node's vxlan interface.

What you expected to happen:
I expected these two nodes to be able to ping each other.
How to reproduce it (as minimally and precisely as possible):

export SERVER_IP="192.168.3.36"
kind create cluster --config - <<EOF
kind: Cluster
name: broker
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.7.0.0/16"
  serviceSubnet: "10.77.0.0/16"
EOF

kind create cluster --config - <<EOF
kind: Cluster
name: c1
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.8.0.0/16"
  serviceSubnet: "10.88.0.0/16"
EOF

kind create cluster --config - <<EOF
kind: Cluster
name: c2
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.9.0.0/16"
  serviceSubnet: "10.99.0.0/16"
EOF
subctl --context kind-broker deploy-broker
subctl --context kind-c1 join broker-info.subm --clusterid c1
subctl --context kind-c2 join broker-info.subm --clusterid c2

Anything else we need to know?:
(image)

(image) Only request packets are seen, with no reply. I also deployed it on AWS EKS and had the same problem.

Environment:

  • Diagnose information (use subctl diagnose all):
  • Gather information (use subctl gather):
  • Gathering information from cluster "kind-c2"
    ✓ Gathering connectivity resources
    ✓ Gathering CNI data from 2 pods matching label selector "app=submariner-routeagent"
    ✓ Gathering CNI data from 1 pods matching label selector "app=submariner-gateway"
    ✓ Gathering cable driver data from 1 pods matching label selector "app=submariner-gateway"
    ✓ Found 2 endpoints in namespace "submariner-operator"
    ✓ Found 2 clusters in namespace "submariner-operator"
    ✓ Found 1 gateways in namespace "submariner-operator"
    ✓ Found 0 clusterglobalegressips in namespace ""
    ✓ Found 0 globalegressips in namespace ""
    ✓ Found 0 globalingressips in namespace ""
    ✓ Gathering connectivity logs
    ✓ Found 1 pods matching label selector "app=submariner-gateway"
    ✓ Found 2 pods matching label selector "app=submariner-routeagent"
    ✓ Found 1 pods matching label selector "app=submariner-metrics-proxy"
    ✓ Found 0 pods matching label selector "app=submariner-globalnet"
    ✓ Found 0 pods matching label selector "app=submariner-addon"
    ✓ Gathering service-discovery resources
    ✓ Found 2 serviceexports in namespace ""
    ✓ Found 4 serviceimports in namespace ""
    ✓ Found 2 endpointslices by label selector "endpointslice.kubernetes.io/managed-by=lighthouse-agent.submariner.io" in namespace ""
    ✓ Found 1 configmaps by label selector "component=submariner-lighthouse" in namespace "submariner-operator"
    ✓ Found 1 configmaps by field selector "metadata.name=coredns" in namespace "kube-system"
    ✓ Found 0 services by label selector "submariner.io/exportedServiceRef" in namespace ""
    ✓ Gathering service-discovery logs
    ✓ Found 3 pods matching label selector "component=submariner-lighthouse"
    ✓ Found 2 pods matching label selector "k8s-app=kube-dns"
    ✓ Gathering broker resources
    ✓ Found 2 endpoints in namespace "submariner-k8s-broker"
    ✓ Found 2 clusters in namespace "submariner-k8s-broker"
    ✓ Found 2 endpointslices by label selector "endpointslice.kubernetes.io/managed-by=lighthouse-agent.submariner.io" in namespace "submariner-k8s-broker"
    ✓ Found 2 serviceimports in namespace "submariner-k8s-broker"
    ✓ Gathering broker logs
    ✓ Gathering operator resources
    ✓ Found 1 submariners in namespace "submariner-operator"
    ✓ Found 1 servicediscoveries in namespace "submariner-operator"
    ✓ Found 1 deployments by field selector "metadata.name=submariner-operator" in namespace "submariner-operator"
    ✓ Found 1 daemonsets by label selector "app=submariner-gateway" in namespace "submariner-operator"
    ✓ Found 1 daemonsets by label selector "app=submariner-metrics-proxy" in namespace "submariner-operator"
    ✓ Found 1 daemonsets by label selector "app=submariner-routeagent" in namespace "submariner-operator"
    ✓ Found 0 daemonsets by label selector "app=submariner-globalnet" in namespace "submariner-operator"
    ✓ Found 1 deployments by label selector "app=submariner-lighthouse-agent" in namespace "submariner-operator"
    ✓ Found 1 deployments by label selector "app=submariner-lighthouse-coredns" in namespace "submariner-operator"
    ✓ Gathering operator logs
    ✓ Found 1 pods matching label selector "name=submariner-operator"
    Files are stored under directory "submariner-20240820141854/kind-c2"
  • Cloud provider or hardware configuration:
  • Install tools:
    subctl
  • Others:
JacobLi11 added the bug label on Aug 20, 2024
@JacobLi11
Author

There are only ICMP requests and no replies.
I think the two ends should be able to communicate normally, but I don't know why my ICMP requests get no reply.

@JacobLi11
Author

JacobLi11 commented Aug 20, 2024

root@c2-worker:/# ip -d link show vx-submariner
10: vx-submariner: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 36:e4:24:36:bc:3a brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 100 srcport 0 0 dstport 4800 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
root@c2-worker:/# bridge fdb show dev vx-submariner
00:00:00:00:00:00 dst 172.18.0.5 self permanent
00:00:00:00:00:00 dst 172.18.0.6 self permanent
root@c2-worker:/#
This is my interface information and fdb table
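
For anyone reproducing this, the remote VTEPs can be pulled out of the fdb output above with a small filter. This is only a sketch: the helper name `fdb_remote_vteps` is made up for illustration, and it assumes the `bridge fdb show` line format shown above (MAC, `dst`, remote IP).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Submariner): print the remote VTEP IPs
# found in `bridge fdb show dev <iface>` output read from stdin.
fdb_remote_vteps() {
  # The field right after the literal "dst" token is the remote VTEP address.
  awk '{ for (i = 1; i < NF; i++) if ($i == "dst") print $(i + 1) }'
}

# Usage on a node (needs iproute2):
#   bridge fdb show dev vx-submariner | fdb_remote_vteps
```

Run against the two fdb entries above, this would list 172.18.0.5 and 172.18.0.6, which can then be compared with the node IPs reported by `kubectl get nodes -o wide`.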

@JacobLi11
Author

-A SUBMARINER-POSTROUTING -s 240.0.0.0/8 -o vx-submariner -j SNAT --to-source 10.9.0.1

When I delete this iptables rule, everything works fine. I don't quite understand why this rule is needed. Doesn't it make the VXLAN tunnel encapsulation ineffective?
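
To check whether a given source address would be rewritten by the rule quoted above, the match can be sketched in plain shell. This is an illustration only: `in_240_slash_8` and `snat_to_source` are hypothetical helper names, and the idea that the vx-submariner interface carries a 240.x.x.x address is an assumption that should be confirmed with `ip addr show vx-submariner`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the IPv4 address is inside 240.0.0.0/8,
# i.e. its first octet is 240 (the -s match of the rule).
in_240_slash_8() {
  [ "${1%%.*}" = "240" ]
}

# Hypothetical helper: print the --to-source value of each SNAT rule read
# from stdin (input in `iptables -S` format).
snat_to_source() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "--to-source") print $(i + 1) }'
}

# Usage on a node (requires root; 240.18.0.6 is a made-up example address):
#   iptables -t nat -S SUBMARINER-POSTROUTING | snat_to_source
#   in_240_slash_8 240.18.0.6 && echo "source matches the SNAT rule"
```

If the ping is sourced from a 240.x.x.x address, the rule would rewrite that source to 10.9.0.1 before the packet leaves through vx-submariner, which would be consistent with replies appearing only after the rule is deleted.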

@yboaron
Contributor

yboaron commented Aug 21, 2024

A. Submariner implements the egress part for inter-cluster traffic and lets the CNI handle the ingress direction (after IPsec decryption).

A.1 So, for podA@non_gw_node@clusterA to communicate with podB@non_gw_node@clusterB,
Submariner will handle podA@non_gw_node --> gw_node@clusterA (via the vx-submariner interface) --> IPsec tunnel to the remote cluster

A.2 The CNI should forward the packet to podB@non_gw_node@clusterB

B.

-A SUBMARINER-POSTROUTING -s 240.0.0.0/8 -o vx-submariner -j SNAT --to-source 10.9.0.1

This rule is used to support communication from HostNetwork pods (which use the node's IP address) to the remote cluster, so the source IP address is SNATed to the node's CNI interface IP.

C. Did you try checking inter-cluster connectivity between the clusters? You can use subctl verify.

@JacobLi11
Copy link
Author

@yboaron Thank you for your answer~
My traffic here is not cross-cluster traffic, but traffic within this cluster. I want to check connectivity between the vx-submariner interface on the non-gateway node and the one on the gateway node, and I found that the two interfaces cannot ping each other.

@yboaron
Contributor

yboaron commented Aug 26, 2024

Submariner implements inter-cluster connectivity. By design, Submariner handles the egress part while the CNI is expected to handle ingress. As you noticed, connectivity between vx-submariner interfaces on different nodes in the same cluster therefore does not work, and this is expected.

Can you elaborate on why you are checking connectivity within a cluster via the vx-submariner interfaces? Is this for troubleshooting an inter-cluster data path connectivity issue?

@JacobLi11
Author

I have this requirement in my own use case, but I already know the solution. Thank you, I will close this issue.
