[flexible-ipam] ARP request leaks cause network issue #5451

Closed
gran-vmv opened this issue Aug 29, 2023 · 7 comments · Fixed by #5657
Labels
action/backport Indicates a PR that requires backports. area/ipam Issues or PRs related to IP address management (IPAM). kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@gran-vmv
Contributor

gran-vmv commented Aug 29, 2023

Describe the bug
The e2e test case below takes too long or fails.
PASS: TestClusterIPv4/HostNetwork_Endpoints/Connect_to_Service_ClusterIP_from_Pod (15.53s)

To Reproduce
Run the above e2e test case in flexible-ipam mode.

Expected
The test case passes quickly.

Actual behavior
The test case takes too long or fails.

Versions:
Latest main.

Additional context
In this test case, a Pod accesses a Service ClusterIP backed by hostNetwork Endpoints.
When the reply packet leaves the host network, the host sends an ARP request to resolve the source Pod's MAC from the source Pod's IP.
The SHA of this ARP request is the antrea-gw0 MAC and the SPA is the host uplink IP. The request goes from antrea-gw0 to the OVS bridge, where it is broadcast to all ports.
When other hosts receive this ARP request, they record the SHA-SPA pair in their local ARP tables. The pair does not actually work, so it causes L2 packet loss between the two hosts for several seconds.

Key points

  1. The ARP request's SHA is the antrea-gw0 MAC, but its SPA is the host uplink IP
  2. The ARP request is broadcast to the uplink
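
To make the two key points concrete, here is a minimal sketch of what a peer host does when it overhears the leaked request. It assumes the peer applies the RFC 826 merge rule (update an existing entry for the sender IP). Real Linux neighbor handling has more states, so treat this as a model rather than the kernel's behavior. The addresses come from the capture posted later in this thread; the function and variable names are made up for illustration.

```python
# Illustrative model only: a dict stands in for a peer host's ARP table.
UPLINK_MAC = "00:50:56:a7:d5:6e"   # ens192 on the affected Node
GATEWAY_MAC = "5e:53:2c:da:b2:4d"  # antrea-gw0 on the affected Node
NODE_IP = "192.168.240.10"         # uplink IP of the affected Node


def receive_arp_request(arp_table, sha, spa):
    """Apply the RFC 826 merge step: refresh an existing entry for spa."""
    if spa in arp_table:
        arp_table[spa] = sha
    return arp_table


# The peer starts with the correct mapping for the affected Node.
peer_table = {NODE_IP: UPLINK_MAC}

# The leaked broadcast pairs the gateway MAC (SHA) with the uplink IP (SPA).
receive_arp_request(peer_table, sha=GATEWAY_MAC, spa=NODE_IP)

# The peer now sends frames for the Node IP to a MAC it cannot reach on the uplink.
poisoned = peer_table[NODE_IP] != UPLINK_MAC
print(peer_table[NODE_IP], "poisoned" if poisoned else "ok")
```

In this model the entry stays wrong until something refreshes it, which is consistent with the several seconds of loss described above.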

Solution candidates

  1. Use ARPResponder for all "regular Pods" to reply to the ARP request.
    Cons: testOVSRestartSameNode will report arping loss during OVS restart.
  2. Use GroupTypeAll for local "regular Pods" and use a group action to process the ARP request. (How would the group buckets be updated?)
    Cons: Complex DP change.
  3. Set no-flood on the uplink port, and output to the uplink before the normal action for ARP packets from local FlexibleIPAM Pods.
    Cons: Some DP change.
  4. Set net.ipv4.conf.antrea-gw0.arp_announce=1
    Cons: Cannot prevent a user from generating such an ARP request manually: arping -c 1 -s <node_ip> -I antrea-gw0 <pod_ip>

Risk, lowest to highest: 4 < 1 < 3 < 2

@luolanzone
Contributor

luolanzone commented Aug 31, 2023

@gran-vmv could you post the possible solutions for this bug here? so we can involve @tnqn @wenyingd to estimate the changes. Thanks.

@gran-vmv
Contributor Author

> @gran-vmv could you post the possible solutions for this bug here? so we can involve @tnqn @wenyingd to estimate the changes. Thanks.

Updated the solution candidates. Feel free to update the description if we come up with new ideas.

@tnqn tnqn added area/ipam Issues or PRs related to IP address management (IPAM). priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Oct 30, 2023
@tnqn tnqn added this to the Antrea v1.15 release milestone Oct 30, 2023
@tnqn
Member

tnqn commented Oct 30, 2023

I think this is a serious issue affecting the basic availability of AntreaIPAM mode. Although it's triggered by a specific Service test case, it could happen easily as long as a Node or a hostNetwork Pod tries to reach a local Pod (IP allocated from podCIDR of the Node) via its primary IP.

I have validated the issue can be reproduced via the following steps:

  1. Create a Pod without any special annotation on a Node, to ensure it gets an IP allocated from the podCIDR of the Node.
  2. Access the Node's primary IP from the Pod. The Node then loses connectivity with all other Nodes because of the ARP broadcast.
3330: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:50:56:a7:d5:6e brd ff:ff:ff:ff:ff:ff
    inet 192.168.240.10/24 brd 192.168.240.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fd02:f0::a/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fea7:d56e/64 scope link
       valid_lft forever preferred_lft forever
2: ens192~: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 00:50:56:a7:d5:6e brd ff:ff:ff:ff:ff:ff
    inet6 fe80::250:56ff:fea7:d56e/64 scope link
       valid_lft forever preferred_lft forever
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether aa:8f:ff:b6:91:13 brd ff:ff:ff:ff:ff:ff
4: antrea-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 5e:53:2c:da:b2:4d brd ff:ff:ff:ff:ff:ff
    inet 192.168.248.1/24 brd 192.168.248.255 scope global antrea-gw0
       valid_lft forever preferred_lft forever
    inet6 fe80::5c53:2cff:feda:b24d/64 scope link
       valid_lft forever preferred_lft forever

Packets captured on antrea-gw0 of the Node. Note that the source MAC of the ARP request belongs to antrea-gw0 while the source IP belongs to ens192:

02:47:26.848117 96:8f:b9:fb:2d:c3 > 5e:53:2c:da:b2:4d, ethertype IPv4 (0x0800), length 98: 192.168.248.59 > 192.168.240.10: ICMP echo request, id 5888, seq 0, length 64
02:47:26.848152 5e:53:2c:da:b2:4d > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.248.59 tell 192.168.240.10, length 28
02:47:26.848291 96:8f:b9:fb:2d:c3 > 5e:53:2c:da:b2:4d, ethertype ARP (0x0806), length 42: Reply 192.168.248.59 is-at 96:8f:b9:fb:2d:c3, length 28
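
The mismatch in the capture can be checked mechanically. The sketch below uses only the Python standard library. It tests whether the "tell" address of an ARP request (the SPA) falls inside a subnet configured on the interface that sent it. The helper name is hypothetical and the addresses are the ones shown above.

```python
import ipaddress


def spa_in_interface_subnets(spa, interface_cidrs):
    """Return True if spa belongs to any subnet configured on the interface."""
    addr = ipaddress.ip_address(spa)
    return any(
        addr in ipaddress.ip_interface(cidr).network for cidr in interface_cidrs
    )


# Addresses configured on antrea-gw0 in the output above.
GW_ADDRS = ["192.168.248.1/24"]

# SPA seen in the capture ("tell 192.168.240.10") belongs to ens192, not antrea-gw0.
leaky = not spa_in_interface_subnets("192.168.240.10", GW_ADDRS)

# A request sourced from the gateway IP would pass the same check.
gateway_ok = spa_in_interface_subnets("192.168.248.1", GW_ADDRS)

print("leaky request:", leaky, "| gateway-sourced request ok:", gateway_ok)
```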

Ping this Node from another Node: the Node can no longer be reached after step 2.

$ ping 192.168.240.10
PING 192.168.240.10 (192.168.240.10) 56(84) bytes of data.
64 bytes from 192.168.240.10: icmp_seq=1 ttl=64 time=0.218 ms
64 bytes from 192.168.240.10: icmp_seq=2 ttl=64 time=0.262 ms
64 bytes from 192.168.240.10: icmp_seq=3 ttl=64 time=0.133 ms
^C
--- 192.168.240.10 ping statistics ---
22 packets transmitted, 3 received, 86% packet loss, time 21505ms
rtt min/avg/max/mdev = 0.133/0.204/0.262/0.054 ms

@luolanzone luolanzone added the action/backport Indicates a PR that requires backports. label Oct 30, 2023
@tnqn tnqn mentioned this issue Oct 30, 2023
@gran-vmv
Contributor Author

gran-vmv commented Nov 1, 2023

@luolanzone Could you schedule a meeting with @tnqn and @wenyingd to discuss the solution?

@tnqn
Member

tnqn commented Nov 1, 2023

This behavior can be controlled via arp_announce. Can we set arp_announce of antrea-gw0 to 1 to avoid using an IP outside the subnet as the source of ARP requests? https://sysctl-explorer.net/net/ipv4/arp_announce/
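
For readers unfamiliar with arp_announce, the following is a simplified model of how the level affects the source IP of an ARP request, based on the description in the kernel's ip-sysctl documentation. It is a sketch under assumptions, not the kernel algorithm. It only considers addresses on the outgoing interface and ignores the kernel's fallback to other interfaces. The function name and argument shapes are invented for illustration.

```python
import ipaddress


def select_arp_source(level, packet_src, target, out_if_addrs):
    """Pick the ARP source IP under a simplified arp_announce model.

    out_if_addrs: CIDR strings configured on the outgoing interface,
    primary address first (an assumption of this sketch).
    """
    if level not in (0, 1, 2):
        raise ValueError("arp_announce must be 0, 1 or 2")
    if level == 0:
        # Level 0: any local address is acceptable, so keep the packet's source.
        return packet_src

    target_ip = ipaddress.ip_address(target)
    ifaces = [ipaddress.ip_interface(a) for a in out_if_addrs]
    # Subnets on the outgoing interface that contain the target.
    matching = [i for i in ifaces if target_ip in i.network]

    if level == 1:
        # Level 1: keep the packet's source only if it shares a subnet with the target.
        src = ipaddress.ip_address(packet_src)
        if any(src in i.network for i in matching):
            return packet_src

    # Level 2, or level 1 fallback: prefer an address in the target's subnet.
    if matching:
        return str(matching[0].ip)
    return str(ifaces[0].ip) if ifaces else packet_src


GW_ADDRS = ["192.168.248.1/24"]  # antrea-gw0 in the output above
before = select_arp_source(0, "192.168.240.10", "192.168.248.59", GW_ADDRS)
after = select_arp_source(1, "192.168.240.10", "192.168.248.59", GW_ADDRS)
print("arp_announce=0 ->", before, "| arp_announce=1 ->", after)
```

With the addresses from the capture, level 0 keeps the uplink IP as the source, while level 1 falls back to the gateway IP because the uplink IP is not in the target's subnet on antrea-gw0.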

@gran-vmv
Contributor Author

gran-vmv commented Nov 1, 2023

> This behavior can be controlled via arp_announce. Can we set arp_announce of antrea-gw0 to 1 to avoid using an IP outside the subnet as the source of ARP requests? https://sysctl-explorer.net/net/ipv4/arp_announce/

Thanks. It works. Added solution candidate 4 to the description.

@gran-vmv
Contributor Author

gran-vmv commented Nov 2, 2023

Per today's discussion, we'll use method 4 to fix this issue:

  1. Set net.ipv4.conf.antrea-gw0.arp_announce=1
  2. Remove arpSpoofGuardFlow(f.nodeConfig.NodeIPv4Addr.IP, gatewayMAC, f.gatewayPort)
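
As a rough illustration of the first step, the sysctl can be applied by writing to the interface's entry under /proc/sys. This is not the Antrea implementation (which is written in Go). It is a minimal Python sketch with hypothetical helper names, and the proc_root parameter exists only so the demo can run against a temporary directory instead of the real /proc/sys.

```python
import tempfile
from pathlib import Path


def arp_announce_path(interface, proc_root="/proc/sys"):
    """Build the sysctl file path for net.ipv4.conf.<interface>.arp_announce."""
    return Path(proc_root) / "net" / "ipv4" / "conf" / interface / "arp_announce"


def set_arp_announce(interface, value, proc_root="/proc/sys"):
    """Write the arp_announce level (0, 1 or 2) for one interface."""
    if value not in (0, 1, 2):
        raise ValueError("arp_announce must be 0, 1 or 2")
    path = arp_announce_path(interface, proc_root)
    path.write_text(f"{value}\n")
    return path


def get_arp_announce(interface, proc_root="/proc/sys"):
    """Read back the current arp_announce level for one interface."""
    return int(arp_announce_path(interface, proc_root).read_text().strip())


# Demo against a throwaway directory tree that mimics /proc/sys (no root needed).
demo_root = tempfile.mkdtemp()
demo_file = arp_announce_path("antrea-gw0", demo_root)
demo_file.parent.mkdir(parents=True)
demo_file.write_text("0\n")  # pretend the kernel default is in place

set_arp_announce("antrea-gw0", 1, proc_root=demo_root)
demo_value = get_arp_announce("antrea-gw0", proc_root=demo_root)
print("net.ipv4.conf.antrea-gw0.arp_announce =", demo_value)
```

On a real Node the same write would target /proc/sys and require root privileges.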

gran-vmv added a commit to gran-vmv/antrea that referenced this issue Nov 2, 2023
Fix antrea-io#5451

Signed-off-by: gran <gran@vmware.com>
gran-vmv added a commit to gran-vmv/antrea that referenced this issue Nov 3, 2023
Fix antrea-io#5451
Set arp_announce to 1 on Linux platform to make the ARP requests from host to gateway
interface always use gateway IP as source IP. These ARP requests without gateway IP will
be dropped by ARP SpoofGuard flow.

Signed-off-by: gran <gran@vmware.com>
gran-vmv added a commit to gran-vmv/antrea that referenced this issue Nov 7, 2023
Fix antrea-io#5451
Set arp_announce to 1 on Linux platform to make the ARP requests sent on the gateway
interface always use the gateway IP as the source IP, otherwise the ARP requests would be
dropped by ARP SpoofGuard flow.

Signed-off-by: gran <gran@vmware.com>
@tnqn tnqn closed this as completed in #5657 Nov 7, 2023
tnqn pushed a commit that referenced this issue Nov 7, 2023
Fix #5451
Set arp_announce to 1 on Linux platform to make the ARP requests sent on the gateway
interface always use the gateway IP as the source IP, otherwise the ARP requests would be
dropped by ARP SpoofGuard flow.

Signed-off-by: gran <gran@vmware.com>