Improve multicast performance #1137

Merged: 2 commits, Dec 3, 2021

Conversation

@zhangzujian (Member) commented Dec 2, 2021

What type of PR is this?

  • Performance

This patch reverts PR #1127.

Improve multicast performance by not sending multicast packets to conntrack. The rule match=(eth.mcast), action=(next;) is added at priority 110 to tables ls_in_pre_lb and ls_out_pre_lb, as the following logical flows show:

Datapath: "multicast" (0369bfac-b6fa-4371-8879-49d65217d81f)  Pipeline: ingress
  table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.mcast), action=(next;)
  table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "multicast-ovn-cluster"), action=(next;)
  table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
  table=6 (ls_in_pre_lb       ), priority=100  , match=(ip), action=(reg0[2] = 1; next;)
  table=6 (ls_in_pre_lb       ), priority=0    , match=(1), action=(next;)
Datapath: "multicast" (0369bfac-b6fa-4371-8879-49d65217d81f)  Pipeline: egress
  table=0 (ls_out_pre_lb      ), priority=110  , match=(eth.mcast), action=(next;)
  table=0 (ls_out_pre_lb      ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
  table=0 (ls_out_pre_lb      ), priority=110  , match=(ip && outport == "multicast-ovn-cluster"), action=(next;)
  table=0 (ls_out_pre_lb      ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
  table=0 (ls_out_pre_lb      ), priority=100  , match=(ip), action=(reg0[2] = 1; next;)
  table=0 (ls_out_pre_lb      ), priority=0    , match=(1), action=(next;)
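
To make the effect of the new rule concrete, here is a minimal Python sketch of priority-based flow selection in ls_in_pre_lb. It is an illustration only, not OVN code: the `Flow` class, the packet dictionary, and `select_action` are hypothetical names, and only three of the flows listed above are modelled. It assumes that `eth.mcast` tests the multicast bit of the destination MAC, and that setting `reg0[2]` is what later leads to the packet being handed to conntrack.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, object]  # hypothetical packet model: {"eth_dst": str, "ip": bool}


@dataclass(frozen=True)
class Flow:
    priority: int
    match: Callable[[Packet], bool]
    action: str


def eth_mcast(pkt: Packet) -> bool:
    # The multicast/broadcast bit is the least-significant bit of the
    # first octet of the destination MAC address.
    first_octet = int(str(pkt["eth_dst"]).split(":")[0], 16)
    return bool(first_octet & 0x01)


def is_ip(pkt: Packet) -> bool:
    return bool(pkt.get("ip"))


# Subset of the ls_in_pre_lb flows listed above (illustrative only).
WITH_PATCH: List[Flow] = [
    Flow(110, eth_mcast, "next;"),
    Flow(100, is_ip, "reg0[2] = 1; next;"),
    Flow(0, lambda pkt: True, "next;"),
]
# The same table without the eth.mcast rule.
WITHOUT_MCAST_RULE: List[Flow] = WITH_PATCH[1:]


def select_action(table: List[Flow], pkt: Packet) -> str:
    # The highest-priority matching flow wins.
    for flow in sorted(table, key=lambda f: f.priority, reverse=True):
        if flow.match(pkt):
            return flow.action
    raise LookupError("no flow matched")


mcast_pkt = {"eth_dst": "01:00:5e:00:01:01", "ip": True}
print(select_action(WITHOUT_MCAST_RULE, mcast_pkt))
print(select_action(WITH_PATCH, mcast_pkt))
```

In this model the multicast packet hits the priority-100 `ip` flow when the `eth.mcast` rule is absent, and skips it once the priority-110 rule is present, while unicast IP traffic is unaffected.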

Performance Testing

Create a subnet and two iperf2 Pods:

switch f2956855-7e2f-4773-b05d-4bebc405e803 (multicast)
    port iperf-s.multicast
        addresses: ["00:00:00:B5:02:9A 172.17.0.2"]
    port iperf-c.multicast
        addresses: ["00:00:00:CF:72:92 172.17.0.3"]
    port multicast-ovn-cluster
        type: router
        router-port: ovn-cluster-multicast

OVN trace result after applying the patch:

root@master:/kube-ovn# ovn-trace --ct=new multicast "inport == \"iperf-c.multicast\" && eth.src == 00:00:00:CF:72:92 && ip4.src == 172.17.0.3 && eth.dst == 01:00:5E:00:01:01 && ip4.dst == 224.0.1.1"
# ip,reg14=0x3,vlan_tci=0x0000,dl_src=00:00:00:5e:88:37,dl_dst=01:00:5e:00:01:01,nw_src=172.17.0.3,nw_dst=224.0.1.1,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0

ingress(dp="multicast", inport="iperf-c.multicast")
---------------------------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:4837): inport == "iperf-c.multicast", priority 50, uuid e87f53ae
    next;
 6. ls_in_pre_lb (ovn-northd.c:5176): eth.mcast, priority 110, uuid d09ec031
    next;
 8. ls_in_acl_hint (ovn-northd.c:5379): !ct.trk, priority 5, uuid 2cd06df8
    reg0[8] = 1;
    reg0[9] = 1;
    next;
22. ls_in_l2_lkup (ovn-northd.c:7371): eth.mcast, priority 70, uuid d9fc74ec
    outport = "_MC_flood";
    output;

multicast(dp="multicast", mcgroup="_MC_flood")
----------------------------------------------

    egress(dp="multicast", inport="iperf-c.multicast", outport="iperf-c.multicast")
    -------------------------------------------------------------------------------
            /* omitting output because inport == outport && !flags.loopback */

    egress(dp="multicast", inport="iperf-c.multicast", outport="iperf-s.multicast")
    -------------------------------------------------------------------------------
         0. ls_out_pre_lb (ovn-northd.c:5178): eth.mcast, priority 110, uuid df3be655
            next;
         3. ls_out_acl_hint (ovn-northd.c:5379): !ct.trk, priority 5, uuid 71746e17
            reg0[8] = 1;
            reg0[9] = 1;
            next;
         9. ls_out_port_sec_l2 (ovn-northd.c:4950): eth.mcast, priority 100, uuid 7493f47e
            output;
            /* output to "iperf-s.multicast", type "" */

    egress(dp="multicast", inport="iperf-c.multicast", outport="multicast-ovn-cluster")
    -----------------------------------------------------------------------------------
         0. ls_out_pre_lb (ovn-northd.c:5178): eth.mcast, priority 110, uuid df3be655
            next;
         3. ls_out_acl_hint (ovn-northd.c:5379): !ct.trk, priority 5, uuid 71746e17
            reg0[8] = 1;
            reg0[9] = 1;
            next;
         9. ls_out_port_sec_l2 (ovn-northd.c:4950): eth.mcast, priority 100, uuid 7493f47e
            output;
            /* output to "multicast-ovn-cluster", type "patch" */

        ingress(dp="ovn-cluster", inport="ovn-cluster-multicast")
        ---------------------------------------------------------
         0. lr_in_admission (ovn-northd.c:9564): eth.mcast && inport == "ovn-cluster-multicast", priority 50, uuid c113673c
            xreg0[0..47] = 00:00:00:49:12:56;
            next;
         1. lr_in_lookup_neighbor (ovn-northd.c:9646): 1, priority 0, uuid 4c9d6761
            reg9[2] = 1;
            next;
         2. lr_in_learn_neighbor (ovn-northd.c:9655): reg9[2] == 1, priority 100, uuid f095be55
            next;
         3. lr_in_ip_input (ovn-northd.c:10753): ip4.mcast || ip6.mcast, priority 82, uuid 4ac8e946
            drop;
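
The eth.dst value in the trace command above follows the standard IPv4 multicast mapping (fixed prefix 01:00:5E plus the low 23 bits of the group address). The following Python sketch derives that MAC and assembles the match expression passed to ovn-trace. The helper names `ipv4_multicast_mac` and `build_trace_match` are hypothetical and exist only for this illustration; the sketch produces the expression string and does not invoke ovn-trace itself.

```python
import ipaddress


def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF  # only the low 23 bits are carried in the MAC
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02X}" for o in octets)


def build_trace_match(inport: str, eth_src: str, ip4_src: str, group: str) -> str:
    """Build a match expression in the form used by the trace command above."""
    return (
        f'inport == "{inport}" && eth.src == {eth_src} && ip4.src == {ip4_src} '
        f"&& eth.dst == {ipv4_multicast_mac(group)} && ip4.dst == {group}"
    )


print(build_trace_match("iperf-c.multicast", "00:00:00:CF:72:92",
                        "172.17.0.3", "224.0.1.1"))
```

The printed string still needs shell quoting (as in the escaped command above) before being passed as the last argument of ovn-trace.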

Performance comparison when the two Pods are on the same node (first run: before the patch; second run: after the patch):

/ $ iperf -c 226.94.1.1 -u -T 32 -t 10 -i 1 -b 10G
------------------------------------------------------------
Client connecting to 226.94.1.1, UDP port 5001
Sending 1470 byte datagrams, IPG target: 1.10 us (kalman adjust)
UDP buffer size:  256 KByte (default)
------------------------------------------------------------
[  1] local 172.17.0.13 port 42302 connected with 226.94.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec  46.2 MBytes   388 Mbits/sec
[  1] 1.00-2.00 sec  47.1 MBytes   395 Mbits/sec
[  1] 2.00-3.00 sec  46.8 MBytes   392 Mbits/sec
[  1] 3.00-4.00 sec  44.5 MBytes   373 Mbits/sec
[  1] 4.00-5.00 sec  47.4 MBytes   398 Mbits/sec
[  1] 5.00-6.00 sec  45.2 MBytes   379 Mbits/sec
[  1] 6.00-7.00 sec  45.8 MBytes   384 Mbits/sec
[  1] 7.00-8.00 sec  45.1 MBytes   378 Mbits/sec
[  1] 8.00-9.00 sec  41.1 MBytes   345 Mbits/sec
[  1] 9.00-10.00 sec  48.1 MBytes   403 Mbits/sec
[  1] 0.00-10.00 sec   457 MBytes   384 Mbits/sec
[  1] Sent 326171 datagrams
/ $ iperf -c 226.94.1.1 -u -T 32 -t 10 -i 1 -b 10G
------------------------------------------------------------
Client connecting to 226.94.1.1, UDP port 5001
Sending 1470 byte datagrams, IPG target: 1.10 us (kalman adjust)
UDP buffer size:  256 KByte (default)
------------------------------------------------------------
[  1] local 172.17.0.3 port 49802 connected with 226.94.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec   205 MBytes  1.72 Gbits/sec
[  1] 1.00-2.00 sec   229 MBytes  1.92 Gbits/sec
[  1] 2.00-3.00 sec   226 MBytes  1.89 Gbits/sec
[  1] 3.00-4.00 sec   210 MBytes  1.76 Gbits/sec
[  1] 4.00-5.00 sec   219 MBytes  1.84 Gbits/sec
[  1] 5.00-6.00 sec   224 MBytes  1.88 Gbits/sec
[  1] 6.00-7.00 sec   212 MBytes  1.78 Gbits/sec
[  1] 7.00-8.00 sec   214 MBytes  1.80 Gbits/sec
[  1] 8.00-9.00 sec   214 MBytes  1.79 Gbits/sec
[  1] 9.00-10.00 sec   193 MBytes  1.62 Gbits/sec
[  1] 0.00-10.00 sec  2.10 GBytes  1.80 Gbits/sec
[  1] Sent 1530462 datagrams

| Case | Bandwidth |
| --- | --- |
| Before | 384 Mbits/sec |
| After | 1.80 Gbits/sec |
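
As a sanity check on the same-node numbers, the summary bandwidth can be recomputed from the "Sent N datagrams" line and the 1470-byte datagram size reported by iperf. This is a small sketch with a hypothetical helper name; it assumes the bandwidth column uses decimal megabits, the transfer column uses binary megabytes, and the run lasted exactly 10 seconds.

```python
def mean_bandwidth_mbits(datagrams: int, payload_bytes: int, seconds: float) -> float:
    """Average payload bandwidth in Mbits/sec (1 Mbit = 10**6 bits)."""
    return datagrams * payload_bytes * 8 / seconds / 1e6


def transfer_mbytes(datagrams: int, payload_bytes: int) -> float:
    """Total payload in MBytes as assumed for the Transfer column (1 MByte = 2**20 bytes)."""
    return datagrams * payload_bytes / 2**20


# Datagram counts taken from the two same-node runs above.
for label, sent in (("before", 326171), ("after", 1530462)):
    print(label,
          round(transfer_mbytes(sent, 1470)), "MBytes,",
          round(mean_bandwidth_mbits(sent, 1470, 10.0)), "Mbits/sec")
```

Under these assumptions the recomputed values agree with the iperf summary lines after rounding.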

Performance comparison when the two Pods are on different nodes (first run: before the patch; second run: after the patch):

/ $ iperf -c 226.94.1.1 -u -T 32 -t 10 -i 1 -b 10G
------------------------------------------------------------
Client connecting to 226.94.1.1, UDP port 5001
Sending 1470 byte datagrams, IPG target: 1.10 us (kalman adjust)
UDP buffer size:  256 KByte (default)
------------------------------------------------------------
[  1] local 172.17.0.11 port 44717 connected with 226.94.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec  50.2 MBytes   421 Mbits/sec
[  1] 1.00-2.00 sec  54.9 MBytes   461 Mbits/sec
[  1] 2.00-3.00 sec  53.4 MBytes   448 Mbits/sec
[  1] 3.00-4.00 sec  48.3 MBytes   405 Mbits/sec
[  1] 4.00-5.00 sec  46.1 MBytes   387 Mbits/sec
[  1] 5.00-6.00 sec  49.5 MBytes   415 Mbits/sec
[  1] 6.00-7.00 sec  51.4 MBytes   431 Mbits/sec
[  1] 7.00-8.00 sec  52.8 MBytes   443 Mbits/sec
[  1] 8.00-9.00 sec  49.9 MBytes   419 Mbits/sec
[  1] 9.00-10.00 sec  58.4 MBytes   490 Mbits/sec
[  1] 0.00-10.00 sec   515 MBytes   432 Mbits/sec
[  1] Sent 367358 datagrams
/ $ iperf -c 226.94.1.1 -u -T 32 -t 10 -i 1 -b 10G
------------------------------------------------------------
Client connecting to 226.94.1.1, UDP port 5001
Sending 1470 byte datagrams, IPG target: 1.10 us (kalman adjust)
UDP buffer size:  256 KByte (default)
------------------------------------------------------------
[  1] local 172.17.0.9 port 36975 connected with 226.94.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec   103 MBytes   865 Mbits/sec
[  1] 1.00-2.00 sec   100 MBytes   839 Mbits/sec
[  1] 2.00-3.00 sec   111 MBytes   930 Mbits/sec
[  1] 3.00-4.00 sec   104 MBytes   875 Mbits/sec
[  1] 4.00-5.00 sec   109 MBytes   916 Mbits/sec
[  1] 5.00-6.00 sec   100 MBytes   839 Mbits/sec
[  1] 6.00-7.00 sec  94.8 MBytes   795 Mbits/sec
[  1] 7.00-8.00 sec   103 MBytes   862 Mbits/sec
[  1] 8.00-9.00 sec   104 MBytes   875 Mbits/sec
[  1] 9.00-10.00 sec   119 MBytes  1.00 Gbits/sec
[  1] 0.00-10.00 sec  1.02 GBytes   880 Mbits/sec
[  1] Sent 748058 datagrams

| Case | Bandwidth |
| --- | --- |
| Before | 432 Mbits/sec |
| After | 880 Mbits/sec |
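
For a quick read of the improvement, the speedup can be computed from the 10-second iperf summary lines of each run. The sketch below uses hypothetical helper names and simply normalizes the units before dividing.

```python
# Conversion factors to Mbits/sec for the units iperf prints.
UNIT_TO_MBITS = {"Kbits/sec": 1e-3, "Mbits/sec": 1.0, "Gbits/sec": 1e3}


def to_mbits(text: str) -> float:
    """Parse a string such as '1.80 Gbits/sec' into Mbits/sec."""
    value, unit = text.split()
    return float(value) * UNIT_TO_MBITS[unit]


def speedup(before: str, after: str) -> float:
    return to_mbits(after) / to_mbits(before)


# Summary bandwidths from the iperf runs above.
print(f"same node:       {speedup('384 Mbits/sec', '1.80 Gbits/sec'):.2f}x")
print(f"different nodes: {speedup('432 Mbits/sec', '880 Mbits/sec'):.2f}x")
```

With these figures the improvement is roughly 4.7x when both Pods share a node and roughly 2x across nodes. These are single 10-second runs, so the ratios are indicative rather than precise.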

@zhangzujian changed the title from "Feat/multicast" to "Improve multicast performance" Dec 2, 2021
@zhangzujian marked this pull request as ready for review December 3, 2021 04:16
@oilbeater (Collaborator) left a comment

Nice work!

@zhangzujian merged commit 90f62fd into master Dec 3, 2021
@zhangzujian deleted the feat/multicast branch December 3, 2021 05:31
@zhangzujian added the "performance" label (Anything that can make Kube-OVN faster) Dec 3, 2021