Azure with netvsc DPDK driver (v2.89)

Malcolm Bumgardner edited this page Mar 26, 2021 · 7 revisions

netvsc is the new model in Azure that replaces failsafe. It offers the same performance and VF acceleration, and adds multi-core support.

Build Ubuntu VM in Azure

az vm create --resource-group rgXYZ --name TrexUbuntuAN --image Canonical:UbuntuServer:18_04-lts-gen2:18.04.202010140 --size Standard_D16ds_v4 --admin-username azureuser --admin-password trexTesting --nics ANLinux_eth0_NIC ANLinux_eth1_NIC ANLinux_eth2_NIC
Note
Do not enable Accelerated Networking (AN) on eth0/management; this cuts down on mapping confusion with the MLX devices. We may want to add it in the future to reduce interrupts from eth0 that may land on the TREX cores, which run at 100%.
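
The note above implies that only the eth1 and eth2 NICs are created with Accelerated Networking. As a rough sketch, the helper below prints (it does not run) the `az network nic create` calls for those two NICs. The resource group and NIC names follow the `az vm create` call above; the VNet name passed in and the `subnet_eth1`/`subnet_eth2` subnet names are hypothetical placeholders, and `print_nic_commands` is an illustrative helper, not part of any tool.

```shell
# Print the az commands that would create the two data-plane NICs with
# Accelerated Networking enabled. Nothing is executed against Azure.
# $1: resource group, $2: VNet name (hypothetical in the usage example).
print_nic_commands() {
    rg=$1
    vnet=$2
    for idx in 1 2; do
        # The subnet names are placeholders; substitute the real ones.
        echo "az network nic create --resource-group $rg --name ANLinux_eth${idx}_NIC" \
             "--vnet-name $vnet --subnet subnet_eth${idx} --accelerated-networking true"
    done
}
```

Calling `print_nic_commands rgXYZ trexVnet` lets you review the commands before pasting them into a shell.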

Follow Azure DPDK setup steps

  • The Ubuntu Azure kernel provides the best network performance on Azure

sudo add-apt-repository ppa:canonical-server/dpdk-azure -y
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get dist-upgrade
sudo apt-get install -y librdmacm-dev librdmacm1 build-essential libnuma-dev libmnl-dev
sudo apt install ibverbs-utils

lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic

Setup huge pages on reboot

sudo vi /etc/default/grub
# default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never
# GRUB_CMDLINE_LINUX=" default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never "


cat /etc/default/grub
    # If you change this file, run 'update-grub' afterwards to update
    # /boot/grub/grub.cfg.
    # For full documentation of the options in this file, see:
    #   info -f grub -n 'Simple configuration'

    GRUB_DEFAULT=0
    GRUB_TIMEOUT_STYLE=hidden
    GRUB_TIMEOUT=0
    GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
    GRUB_CMDLINE_LINUX=" default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never"

    # Uncomment to enable BadRAM filtering, modify to suit your needs
    # This works with Linux (no patch required) and with any kernel that obtains
    # the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)

    #GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"
    # Uncomment to disable graphical terminal (grub-pc only)

    #GRUB_TERMINAL=console

    # The resolution used on graphical terminal
    # note that you can use only modes which your graphic card supports via VBE
    # you can see them in real GRUB with the command `vbeinfo'
    #GRUB_GFXMODE=640x480

    # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
    #GRUB_DISABLE_LINUX_UUID=true


    # Uncomment to disable generation of recovery mode menu entries
    #GRUB_DISABLE_RECOVERY="true"


    # Uncomment to get a beep at grub start
    #GRUB_INIT_TUNE="480 440 1"


sudo update-grub
sudo vi /etc/fstab
# nodev /mnt/huge hugetlbfs defaults 0 0


cat /etc/fstab
    # CLOUD_IMG: This file was created/modified by the Cloud Image build process
    UUID=8c0a4742-2f51-40b4-b659-357cfb0bb2a3       /        ext4   defaults,discard        0 0
    UUID=5BCE-FF6A  /boot/efi       vfat    defaults,discard        0 0
    nodev /mnt/huge hugetlbfs defaults 0 0
    /dev/disk/cloud/azure_resource-part1    /mnt    auto    defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig      0  2

Load Azure drivers on reboot

sudo vi /etc/modules-load.d/modules.conf
# ib_uverbs
# mlx4_ib
# mlx5_ib


cat /etc/modules-load.d/modules.conf
    # /etc/modules: kernel modules to load at boot time.
    #
    # This file contains the names of kernel modules that should be loaded
    # at boot time, one per line. Lines beginning with "#" are ignored.
    ib_uverbs
    mlx4_ib
    mlx5_ib

Reboot VM

sudo reboot

After reboot of VM

Validate that huge pages are reserved and the InfiniBand drivers are loaded

cat /proc/meminfo | grep Huge
lsmod | grep ib_uverbs
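
The huge page check above can also be scripted so that it fails loudly when the reservation is missing. This is a minimal sketch, assuming the GRUB setting `hugepages=8` from earlier on this page; `hugepages_total` and `have_hugepages` are hypothetical helper names, and the optional file argument exists only so the functions can be tried against a saved copy of `/proc/meminfo`.

```shell
# Print the HugePages_Total value from a meminfo-style file.
# $1: file to read (defaults to /proc/meminfo).
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed only when at least $1 huge pages are reserved.
# $2: file to read (defaults to /proc/meminfo).
have_hugepages() {
    total=$(hugepages_total "${2:-/proc/meminfo}")
    [ "${total:-0}" -ge "$1" ]
}
```

For example, `have_hugepages 8 || echo "huge pages not reserved"` prints a warning when fewer than 8 pages are available.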

The NETVSC PMD is supported (instead of Failsafe/TAP) in TREX v2.89, which includes DPDK 21.02

Note
MLX4/CX-3 was disabled in the latest TREX with DPDK 21.02. It seems it could be re-enabled, since the driver is still supported in DPDK 21.02.

For hv_netvsc interfaces, you need to load uio_hv_generic and unbind the eth1/eth2 interfaces from the kernel so that TREX can bind to them.

Example script added below to assist with this process.

On Azure, TREX needs to be configured and built against the Azure-installed RDMA-CORE libraries.

Setup for build with TREX

cd ~
sudo apt-get install -y python3-distutils
sudo apt install zlib1g-dev

Build TREX with the Azure-installed RDMA-CORE libraries instead of an OFED install

git clone https://github.com/cisco-system-traffic-generator/trex-core.git
cd trex-core
cd linux_dpdk/


./b configure --no-ofed-check

./b build

Unbind eth1 and eth2 from kernel

cd ..
cd scripts/

cat ./azure_trex_setup.sh
    #!/bin/bash

    sudo modprobe uio_hv_generic

    NET_UUID="f8615163-df3e-46c5-913f-f2d2f965ed0e"
    echo $NET_UUID | sudo tee /sys/bus/vmbus/drivers/uio_hv_generic/new_id

    DEV_UUID=$(basename $(readlink /sys/class/net/eth1/device))
    echo $DEV_UUID | sudo tee /sys/bus/vmbus/drivers/hv_netvsc/unbind
    echo $DEV_UUID | sudo tee /sys/bus/vmbus/drivers/uio_hv_generic/bind

    DEV_UUID=$(basename $(readlink /sys/class/net/eth2/device))
    echo $DEV_UUID | sudo tee /sys/bus/vmbus/drivers/hv_netvsc/unbind
    echo $DEV_UUID | sudo tee /sys/bus/vmbus/drivers/uio_hv_generic/bind

./azure_trex_setup.sh

f8615163-df3e-46c5-913f-f2d2f965ed0e
000d3a9b-73fd-000d-3a9b-73fd000d3a9b
000d3a9b-73fd-000d-3a9b-73fd000d3a9b
000d3a9b-7b26-000d-3a9b-7b26000d3a9b
000d3a9b-7b26-000d-3a9b-7b26000d3a9b
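
To confirm that the rebind took effect, one option is to list the device UUIDs that sysfs reports as bound to a given driver. This is a minimal sketch, assuming bound devices appear as UUID-named symlinks inside the driver directory; `bound_vmbus_devices` is a hypothetical helper, and its optional second argument exists only so it can be pointed at a fake sysfs tree.

```shell
# Print the VMBus device UUIDs currently bound to a driver.
# $1: driver name (for example uio_hv_generic or hv_netvsc).
# $2: sysfs root (defaults to /sys).
bound_vmbus_devices() {
    driver_dir="${2:-/sys}/bus/vmbus/drivers/$1"
    for entry in "$driver_dir"/*; do
        # Skip plain files such as bind, unbind and new_id.
        [ -L "$entry" ] || continue
        name=$(basename "$entry")
        # Keep only names shaped like a UUID (8-4-4-4-12); this drops
        # other symlinks such as "module".
        case "$name" in
            ????????-????-????-????-????????????) echo "$name" ;;
        esac
    done
}
```

After the setup script has run, `bound_vmbus_devices uio_hv_generic` should list the same two UUIDs that the script echoed.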

Create trex_cfg.yaml for the system

Update trex_cfg.yaml with the MLX PCI addresses and the UUIDs of the hv_netvsc interfaces. The IP addresses and gateways below are specific to this VM setup and are only examples.

lspci
2180:00:02.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] (rev 80)
c7e1:00:02.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] (rev 80)
cat /etc/trex_cfg.yaml
- version: 2
  interfaces: ['2180:00:02.0', 'c7e1:00:02.0']
  ext_dpdk_opt: ['--vdev=net_vdev_netvsc,ignore=0', '--vdev=net_vdev_netvsc,ignore=0']
  interfaces_vdevs : ['000d3a9b-73fd-000d-3a9b-73fd000d3a9b','000d3a9b-7b26-000d-3a9b-7b26000d3a9b']
  rx_desc : 1024
  tx_desc : 1024
  port_bandwidth_gb : 10
  port_speed : 10000
  port_info:
      - ip: 10.90.23.101
        default_gw: 10.90.23.202
      - ip: 10.90.130.101
        default_gw: 10.90.130.202

  platform:
      master_thread_id: 0
      latency_thread_id: 2
      dual_if:
        - socket: 0
          threads: [4, 6, 8, 10]

Turn off TSO for MLX devices

sudo ethtool -K enP51169s2 tso off gro off gso off
sudo ethtool -K enP8576s3 tso off gro off gso off
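
In this example, the interface names appear to embed the PCI domain from the `lspci` output in decimal: 0x2180 is 8576 and 0xc7e1 is 51169. The sketch below does that conversion so the matching `enP...` interface can be found on another VM. It is based only on the names seen on this page, so treat the naming pattern as an assumption; `pci_domain_decimal` is a hypothetical helper.

```shell
# Convert the PCI domain of an address such as 2180:00:02.0 into the
# decimal number that follows "enP" in the interface names shown above.
pci_domain_decimal() {
    domain=${1%%:*}              # text before the first colon, e.g. 2180
    printf '%d\n' "0x$domain"    # interpret it as hexadecimal
}
```

For example, `ls /sys/class/net | grep "enP$(pci_domain_decimal 2180:00:02.0)s"` would look for the interface that belongs to the first VF.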

Run TREX

With the NETVSC PMD in Azure, TREX is not limited to “-c 1”

cd ~/trex-core/scripts
sudo ./t-rex-64 -i -c 2 -v 7 --no-ofed-check

Run tui/bench

stty cols 111 rows 45
cd ~/trex-core/scripts
./trex-console
trex> tui
tui> start -f stl/bench.py -m 800kpps --port 0 1 --force -t size=1514
TUI output
Global Statistics

connection   : localhost, Port 4501                       total_tx_L2  : 19.43 Gbps
version      : STL @ v2.89                                total_tx_L1  : 19.69 Gbps
cpu_util.    : 27.76% @ 2 cores (2 per dual port)         total_rx     : 19.43 Gbps
rx_cpu_util. : 0.0% / 0 pps                               total_pps    : 1.6 Mpps
async_util.  : 0% / 161.14 bps                            drop_rate    : 0 bps
total_cps.   : 0 cps                                      queue_full   : 0 pkts

Port Statistics

   port    |         0         |         1         |       total
-----------+-------------------+-------------------+------------------
owner      |         azureuser |         azureuser |
link       |                UP |                UP |
state      |      TRANSMITTING |      TRANSMITTING |
speed      |           10 Gb/s |           10 Gb/s |
CPU util.  |            27.76% |            27.76% |
--         |                   |                   |
Tx bps L2  |         9.71 Gbps |         9.71 Gbps |        19.43 Gbps
Tx bps L1  |         9.84 Gbps |         9.84 Gbps |        19.69 Gbps
Tx pps     |       802.08 Kpps |       802.08 Kpps |          1.6 Mpps
Line Util. |           98.43 % |           98.43 % |
---        |                   |                   |
Rx bps     |         9.71 Gbps |         9.71 Gbps |        19.43 Gbps
Rx pps     |       802.07 Kpps |       802.07 Kpps |          1.6 Mpps
----       |                   |                   |
opackets   |          13079888 |          13079888 |          26159776
ipackets   |          13079887 |          13079901 |          26159788
obytes     |       19802950432 |       19802950432 |       39605900864
ibytes     |       19802948918 |       19802970114 |       39605919032
tx-pkts    |       13.08 Mpkts |       13.08 Mpkts |       26.16 Mpkts
rx-pkts    |       13.08 Mpkts |       13.08 Mpkts |       26.16 Mpkts
tx-bytes   |           19.8 GB |           19.8 GB |          39.61 GB
rx-bytes   |           19.8 GB |           19.8 GB |          39.61 GB
-----      |                   |                   |
oerrors    |                 0 |                 0 |                 0
ierrors    |                 0 |                 0 |                 0
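
The per-port rates in the TUI output can be cross-checked from the packet rate and frame size. The sketch below assumes that the L1 figure adds 20 bytes per frame (preamble plus inter-frame gap) to the L2 figure; that assumption reproduces the numbers shown above. The function names are illustrative only.

```shell
# L2 bit rate: packets per second x frame bytes x 8 bits.
l2_bps() {
    echo $(( $1 * $2 * 8 ))
}

# L1 bit rate: as L2, plus an assumed 20 bytes of per-frame overhead.
l1_bps() {
    echo $(( $1 * ($2 + 20) * 8 ))
}

# Line utilisation in percent. $1: bit rate, $2: link speed in bits/s.
line_util() {
    awk -v bps="$1" -v speed="$2" 'BEGIN { printf "%.2f\n", 100 * bps / speed }'
}
```

With 802080 pps and 1514-byte frames, `l2_bps 802080 1514` gives 9714792960 (9.71 Gbps), `l1_bps 802080 1514` gives 9843125760 (9.84 Gbps), and `line_util 9843125760 10000000000` gives 98.43, in line with the port statistics.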

Verify the NetVSC xstats to ensure that most of the traffic is received and transmitted through the MLX VF

trex>stats --xz
Xstats

            Name:              |     Port 0:     |     Port 1:
-------------------------------+-----------------+----------------
rx_good_packets                |      1091960901 |      1091939757
tx_good_packets                |      1092492287 |      1092492431
rx_good_bytes                  |   1648860943830 |   1648829019170
tx_good_bytes                  |   1649663337420 |   1649663554860
rx_missed_errors               |               0 |               0
rx_errors                      |               0 |               0
tx_errors                      |               0 |               0
rx_mbuf_allocation_errors      |               0 |               0
rx_q0_packets                  |          144891 |               3
rx_q0_bytes                    |       218772900 |             360
rx_q0_errors                   |               0 |               0
rx_q1_packets                  |               3 |          188725
rx_q1_bytes                    |             360 |       284965020
rx_q1_errors                   |               0 |               0
tx_q0_packets                  |               0 |               0
tx_q0_bytes                    |               0 |               0
tx_q1_packets                  |               0 |               0
tx_q1_bytes                    |               0 |               0
tx_q2_packets                  |               0 |               0
tx_q2_bytes                    |               0 |               0
tx_q3_packets                  |               0 |               0
tx_q3_bytes                    |               0 |               0
tx_q0_good_packets             |               0 |               0
tx_q0_good_bytes               |               0 |               0
tx_q0_errors                   |               0 |               0
tx_q0_ring full                |               0 |               0
tx_q0_channel full             |               0 |               0
tx_q0_multicast_packets        |               0 |               0
tx_q0_broadcast_packets        |               0 |               0
tx_q0_undersize_packets        |               0 |               0
tx_q0_size_64_packets          |               0 |               0
tx_q0_size_65_127_packets      |               0 |               0
tx_q0_size_128_255_packets     |               0 |               0
tx_q0_size_256_511_packets     |               0 |               0
tx_q0_size_512_1023_packets    |               0 |               0
tx_q0_size_1024_1518_packets   |               0 |               0
tx_q0_size_1519_max_packets    |               0 |               0
tx_q1_good_packets             |               0 |               0
tx_q1_good_bytes               |               0 |               0
tx_q1_errors                   |               0 |               0
tx_q1_ring full                |               0 |               0
tx_q1_channel full             |               0 |               0
tx_q1_multicast_packets        |               0 |               0
tx_q1_broadcast_packets        |               0 |               0
tx_q1_undersize_packets        |               0 |               0
tx_q1_size_64_packets          |               0 |               0
tx_q1_size_65_127_packets      |               0 |               0
tx_q1_size_128_255_packets     |               0 |               0
tx_q1_size_256_511_packets     |               0 |               0
tx_q1_size_512_1023_packets    |               0 |               0
tx_q1_size_1024_1518_packets   |               0 |               0
tx_q1_size_1519_max_packets    |               0 |               0
tx_q2_good_packets             |               0 |               0
tx_q2_good_bytes               |               0 |               0
tx_q2_errors                   |               0 |               0
tx_q2_ring full                |               0 |               0
tx_q2_channel full             |               0 |               0
tx_q2_multicast_packets        |               0 |               0
tx_q2_broadcast_packets        |               0 |               0
tx_q2_undersize_packets        |               0 |               0
tx_q2_size_64_packets          |               0 |               0
tx_q2_size_65_127_packets      |               0 |               0
tx_q2_size_128_255_packets     |               0 |               0
tx_q2_size_256_511_packets     |               0 |               0
tx_q2_size_512_1023_packets    |               0 |               0
tx_q2_size_1024_1518_packets   |               0 |               0
tx_q2_size_1519_max_packets    |               0 |               0
tx_q3_good_packets             |               0 |               0
tx_q3_good_bytes               |               0 |               0
tx_q3_errors                   |               0 |               0
tx_q3_ring full                |               0 |               0
tx_q3_channel full             |               0 |               0
tx_q3_multicast_packets        |               0 |               0
tx_q3_broadcast_packets        |               0 |               0
tx_q3_undersize_packets        |               0 |               0
tx_q3_size_64_packets          |               0 |               0
tx_q3_size_65_127_packets      |               0 |               0
tx_q3_size_128_255_packets     |               0 |               0
tx_q3_size_256_511_packets     |               0 |               0
tx_q3_size_512_1023_packets    |               0 |               0
tx_q3_size_1024_1518_packets   |               0 |               0
tx_q3_size_1519_max_packets    |               0 |               0
rx_q0_good_packets             |          144891 |               3
rx_q0_good_bytes               |       218772900 |             360
rx_q0_ring full                |               0 |               0
rx_q0_channel full             |               0 |               0
rx_q0_multicast_packets        |               0 |               0
rx_q0_broadcast_packets        |               0 |               0
rx_q0_undersize_packets        |               0 |               0
rx_q0_size_64_packets          |               0 |               0
rx_q0_size_65_127_packets      |               9 |               3
rx_q0_size_128_255_packets     |               0 |               0
rx_q0_size_256_511_packets     |               0 |               0
rx_q0_size_512_1023_packets    |               0 |               0
rx_q0_size_1024_1518_packets   |          144882 |               0
rx_q0_size_1519_max_packets    |               0 |               0
rx_q1_good_packets             |               3 |          188725
rx_q1_good_bytes               |             360 |       284965020
rx_q1_ring full                |               0 |               0
rx_q1_channel full             |               0 |               0
rx_q1_multicast_packets        |               0 |               0
rx_q1_broadcast_packets        |               0 |               0
rx_q1_undersize_packets        |               0 |               0
rx_q1_size_64_packets          |               0 |               0
rx_q1_size_65_127_packets      |               3 |               7
rx_q1_size_128_255_packets     |               0 |               0
rx_q1_size_256_511_packets     |               0 |               0
rx_q1_size_512_1023_packets    |               0 |               0
rx_q1_size_1024_1518_packets   |               0 |          188718
rx_q1_size_1519_max_packets    |               0 |               0
vf_rx_good_packets             |      1091815995 |      1091751029
vf_tx_good_packets             |      1092492287 |      1092492431
vf_rx_good_bytes               |   1648642152450 |   1648544053790
vf_tx_good_bytes               |   1649663337420 |   1649663554860
vf_rx_missed_errors            |               0 |               0
vf_rx_errors                   |               0 |               0
vf_tx_errors                   |               0 |               0
vf_rx_mbuf_allocation_errors   |               0 |               0
vf_rx_q0_packets               |      1091815995 |               0
vf_rx_q0_bytes                 |   1648642152450 |               0
vf_rx_q0_errors                |               0 |               0
vf_rx_q1_packets               |               0 |      1091751029
vf_rx_q1_bytes                 |               0 |   1648544053790
vf_rx_q1_errors                |               0 |               0
vf_tx_q0_packets               |       546242717 |       546242781
vf_tx_q0_bytes                 |    824826486720 |    824826583360
vf_tx_q1_packets               |       546249570 |       546249650
vf_tx_q1_bytes                 |    824836850700 |    824836971500
vf_tx_q2_packets               |               0 |               0
vf_tx_q2_bytes                 |               0 |               0
vf_tx_q3_packets               |               0 |               0
vf_tx_q3_bytes                 |               0 |               0
vf_rx_wqe_errors               |               0 |               0
vf_rx_unicast_packets          |      1091815853 |      1091750922
vf_rx_unicast_bytes            |   1648641931990 |   1648543892220
vf_tx_unicast_packets          |      1092492148 |      1092492348
vf_tx_unicast_bytes            |   1649663143480 |   1649663442460
vf_rx_multicast_packets        |               0 |               0
vf_rx_multicast_bytes          |               0 |               0
vf_tx_multicast_packets        |               0 |               0
vf_tx_multicast_bytes          |               0 |               0
vf_rx_broadcast_packets        |               0 |               0
vf_rx_broadcast_bytes          |               0 |               0
vf_tx_broadcast_packets        |              11 |              11
vf_tx_broadcast_bytes          |             660 |             660
vf_tx_phy_packets              |               0 |               0
vf_rx_phy_packets              |               0 |               0
vf_rx_phy_crc_errors           |               0 |               0
vf_tx_phy_bytes                |               0 |               0
vf_rx_phy_bytes                |               0 |               0
vf_rx_phy_in_range_len_error   |               0 |               0
vf_rx_phy_symbol_errors        |               0 |               0
vf_rx_phy_discard_packets      |               0 |               0
vf_tx_phy_discard_packets      |               0 |               0
vf_tx_phy_errors               |               0 |               0
vf_rx_out_of_buffer            |               0 |               0
vf_tx_pp_missed_interrupt_er   |               0 |               0
vf_tx_pp_rearm_queue_errors    |               0 |               0
vf_tx_pp_clock_queue_errors    |               0 |               0
vf_tx_pp_timestamp_past_erro   |               0 |               0
vf_tx_pp_timestamp_future_er   |               0 |               0
vf_tx_pp_jitter                |               0 |               0
vf_tx_pp_wander                |               0 |               0
vf_tx_pp_sync_lost             |               0 |               0
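
One way to read the table above is to compare the `vf_` counters with the port totals. The sketch below computes the share of packets carried by the VF; `vf_share` is a hypothetical helper, and the values passed to it are copied by hand from the xstats output.

```shell
# Percentage of packets that went through the VF.
# $1: VF counter (e.g. vf_rx_good_packets), $2: port total (e.g. rx_good_packets).
vf_share() {
    awk -v vf="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "0.00"; exit }   # avoid dividing by zero
        printf "%.2f\n", 100 * vf / total
    }'
}
```

For port 0 RX, `vf_share 1091815995 1091960901` gives 99.99, so nearly all received packets used the VF path rather than the synthetic netvsc queues.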