
Update benchmarks #46

Closed
wingo opened this issue Sep 30, 2015 · 6 comments

wingo commented Sep 30, 2015

Update the lwAFTR benchmarks, also making sure that the documentation describing how to generate the benchmarks is up to date with the transient program from #43. No external nic_ui / snsh script invocation should be necessary.

The results will be:

  • A self-test of the "transient" benchmarking harness (is it able to generate the desired load?)
    • as a CSV file and as a graph of RX/TX MPPS/Gbps over time
  • A transient load over the lwAFTR showing processed MPPS and Gbps as a function of incoming MPPS/Gbps on a 550-byte packet full-duplex workload
    • as a raw CSV file, as a graph of RX/TX Gbps over time, and as TX vs RX graphs
    • 10 runs on the same graph to show variance

Andy has the graphing scripts; Diego to do the benchmarking once we have a final snabb-lwaftr binary.
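
Andy's scripts are the canonical ones; purely as a sketch, a gnuplot snippet along these lines could render RX/TX Gbps over time for a single run. The file name lwaftr-1.csv matches the targets below, but the column layout assumed here (comma-separated, header row, then time / RX Gbps / TX Gbps) is a guess, not the transient program's documented output format:

# Sketch only -- adjust the "using" column indices to the real CSV layout.
gnuplot <<'EOF'
set datafile separator ","
set key autotitle columnhead    # assumes the first CSV row is a header
set terminal pngcairo size 800,480
set output "lwaftr-1.png"
set xlabel "time (s)"
set ylabel "Gbps"
plot "lwaftr-1.csv" using 1:2 with lines title "RX Gbps", \
     "lwaftr-1.csv" using 1:3 with lines title "TX Gbps"
EOF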

wingo added the lwaftr label Sep 30, 2015
wingo added this to the Proof-of-Concept milestone Sep 30, 2015

wingo commented Sep 30, 2015

Makefile to create the benchmark CSV files:

LWAFTR=./snabb-lwaftr

# 550-byte test traffic and lwAFTR configuration used by the benchmark runs.
IPV4_PCAP:=/home/dpino/snabbswitch/tests/apps/lwaftr/benchdata/ipv4-0550.pcap
IPV6_PCAP:=/home/dpino/snabbswitch/tests/apps/lwaftr/benchdata/ipv6-0550.pcap
BINDING_TABLE=/home/dpino/snabbswitch/tests/apps/lwaftr/data/binding.table
LWAFTR_CONF=/home/dpino/snabbswitch/tests/apps/lwaftr/data/icmp_on_fail.conf

# NICs driven by the lwAFTR under test.
LWAFTR_IPV4_PCIADDR=0000:81:00.1
LWAFTR_IPV6_PCIADDR=0000:82:00.1

# NICs driven by the transient load generator.
TRANSIENT_IPV4_PCIADDR=0000:81:00.0
TRANSIENT_IPV6_PCIADDR=0000:82:00.0

# NIC pair used for the load-generator self-test.
TRANSIENT_SELF_TEST_NIC1_PCIADDR=0000:81:00.0
TRANSIENT_SELF_TEST_NIC2_PCIADDR=0000:81:00.1

SHELL=/bin/bash

# Self-test of the transient harness: drive one NIC against the other.
transient-self-test.csv: $(LWAFTR)
    $(LWAFTR) transient -s 0.25e9 -D 2 $(IPV4_PCAP) NIC1 $(TRANSIENT_SELF_TEST_NIC1_PCIADDR) $(IPV4_PCAP) NIC2 $(TRANSIENT_SELF_TEST_NIC2_PCIADDR) > $@

# Full benchmark: start the lwAFTR in the background, drive it with the
# transient load generator, and capture the generator's CSV output.
lwaftr-%.csv:: $(LWAFTR)
    ( set -m; set -o pipefail; \
      $(LWAFTR) run -D 170 --bt $(BINDING_TABLE) --conf $(LWAFTR_CONF) --v4-pci $(LWAFTR_IPV4_PCIADDR) --v6-pci $(LWAFTR_IPV6_PCIADDR) & \
      $(LWAFTR) transient -s 0.25e9 -D 2 $(IPV4_PCAP) IPv4 $(TRANSIENT_IPV4_PCIADDR) $(IPV6_PCAP) IPv6 $(TRANSIENT_IPV6_PCIADDR) | tee $@.tmp && \
      mv $@.tmp $@ && \
      wait )
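
Assuming the snabb processes need root to open the NICs (as in the manual runs below), the self-test and the ten benchmark runs could then be produced with something like:

sudo make transient-self-test.csv
for i in `seq 1 10`; do sudo make lwaftr-$i.csv; done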

wingo self-assigned this Sep 30, 2015

wingo commented Oct 1, 2015

So this Makefile resulted in poor perf numbers: even with taskset to ensure that the transient and run executables ran on different sockets, I was strangely peaking at 1.5 MPPS. Very odd. But when I ran the programs manually from separate login shells, again with taskset, I reliably reached 2 MPPS. So in the end our perf numbers are from this second, more manual way of running things. In one shell:

for i in `seq 1 10`; do
  sudo taskset -c 0 ~/snabb-lwaftr-v0.2/snabb-lwaftr transient -D 1 -s 0.25e9 \
    ~/snabb-lwaftr-v0.2/snabbswitch/tests/apps/lwaftr/benchdata/ipv4-0550.pcap IPv4 :81:00.0 \
    ~/snabb-lwaftr-v0.2/snabbswitch/tests/apps/lwaftr/benchdata/ipv6-0550.pcap IPv6 :82:00.0 \
    > lwaftr-$i.csv
done

and in another:

for i in `seq 1 10`; do
  sudo taskset -c 6 ./snabb-lwaftr run -D 80 \
    --bt snabbswitch/tests/apps/lwaftr/data/binding.table \
    --conf snabbswitch/tests/apps/lwaftr/data/icmp_on_fail.conf \
    --v4-pci 0000:81:00.1 --v6-pci 0000:82:00.1
done

wingo commented Oct 1, 2015

I think I finally figured out what is going on. Since on interlaken the .0 ports are cabled to the .1 ports, it's not possible to run the load generator on a different socket from the lwAFTR and still be optimal: there are two NUMA nodes (one per socket), and the .0 and .1 ports of a given card will probably be on the same NUMA node.

Getting the NUMA node wrong costs more than the cache interference from sharing a socket, so we should run both processes on the same socket.
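
One way to check this (standard sysfs and lscpu queries, not something from this issue) is to ask the kernel which NUMA node each port belongs to, and then pick taskset -c cores from that node:

# NUMA node of each port (prints -1 if the platform doesn't report one)
cat /sys/bus/pci/devices/0000:81:00.0/numa_node
cat /sys/bus/pci/devices/0000:81:00.1/numa_node

# CPU ranges per NUMA node, to choose matching taskset -c values
lscpu | grep 'NUMA node'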

wingo commented Oct 1, 2015

wingo closed this as completed Oct 1, 2015

lukego commented Oct 1, 2015

snabbco#628 (intel10g txdesc prefetch) may improve load generator performance to compensate for NUMA effects. (Seems to in quick testing now.)

wingo commented Oct 2, 2015

@lukego Interesting! Will follow that as we see how we do on smaller packets. For now though, what a relief to finally identify the source of our observed perf variance (NUMA), and to be able to fix it!
