Implement l2fwd-like app #736

Open
wants to merge 13 commits into
base: master

dpino
Contributor

@dpino dpino commented Feb 2, 2016

Basic implementation of l2fwd app (like DPDK's). This is how I understand it should work:

Given the following conf:

  • 01:00.0 <-> 02:00.0
  • 01:00.1 <-> 02:00.1

A packetblaster sends packets into 01:00.0.
l2fwd forwards packets received on 02:00.0 to 01:00.1.

  • The app reports Gbps of link.
    Alternatively, a separate program could measure packets received on 02:00.1.

@lukego
Member

lukego commented Feb 3, 2016

Neat :).

I appreciate that you have made a simple solution for choosing between different I/O sources (tap, virtio, 10G) in select_nic_driver(). This is a nice baby-step towards solving that problem more generally in the future i.e. making Snabb programs more flexible about their I/O sources.

Aside: I would really like to have a variant of our standard CI benchmark that uses this l2fwd instead of the DPDK one. Then we could make sure that we have the same performance with both guest Virtio-net drivers. @eugeneia would be the person to talk with about how to set this up.

local input, output = assert(self.input.input), assert(self.output.output)

while not link.empty(input) do
   link.transmit(output, link.receive(input))

The DPDK L2FWD app actually does some processing on each packet. See here: http://dpdk.org/browse/dpdk/tree/examples/l2fwd/main.c#n239

The main idea is that if I send traffic with destination MAC A and l2fwd doesn't change it, then any switch between the sender and l2fwd gets confused. For example, if we use a VMDq port, the integrated switch will learn MAC A on a certain port and won't forward the returned packet to the external port.

So what DPDK does is replace each packet's destination MAC address with 02:00:00:00:00:xx, where xx is the number of the port it came from (DPDK assigns a number to each port it recognizes). It then also changes the source to the MAC of the port it sends the packet through. So it is more or less real forwarding.

But SnabbSwitch is in an even better position since it actually has a real switching app - the learning bridge. I guess it is the most suitable in this situation.
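
The rewrite described above can be sketched in plain Lua on a frame held as a binary string. This is only an illustration, not DPDK's or Snabb's code: `mac_to_bytes`, `bytes_to_mac`, `rewrite_macs`, and the sample addresses are hypothetical, and the sketch assumes the frame starts with a 14-byte Ethernet header (6 bytes destination, 6 bytes source, 2 bytes EtherType).

```lua
-- Convert "aa:bb:cc:dd:ee:ff" into a 6-byte binary string.
local function mac_to_bytes(mac)
   local bytes = {}
   for hex in mac:gmatch("%x%x") do
      bytes[#bytes + 1] = string.char(tonumber(hex, 16))
   end
   assert(#bytes == 6, "malformed MAC address: " .. mac)
   return table.concat(bytes)
end

-- Convert a 6-byte binary string back into "aa:bb:cc:dd:ee:ff".
local function bytes_to_mac(s)
   local parts = {}
   for i = 1, 6 do
      parts[i] = string.format("%02x", s:byte(i))
   end
   return table.concat(parts, ":")
end

-- Rewrite the Ethernet header of `frame`:
--   destination <- 02:00:00:00:00:<port_id>
--   source      <- MAC of the port the frame is sent through
local function rewrite_macs(frame, port_id, out_port_mac)
   assert(#frame >= 14, "frame shorter than an Ethernet header")
   local dst = "\2\0\0\0\0" .. string.char(port_id)
   local src = mac_to_bytes(out_port_mac)
   return dst .. src .. frame:sub(13)  -- keep EtherType and payload
end

-- Usage with made-up addresses and an IPv4 EtherType (0x0800).
local frame = mac_to_bytes("ff:ff:ff:ff:ff:ff")
           .. mac_to_bytes("52:54:00:00:00:01")
           .. "\8\0" .. "payload"
local out = rewrite_macs(frame, 1, "52:54:00:00:00:02")
print(bytes_to_mac(out:sub(1, 6)))   -- destination MAC after the rewrite
print(bytes_to_mac(out:sub(7, 12)))  -- source MAC after the rewrite
```

Only the first 12 bytes change; the EtherType and payload pass through untouched, so the frame length stays the same.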

@eugeneia
Member

eugeneia commented Feb 3, 2016

@dpino Without having looked too closely, I am absolutely in love with the idea that we could use this as an alternative to DPDK's l2fwd. It wouldn't replace DPDK, as we need that for interop testing, but it could make benchmarking much simpler. 👍

config.app(c, "nic1", driver1, {pciaddr = pciaddr1})
config.app(c, "nic2", driver2, {pciaddr = pciaddr2})

config.link(c, "nic1.tx -> nic2.rx")

Could we also get config.link(c, "nic2.tx -> nic1.rx") here?

Member

@nnikolaev-virtualopensystems Am I right in assuming that would make l2fwd a two-way street?


Exactly, this makes it full featured 2-way "forwarder".
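
To make the two-way wiring concrete, here is a minimal sketch. The `config` table below is a small stand-in written for this example so that it runs without Snabb; it only records apps and links and is not Snabb's own configuration module. The app names `nic1` and `nic2` and the link strings come from the diff above; the driver values and PCI addresses are placeholders.

```lua
-- Stand-in for a configuration object: it records apps and links only.
local config = {}

function config.new()
   return { apps = {}, links = {} }
end

function config.app(c, name, class, arg)
   c.apps[name] = { class = class, arg = arg }
end

function config.link(c, spec)
   -- Parse "app.port -> app.port" and check that both apps exist.
   local from, to =
      spec:match("^%s*([%w_]+)%.[%w_]+%s*%->%s*([%w_]+)%.[%w_]+%s*$")
   assert(from and to, "malformed link spec: " .. spec)
   assert(c.apps[from], "unknown app: " .. from)
   assert(c.apps[to], "unknown app: " .. to)
   c.links[#c.links + 1] = spec
end

local c = config.new()
config.app(c, "nic1", "driver1", { pciaddr = "0000:02:00.0" })
config.app(c, "nic2", "driver2", { pciaddr = "0000:03:00.0" })

-- One link per direction turns the one-way pipe into a two-way forwarder.
config.link(c, "nic1.tx -> nic2.rx")
config.link(c, "nic2.tx -> nic1.rx")

for _, spec in ipairs(c.links) do
   print(spec)
end
```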

@dpino
Contributor Author

dpino commented Feb 11, 2016

@eugeneia Sorry, I have been away for a bit lately. I will tackle these issues in the coming days and verify it works in the context of packetblaster->snabbnfv<->vm(snabb+l2fwd). Thanks everyone for the review and comments.

@eugeneia eugeneia self-assigned this Feb 11, 2016
@eugeneia
Member

@dpino Hey, no rush at all. I am super supportive of taking it easy.

@dpino
Contributor Author

dpino commented Feb 26, 2016

I think this is working now. Here is how it works:

Let's assume the following scheme:

Host 1      Host 2
01:00.0 <-> 01:00.0
01:00.1 <-> 01:00.1

L2Fwd creates a full-duplex softwire on Host 2 between 01:00.0 and 01:00.1.

Packetblaster blasts packets into Host 1 01:00.0, which eventually reach Host 2 01:00.0 because both cards are wired together. The packets are forwarded to Host 2 01:00.1 and transmitted. Finally, the packets arrive at Host 1 01:00.1.

To make packets come back the other way, it's necessary to create a packet bouncer on Host 1 01:00.1. I added a new packetblaster mode called 'bounce' which takes only two PCI params: the NIC to blast packets into and the NIC of the bouncer, which sends back any traffic it receives.

As for mimicking the MAC address scheme that DPDK's l2fwd app uses, I didn't follow it, as packets get through anyway. I think what's interesting about this app is having a tool that can be a counterpart to packetblaster and can help verify that NIC links are working.

Example of use:

Host 1

$ sudo ./snabb packetblaster bounce my.pcap 0000:02:00.0 0000:03:00.0
Transmissions and receptions (last 1 sec):
02:00.0 TXDGPC (TX packets)     433,968     GOTCL (TX octets)   240,418,272
02:00.0 RXNFGPC (RX packets)    7,205       GORCL (RX octets)   3,991,570
Transmissions and receptions (last 1 sec):
02:00.0 TXDGPC (TX packets)     1,848,822   GOTCL (TX octets)   1,024,116,090
02:00.0 RXNFGPC (RX packets)    1,573,190   GORCL (RX octets)   871,551,692
Transmissions and receptions (last 1 sec):
02:00.0 TXDGPC (TX packets)     2,177,641   GOTCL (TX octets)   1,206,404,804
02:00.0 RXNFGPC (RX packets)    2,177,631   GORCL (RX octets)   1,206,408,128

Host 2

$ sudo ./snabb l2fwd -v 0000:02:00.0 0000:03:00.0
Report (last 1 sec):
link report:
              95,052 sent on nic1.tx -> nic2.rx (loss rate: 0%)
               2,944 sent on nic2.tx -> nic1.rx (loss rate: 0%)
load: time: 0.98s  fps: 100,419   fpGbps: 0.449 fpb: 6   bpp: 550  sleep: 35  us
Report (last 1 sec):
link report:
             942,990 sent on nic1.tx -> nic2.rx (loss rate: 0%)
             225,032 sent on nic2.tx -> nic1.rx (loss rate: 0%)
load: time: 1.00s  fps: 1,070,037 fpGbps: 4.785 fpb: 12  bpp: 550  sleep: 0   us
Report (last 1 sec):
link report:
           3,120,653 sent on nic1.tx -> nic2.rx (loss rate: 0%)
           2,402,703 sent on nic2.tx -> nic1.rx (loss rate: 0%)
load: time: 1.00s  fps: 4,355,314 fpGbps: 19.477 fpb: 22  bpp: 550  sleep: 0   us
Report (last 1 sec):

I also tested the l2fwd app running within a guest using the virtionet driver. It works fine too.
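
As a cross-check of the reports above, the packetblaster counters can be converted to a bit rate by hand. The sketch below is only arithmetic on figures copied from the last packetblaster interval; the helper names are made up, and it assumes the interval is exactly one second.

```lua
-- Figures copied from the last packetblaster interval above.
local tx_packets = 2177641
local tx_octets  = 1206404804

-- Octets per second to gigabits per second (assumes a 1-second interval).
local function gbps(octets_per_sec)
   return octets_per_sec * 8 / 1e9
end

-- Average octets per packet over the interval.
local function octets_per_packet(octets, packets)
   return octets / packets
end

print(string.format("TX rate: %.2f Gbps", gbps(tx_octets)))
print(string.format("TX octets per packet: %.1f",
                    octets_per_packet(tx_octets, tx_packets)))
```

With these figures the transmit rate comes to roughly 9.65 Gbps at about 554 octets per packet.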

@dpino
Contributor Author

dpino commented Feb 26, 2016

I don't see any apparent reason why CI failed. cc @eugeneia

- Remove reset() method in LoadGen.
- Remove unnecessary prefix 'pci'.
@eugeneia
Member

Non-blocking questions:

  1. I think I previously missed the fact that l2fwd operates on two duplex ports. Unless I am missing something l2fwd is still fundamentally incompatible with the packetblaster->snabbnfv<->vm(snabb+l2fwd) scenario since l2fwd needs two duplex ports but the VM only has one (vhost_user)? This is not a problem with this PR, just want to clear up my understanding.
  2. To be honest I do not think I really understand the motivation behind l2fwd/bounce. I see no other use case for bounce than in conjunction with l2fwd, and to me it seems the scope of l2fwd is very limited: to test/benchmark two duplex ports. Why two? Why not test each NIC individually? Why are bounce and l2fwd separate programs when they only work together (my assumption)?

function LoadGen:new (conf)
   local o = { pciaddr = conf.pciaddr,
               dev = intel10g.new_sf({pciaddr=conf.pciaddr}),
               report_rx = conf.report_rx, }
Member

What exactly do the changes to LoadGen do? Does the LoadGen documentation need an update to reflect conf.report_rx?

Contributor Author

I modified LoadGen so it reports on received packets as well as transmitted packets. As traffic gets bounced back to its originator and the NIC is locked by the program running packetblaster, this was the only way I could think of to check that traffic actually gets bounced back.

I missed updating the LoadGen docs. If the change finally makes it in, I will update them.
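
The idea of an optional RX report could be sketched as follows. This is not the LoadGen code from the PR: `comma_value`, `report`, and the counter field names are hypothetical, and the counter values are copied from the first interval of the example output earlier in this thread.

```lua
-- Format an integer with thousands separators, e.g. 7205 -> "7,205".
local function comma_value(n)
   local s, k = string.format("%d", n), 0
   repeat
      s, k = s:gsub("^(%d+)(%d%d%d)", "%1,%2")
   until k == 0
   return s
end

-- Build the report for one NIC. TX is always reported; the RX line is
-- added only when report_rx is set.
local function report(name, counters, report_rx)
   local lines = {
      ("%s TX packets %s  TX octets %s"):format(
         name,
         comma_value(counters.tx_packets),
         comma_value(counters.tx_octets)),
   }
   if report_rx then
      lines[#lines + 1] = ("%s RX packets %s  RX octets %s"):format(
         name,
         comma_value(counters.rx_packets),
         comma_value(counters.rx_octets))
   end
   return lines
end

local counters = { tx_packets = 433968, tx_octets = 240418272,
                   rx_packets = 7205,   rx_octets = 3991570 }
for _, line in ipairs(report("02:00.0", counters, true)) do
   print(line)
end
```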

@dpino
Contributor Author

dpino commented Feb 29, 2016

@eugeneia My understanding after reading http://dpdk.org/doc/guides/sample_app_ug/l2_forward_real_virtual.html is:

  • There's a traffic originator running in one side (host A).
  • Packets reach a destination host (host B) and get forwarded between two ports on the same host. That's what the l2fwd app does AFAIU.
  • Packets finally reach host A on a different port and go back the same way.

Host A actually needs two packet generators: an app that creates synthetic packets, as packetblaster does, and something that bounces traffic. Checking whether packets are received in the originating packetblaster app tells us that traffic has gone all the way through the links.

With regard to the packetblaster-bounce/l2fwd tandem, I agree with your comment. The point is that, as the traffic generators run on a host independent of the host that runs the port forwarder, l2fwd and bounce cannot be in the same app.

OTOH, packetblaster-bounce (NIC1 <-> NIC2) can be used to measure throughput of NIC2, received packets and egress link load towards NIC1. Currently it doesn't print those reports but that would be easy to fix.

cc @lukego @wingo @nnikolaev-virtualopensystems

@eugeneia
Member

eugeneia commented Mar 1, 2016

I would like other opinions on this as well. I have a very different picture of what l2fwd should do (e.g. run on a single NIC, receive on rx, swap ethernet src and dst, and send them back on tx), and this would imho simplify it while obsoleting bounce altogether. Note that this view is very centered on my own contact with DPDK's l2fwd and might be completely uneducated/ignorant.
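
The single-NIC model described here (receive, swap Ethernet source and destination, send back) could look roughly like this. It is a plain-Lua sketch on a frame held as a string, not Snabb code; `swap_macs` is a hypothetical helper and the frame contents are made up.

```lua
-- Swap the destination (bytes 1-6) and source (bytes 7-12) MAC addresses
-- of an Ethernet frame held as a binary string. The rest is unchanged.
local function swap_macs(frame)
   assert(#frame >= 14, "frame shorter than an Ethernet header")
   local dst, src = frame:sub(1, 6), frame:sub(7, 12)
   return src .. dst .. frame:sub(13)
end

-- Made-up frame: dst "AAAAAA", src "BBBBBB", EtherType 0x0800, payload.
local frame = "AAAAAA" .. "BBBBBB" .. "\8\0" .. "payload"
local reply = swap_macs(frame)
print(reply:sub(1, 6), reply:sub(7, 12))
```

Swapping twice restores the original frame, which makes the behaviour easy to check in a loopback test.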

@dpino
Contributor Author

dpino commented Mar 3, 2016

@eugeneia Totally understand, I also want to hear other opinions. With regard to MAC addresses and swapping them, I forgot to mention that I skipped that part because it's not necessary. NICs instantiated as SingleFunction, which is the case for packetblaster, work in promiscuous mode.

@ghost

ghost commented Mar 3, 2016

As explained before, once you have a HW switch between l2fwd and the packetblaster, you need MAC swapping. I understand that this is not your use case, but that is what a forwarder should do. Otherwise it is just a bridge, or a softwire, or something else, but not L2FWD IMHO.

@eugeneia
Member

eugeneia commented Mar 3, 2016

@dpino Would the model I described cover your use case as well?

@dpino
Contributor Author

dpino commented Mar 4, 2016

@eugeneia I'm not a user of DPDK's l2fwd app, so it's hard for me to tell how much of it this app can replace. If the use case you described is enough for ditching DPDK's l2fwd in favour of this app, I can implement it that way. I see your description as closer to the bouncer mode I added to packetblaster, but I can put that functionality in its own app. It will also be handy for me for testing a VirtioNet link. I think I will create a brand new PR.

@eugeneia
Member

eugeneia commented Mar 7, 2016

@dpino We use l2fwd to test a duplex port, with the slight ugliness that our “softwire” port (Snabb NFV) does the benchmarking. What if packetblaster was extended to measure incoming bandwidth, and l2fwd behaved like in DPDK? Then we could set up the following topology:

packetblaster<->(virtio|NIC|snabbnfv)<->l2fwd

Then we could benchmark/test any duplex “softwire” with packetblaster, given l2fwd sits on the other end. We would have two orthogonal parts: a benchmarking tool packetblaster and a simple forwarder l2fwd that would hopefully be generic enough to serve a bulk of benchmarking scenarios.

@dpino
Contributor Author

dpino commented May 9, 2016

@eugeneia I'm picking this task up again. Your suggestion sounded good to me, but there's something I don't understand yet: how DPDK's l2fwd is currently being used.

AFAIU, DPDK is only used in the packetblaster benchmark test. The test uses the qemu-dpdk.img VM, which has DPDK on it.

For testing purposes, I replaced qemu-dpdk.img with qemu.img in the packetblaster benchmark. The benchmark works but it's much slower. I was thinking about why it works, and my conclusion is that when SnabbNFV runs in its most basic form, all the traffic received on the VM is sent back to the host because the configuration goes like this:

      config.link(c, NIC..".tx -> "..VM_rx)
      config.link(c, VM_tx.." -> "..NIC..".rx")

The packets are received on the NIC running packetblaster as both cards are wired together. Is this the reason why the packets in a VM are sent back to the host and traffic can be benchmarked?

As for the qemu-dpdk.img scenario, I understand the benchmark goes faster because DPDK bypasses the VM kernel. What I don't see is how DPDK or DPDK's l2fwd is run, or who starts it :/ I grepped for "l2fwd" in the code base and didn't find any results. OTOH, I see the DPDK source code is in the /root folder, although I'm not sure it's built, as the build folder is empty.

@eugeneia
Member

qemu-dpdk.img contains the following /etc/rc.local:

#!/bin/sh
mount -t hugetlbfs nodev /hugetlbfs
echo 64 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
modprobe uio
insmod /root/dpdk/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
/root/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 00:03.0
screen -d -m /root/dpdk/examples/l2fwd/x86_64-native-linuxapp-gcc/l2fwd -c 0x1 -n1 -- -p 0x1
exit 0

I can't really tell you what the cryptic l2fwd invocation means though.^^
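
For reference, in DPDK's l2fwd the options before `--` go to the EAL (`-c` is a hexadecimal core mask, `-n` the number of memory channels), and `-p` after `--` is a hexadecimal port mask. So `-c 0x1 -n1 -- -p 0x1` reads as: run on core 0, one memory channel, forward on port 0. A small sketch of how such a mask maps to indices follows; the helper `mask_to_indices` is made up for illustration and is not part of DPDK or Snabb.

```lua
-- Return the zero-based positions of the set bits in a mask.
-- Uses plain arithmetic so it does not depend on a bit library.
local function mask_to_indices(mask)
   local indices = {}
   local i = 0
   while mask > 0 do
      if mask % 2 == 1 then
         indices[#indices + 1] = i
      end
      mask = math.floor(mask / 2)
      i = i + 1
   end
   return indices
end

print(table.concat(mask_to_indices(0x1), ","))  -- bits set in 0x1
print(table.concat(mask_to_indices(0x5), ","))  -- bits set in 0x5
```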

@dpino
Contributor Author

dpino commented May 13, 2016

@eugeneia Thanks, I was missing that part :)

@wingo
Contributor

wingo commented Oct 19, 2016

It would be really great to have this app :) Right now we are landing changes to the virtio-net interface without the benefit of CI or benchmarking, at least in core; suboptimal!
