-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xsk: i40e: Tx performance improvements #347
Conversation
Master branch: 024cd2c |
Master branch: b93ef08 |
4ee7040
to
6358fb5
Compare
Increment the statistics over how many Tx packets have been sent at the time of sending instead of at the time of completion. This as a completion event means that the buffer has been sent AND returned to user space. The packet always gets sent shortly after sendto() is called. The kernel might, for performance reasons, decide to not return every single buffer to user space immediately after sending, for example, only after a batch of packets have been transmitted. Incrementing the number of packets sent at completion, will in that case be confusing as if you send a single packet, the counter might show zero for a while even though the packet has been transmitted. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com>
Remove the unnecessary access to the software ring for the AF_XDP zero-copy driver. This was used to record the length of the packet so that the driver Tx completion code could sum this up to produce the total bytes sent. This is now performed during the transmission of the packet, so no need to record this in the software ring. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com>
Introduce one cache line worth of padding between the consumer pointer and the flags field as well as between the flags field and the start of the descriptors in all the lockless rings. This so that the x86 HW adjacency prefetcher will not prefetch the adjacent pointer/field when only one pointer/field is going to be used. This improves throughput performance for the l2fwd sample app with 1% on my machine with HW prefetching turned on in the BIOS. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com>
Introduce batched descriptor interfaces in the xsk core code for the Tx path to be used in the driver to write a code path with higher performance. This interface will be used by the i40e driver in the next patch. Though other drivers would likely benefit from this new interface too. Note that batching is only implemented for the common case when there is only one socket bound to the same device and queue id. When this is not the case, we fall back to the old non-batched version of the function. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Use the new batched xsk interfaces for the Tx path in the i40e driver to improve performance. On my machine, this yields a throughput increase of 4% for the l2fwd sample app in xdpsock. If we instead just look at the Tx part, this patch set increases throughput with above 20% for Tx. Note that I had to explicitly loop unroll the inner loop to get to this performance level, by using a pragma. It is honored by both clang and gcc and should be ignored by versions that do not support it. Using the -funroll-loops compiler command line switch on the source file resulted in a loop unrolling on a higher level that lead to a performance decrease instead of an increase. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com>
Master branch: de91e63 |
6358fb5
to
7da5544
Compare
Master branch: cbf398d Pull request is NOT updated. Failed to apply https://patchwork.kernel.org/project/netdevbpf/list/?series=384963
conflict:
|
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=384963 irrelevant now. Closing PR. |
Enable CPU v4 instruction tests for arm64. Below are the test results from BPF test_progs selftests: # ./test_progs -t ldsx_insn,verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap #115/1 ldsx_insn/map_val and probed_memory:OK #115/2 ldsx_insn/ctx_member_sign_ext:OK #115/3 ldsx_insn/ctx_member_narrow_sign_ext:OK #115 ldsx_insn:OK #302/1 verifier_bswap/BSWAP, 16:OK #302/2 verifier_bswap/BSWAP, 16 @unpriv:OK #302/3 verifier_bswap/BSWAP, 32:OK #302/4 verifier_bswap/BSWAP, 32 @unpriv:OK #302/5 verifier_bswap/BSWAP, 64:OK #302/6 verifier_bswap/BSWAP, 64 @unpriv:OK #302 verifier_bswap:OK #316/1 verifier_gotol/gotol, small_imm:OK #316/2 verifier_gotol/gotol, small_imm @unpriv:OK #316 verifier_gotol:OK #324/1 verifier_ldsx/LDSX, S8:OK #324/2 verifier_ldsx/LDSX, S8 @unpriv:OK #324/3 verifier_ldsx/LDSX, S16:OK #324/4 verifier_ldsx/LDSX, S16 @unpriv:OK #324/5 verifier_ldsx/LDSX, S32:OK #324/6 verifier_ldsx/LDSX, S32 @unpriv:OK #324/7 verifier_ldsx/LDSX, S8 range checking, privileged:OK #324/8 verifier_ldsx/LDSX, S16 range checking:OK #324/9 verifier_ldsx/LDSX, S16 range checking @unpriv:OK #324/10 verifier_ldsx/LDSX, S32 range checking:OK #324/11 verifier_ldsx/LDSX, S32 range checking @unpriv:OK #324 verifier_ldsx:OK #335/1 verifier_movsx/MOV32SX, S8:OK #335/2 verifier_movsx/MOV32SX, S8 @unpriv:OK #335/3 verifier_movsx/MOV32SX, S16:OK #335/4 verifier_movsx/MOV32SX, S16 @unpriv:OK #335/5 verifier_movsx/MOV64SX, S8:OK #335/6 verifier_movsx/MOV64SX, S8 @unpriv:OK #335/7 verifier_movsx/MOV64SX, S16:OK #335/8 verifier_movsx/MOV64SX, S16 @unpriv:OK #335/9 verifier_movsx/MOV64SX, S32:OK #335/10 verifier_movsx/MOV64SX, S32 @unpriv:OK #335/11 verifier_movsx/MOV32SX, S8, range_check:OK #335/12 verifier_movsx/MOV32SX, S8, range_check @unpriv:OK #335/13 verifier_movsx/MOV32SX, S16, range_check:OK #335/14 verifier_movsx/MOV32SX, S16, range_check @unpriv:OK #335/15 verifier_movsx/MOV32SX, S16, range_check 2:OK #335/16 verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK #335/17 verifier_movsx/MOV64SX, S8, range_check:OK #335/18 verifier_movsx/MOV64SX, S8, range_check @unpriv:OK #335/19 verifier_movsx/MOV64SX, S16, range_check:OK #335/20 verifier_movsx/MOV64SX, S16, range_check @unpriv:OK #335/21 verifier_movsx/MOV64SX, S32, range_check:OK #335/22 verifier_movsx/MOV64SX, S32, range_check @unpriv:OK #335/23 verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK #335/24 verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK #335 verifier_movsx:OK #347/1 verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK #347/2 verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK #347/3 verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK #347/4 verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK #347/5 verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK #347/6 verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK #347/7 verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK #347/8 verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK #347/9 verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK #347/10 verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK #347/11 verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK #347/12 verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK #347/13 verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK #347/14 verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK #347/15 verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK #347/16 verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK #347/17 verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK #347/18 verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK #347/19 verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK #347/20 verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK #347/21 verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK #347/22 verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK #347/23 verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK #347/24 verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK #347/25 verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK #347/26 verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK #347/27 verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK #347/28 verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK #347/29 verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK #347/30 verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK #347/31 verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK #347/32 verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK #347/33 verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK #347/34 verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK #347/35 verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK #347/36 verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK #347/37 verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK #347/38 verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK #347/39 verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK #347/40 verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK #347/41 verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK #347/42 verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK #347/43 verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK #347/44 verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK #347/45 verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK #347/46 verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK #347/47 verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK #347/48 verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK #347/49 verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK #347/50 verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK #347/51 verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK #347/52 verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK #347/53 verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK #347/54 verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK #347/55 verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK #347/56 verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK #347/57 verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK #347/58 verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK #347/59 verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK #347/60 verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK #347/61 verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK #347/62 verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK #347/63 verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK #347/64 verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK #347/65 verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK #347/66 verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK #347/67 verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK #347/68 verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK #347/69 verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK #347/70 verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK #347/71 verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK #347/72 verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK #347/73 verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK #347/74 verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK #347/75 verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK #347/76 verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK #347/77 verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK #347/78 verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK #347/79 verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK #347/80 verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK #347/81 verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK #347/82 verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK #347/83 verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK #347/84 verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK #347/85 verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK #347/86 verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK #347/87 verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK #347/88 verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK #347/89 verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK #347/90 verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK #347/91 verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK #347/92 verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK #347/93 verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK #347/94 verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK #347/95 verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK #347/96 verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK #347/97 verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK #347/98 verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK #347/99 verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK #347/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK #347/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK #347/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK #347/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK #347/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK #347/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK #347/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK #347/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK #347/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK #347/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK #347/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK #347/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK #347/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK #347/113 verifier_sdiv/SDIV32, zero divisor:OK #347/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK #347/115 verifier_sdiv/SDIV64, zero divisor:OK #347/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK #347/117 verifier_sdiv/SMOD32, zero divisor:OK #347/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK #347/119 verifier_sdiv/SMOD64, zero divisor:OK #347/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK #347 verifier_sdiv:OK Summary: 6/166 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-8-xukuohai@huaweicloud.com
Add a test case to assert that the skb->pkt_type which was set from the BPF program is retained from the netkit xmit side to the peer's device at tcx ingress location. # ./vmtest.sh -- ./test_progs -t netkit [...] ./test_progs -t netkit [ 1.140780] bpf_testmod: loading out-of-tree module taints kernel. [ 1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz [ 1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns [ 1.290384] clocksource: Switched to clocksource tsc #345 tc_netkit_basic:OK #346 tc_netkit_device:OK #347 tc_netkit_multi_links:OK #348 tc_netkit_multi_opts:OK #349 tc_netkit_neigh_links:OK #350 tc_netkit_pkt_type:OK Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add a test case to assert that the skb->pkt_type which was set from the BPF program is retained from the netkit xmit side to the peer's device at tcx ingress location. # ./vmtest.sh -- ./test_progs -t netkit [...] ./test_progs -t netkit [ 1.140780] bpf_testmod: loading out-of-tree module taints kernel. [ 1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz [ 1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns [ 1.290384] clocksource: Switched to clocksource tsc #345 tc_netkit_basic:OK #346 tc_netkit_device:OK #347 tc_netkit_multi_links:OK #348 tc_netkit_multi_opts:OK #349 tc_netkit_neigh_links:OK #350 tc_netkit_pkt_type:OK Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add a test case to assert that the skb->pkt_type which was set from the BPF program is retained from the netkit xmit side to the peer's device at tcx ingress location. # ./vmtest.sh -- ./test_progs -t netkit [...] ./test_progs -t netkit [ 1.140780] bpf_testmod: loading out-of-tree module taints kernel. [ 1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz [ 1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns [ 1.290384] clocksource: Switched to clocksource tsc #345 tc_netkit_basic:OK #346 tc_netkit_device:OK #347 tc_netkit_multi_links:OK #348 tc_netkit_multi_opts:OK #349 tc_netkit_neigh_links:OK #350 tc_netkit_pkt_type:OK Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20240524163619.26001-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull request for series with
subject: xsk: i40e: Tx performance improvements
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=384963