bpf: New approach for BPF MTU handling and enforcement #179

kernel-patches-bot · 2020-10-06T16:05:06Z

Pull request for series with
subject: bpf: New approach for BPF MTU handling and enforcement
version: 1
url: https://patchwork.kernel.org/project/bpf/list/?series=360359

kernel-patches-bot · 2020-10-06T16:05:07Z

Master branch: dca4121
series: https://patchwork.kernel.org/project/bpf/list/?series=360359
version: 1

kernel-patches-bot · 2020-10-06T18:36:11Z

Master branch: fd08f94
series: https://patchwork.kernel.org/project/bpf/list/?series=360359
version: 1

Multiple BPF-helpers that can manipulate or increase the size of the SKB use __bpf_skb_max_len() as the maximum length. This function limits the size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grows the packet size, it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Furthermore, the current MTU check in __bpf_skb_max_len() uses the MTU from the ingress/current net_device, which in the case of redirects is the wrong net_device. Keep a sanity max limit of IP_MAX_MTU, which is 64KiB. Later patches will enforce the MTU limitation when transmitting packets. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> (imported from commit 37f8552786cf46588af52b77829b730dd14524d3)
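The change in limit described above can be modelled in a few lines of userspace C. This is only a sketch and not the kernel code: the `model_*` names are hypothetical, the constant mirrors the kernel's IP_MAX_MTU (0xFFFF) as far as I know, and adding the L2 header length to the old limit is an assumption about how the previous check was computed.

```c
#include <stdint.h>

/* Sanity limit; assumed to mirror the kernel's IP_MAX_MTU (0xFFFF). */
#define MODEL_IP_MAX_MTU 0xFFFFu

/* Hypothetical stand-in for the net_device fields the old check reads. */
struct model_netdev {
    uint32_t mtu;             /* L3 MTU of the device */
    uint32_t hard_header_len; /* L2 header length (assumed part of old limit) */
};

/* Old behaviour as described: the limit follows the current device. */
static uint32_t model_max_len_old(const struct model_netdev *dev)
{
    return dev->mtu + dev->hard_header_len;
}

/* New behaviour as described: only a fixed sanity limit remains. */
static uint32_t model_max_len_new(void)
{
    return MODEL_IP_MAX_MTU;
}

/* Returns 1 when growing the packet to new_len stays within max_len. */
static int model_can_grow(uint32_t new_len, uint32_t max_len)
{
    return new_len <= max_len;
}
```

With a 1500-byte MTU device, growing a packet to 1600 bytes is refused under the old limit and accepted under the new one; the transmit-time enforcement is left to the later patches.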

The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup) can perform an MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog doesn't know the MTU value that caused this rejection. If the BPF-prog wants to implement PMTU (Path MTU Discovery, RFC 1191), it needs to know this MTU value for the ICMP packet. This patch changes the lookup and result struct bpf_fib_lookup to contain this MTU value as output via a union with 'tot_len', as this is the value used for the MTU lookup. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
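A minimal userspace sketch of the union idea follows. It is not the actual struct bpf_fib_lookup layout: the struct, the enum values and the member name `mtu_result` are hypothetical, and writing the MTU on success as well as on rejection is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical return codes; the real helper returns BPF_FIB_LKUP_RET_* values. */
enum model_fib_ret {
    MODEL_FIB_SUCCESS = 0,
    MODEL_FIB_FRAG_NEEDED = 1,
};

/* Only the field relevant here; the real struct has many more members. */
struct model_fib_params {
    union {
        uint16_t tot_len;    /* input: L3 length checked against the MTU */
        uint16_t mtu_result; /* output: MTU found by the lookup (assumed name) */
    };
};

/* Model of the lookup: read the input length, then reuse the slot for output. */
static enum model_fib_ret model_fib_lookup(struct model_fib_params *p,
                                           uint16_t route_mtu)
{
    uint16_t len = p->tot_len; /* must be read before it is overwritten */

    p->mtu_result = route_mtu;
    return len > route_mtu ? MODEL_FIB_FRAG_NEEDED : MODEL_FIB_SUCCESS;
}
```

A caller that gets the fragmentation-needed result can then read the MTU from the same slot and place it in the ICMP "fragmentation needed" message it generates.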

FIXME: add description. FIXME: IMHO we can create a better BPF-helper named bpf_mtu_check() instead of bpf_mtu_lookup(), because a flag can be used for requesting GRO segment size checking. The return value of bpf_mtu_check() says whether the MTU was violated, but the helper also returns the MTU via a pointer argument to allow BPF-progs to do their own logic. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
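The helper shape proposed in this FIXME could look roughly like the userspace model below. Everything here is hypothetical: the function name, the flag, the packet struct and the return convention (0 for "fits", 1 for "violated") are assumptions chosen only to illustrate "return the verdict, hand back the MTU via pointer".

```c
#include <stdint.h>

/* Hypothetical flag: check the per-segment size of an aggregated packet. */
#define MODEL_MTU_CHK_SEGS 0x1u

/* Hypothetical packet view: total length plus largest segment length. */
struct model_pkt {
    uint32_t len;     /* total L3 length */
    uint32_t seg_len; /* L3 length of the largest segment, 0 if not aggregated */
};

/*
 * Returns 1 if the packet violates dev_mtu, 0 if it fits.
 * The MTU is written to *mtu_out (when non-NULL) in both cases,
 * so the caller can apply its own policy.
 */
static int model_mtu_check(const struct model_pkt *pkt, uint32_t dev_mtu,
                           uint32_t flags, uint32_t *mtu_out)
{
    uint32_t len = pkt->len;

    if ((flags & MODEL_MTU_CHK_SEGS) && pkt->seg_len)
        len = pkt->seg_len; /* judge the segments, not the aggregate */

    if (mtu_out)
        *mtu_out = dev_mtu;

    return len > dev_mtu;
}
```

In this model a 9000-byte aggregated packet made of 1500-byte segments violates a 1500-byte MTU when checked as a whole, but passes when the segment flag is set.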

This change makes it possible to identify SKBs that have been redirected by TC-BPF (cls_act). This is needed for a number of cases: (1) for collaborating with the ifb driver's net_devices; (2) for avoiding starting a generic-XDP prog on TC ingress redirect; (3) the following MTU check patches need the ability to identify redirected SKBs. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
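The marking described above amounts to a per-packet flag with a setter and a getter. The sketch below is a userspace model loosely shaped after the kernel's skb_set_redirected() and skb_is_redirected() helpers; the `model_*` names and the struct are hypothetical, and the generic-XDP decision only illustrates case (2) rather than reproducing the kernel logic.

```c
/* Hypothetical stand-in for the two sk_buff bits involved. */
struct model_skb {
    unsigned int redirected:1;   /* set once a TC-BPF redirect happened */
    unsigned int from_ingress:1; /* redirect was issued from the ingress hook */
};

/* Mark the packet as redirected and remember which hook did it. */
static void model_skb_set_redirected(struct model_skb *skb, int from_ingress)
{
    skb->redirected = 1;
    skb->from_ingress = from_ingress ? 1 : 0;
}

/* Query used by later stages, e.g. the MTU check at transmit time. */
static int model_skb_is_redirected(const struct model_skb *skb)
{
    return skb->redirected;
}

/* Case (2): do not start generic XDP for a packet that was redirected. */
static int model_should_run_generic_xdp(const struct model_skb *skb)
{
    return !model_skb_is_redirected(skb);
}
```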

The MTU should only apply to transmitted packets. When TC-ingress redirects a packet to egress on another netdev, the normal netstack MTU checks are skipped (and the driver level will not catch any MTU violation; checked with ixgbe).

This patch chooses not to add an MTU check in the egress code path of skb_do_redirect() prior to calling dev_queue_xmit(), because it is still possible to run another BPF egress program that will shrink/consume headers, which will make the packet comply with the netdev MTU. This use-case might already be in production use (if the ingress MTU is larger than the egress MTU).

Instead, do the MTU check after the sch_handle_egress() step, for the cases that require this. The cases need a bit of explaining. Ingress-to-egress redirected packets could be detected via the skb->tc_at_ingress bit, but this is not reliable, because sch_handle_egress() could steal the packet and redirect it (again) to another egress netdev, which will then have skb->tc_at_ingress cleared. There is also the case of a TC-egress prog increasing the packet size and then redirecting it to egress. Thus, it is more reliable to do the MTU check for any redirected packet (both ingress and egress), which is available via skb_is_redirected() from an earlier patch. Also handle the case where an egress BPF-prog increased the size.

One advantage of this approach is that an ingress BPF-prog can send information to an egress BPF-prog via packet data. With the MTU checks removed in the helpers, and also not done in the skb_do_redirect() call, an ingress BPF-prog can communicate with an egress BPF-prog via packet data, as long as the egress BPF-prog removes this data prior to transmitting the packet.

Troubleshooting: MTU violations are recorded in the TX dropped counter, and a kprobe on dev_queue_xmit() shows retval -EMSGSIZE. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
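The transmit-time check for redirected packets can be sketched as a userspace model. This is not the patch's code: the structs and `model_*` names are hypothetical, including the L2 header length in the limit is an assumption, and the model leaves out GSO packets and the separate case of an egress BPF-prog growing a non-redirected packet.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields the check needs. */
struct model_netdev {
    uint32_t mtu;             /* L3 MTU of the egress device */
    uint32_t hard_header_len; /* L2 header length (assumed part of the limit) */
    uint64_t tx_dropped;      /* where MTU violations are accounted */
};

struct model_skb {
    uint32_t len;              /* packet length including the L2 header */
    unsigned int redirected:1; /* set by a TC-BPF redirect */
};

/*
 * Model of the check placed after the egress hook has run:
 * only redirected packets are checked; a violation is counted as a
 * TX drop and reported as -EMSGSIZE.
 */
static int model_xmit_mtu_check(struct model_netdev *dev,
                                const struct model_skb *skb)
{
    if (!skb->redirected)
        return 0; /* not redirected: left to the normal stack checks */

    if (skb->len <= dev->mtu + dev->hard_header_len)
        return 0;

    dev->tx_dropped++;
    return -EMSGSIZE;
}
```

Because the check runs after the egress hook, a redirected packet that carries extra bytes from the ingress prog passes as long as the egress prog strips them before this point.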

The use-case for dropping the MTU check when TC-BPF ingress redirects a packet is described by Eyal Birger in email[0]. In summary, it is the ability to increase the packet size (e.g. with IPv6 headers for NAT64), ingress redirect the packet, and let the normal netstack fragment the packet as needed. [0] https://lore.kernel.org/netdev/CAHsH6Gug-hsLGHQ6N0wtixdOa85LDZ3HNRHVd0opR=19Qo4W4Q@mail.gmail.com/ Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
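The size growth in the NAT64 example is simple arithmetic: an IPv4 header without options is 20 bytes and the fixed IPv6 header is 40 bytes, so translation adds 20 bytes. The sketch below only works through that arithmetic; the `model_*` names are hypothetical, and IPv4 options and IPv6 extension headers are ignored.

```c
#include <stdint.h>

#define MODEL_IPV4_HDR_LEN 20u /* IPv4 header without options */
#define MODEL_IPV6_HDR_LEN 40u /* fixed IPv6 header */

/* L3 length after replacing the IPv4 header with an IPv6 header. */
static uint32_t model_nat64_len(uint32_t ipv4_total_len)
{
    return ipv4_total_len - MODEL_IPV4_HDR_LEN + MODEL_IPV6_HDR_LEN;
}

/* Returns 1 when the L3 length no longer fits the given MTU. */
static int model_exceeds_mtu(uint32_t l3_len, uint32_t mtu)
{
    return l3_len > mtu;
}
```

A full-size 1500-byte IPv4 packet becomes 1520 bytes after translation, which exceeds a 1500-byte MTU; without the MTU check on ingress redirect, the packet can still be handed to the stack, which fragments it as needed.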

kernel-patches-bot · 2020-10-06T18:49:46Z

Master branch: 67ed375
series: https://patchwork.kernel.org/project/bpf/list/?series=360359
version: 1

Add a big batch of selftests to extend test_progs with various tc link, attach ops and old-style tc BPF attachments via libbpf APIs. Also test multi-program attachments, including mixing the various attach options:

    # ./test_progs -t tc_link
    #179     tc_link_base:OK
    #180     tc_link_detach:OK
    #181     tc_link_mix:OK
    #182     tc_link_opts:OK
    #183     tc_link_run_base:OK
    #184     tc_link_run_chain:OK
    Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED

All new and existing test cases pass. Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

kernel-patches-bot added bpf-next new V1 labels Oct 6, 2020

kernel-patches-bot force-pushed the series/360359=>bpf-next branch from d5e532a to 4317d5c Compare October 6, 2020 18:36

kernel-patches-bot and others added 7 commits October 6, 2020 11:49

adding ci files

586ae53

kernel-patches-bot force-pushed the series/360359=>bpf-next branch from 4317d5c to ae67b96 Compare October 6, 2020 18:49

kernel-patches-bot closed this Oct 7, 2020

kernel-patches-bot deleted the series/360359=>bpf-next branch October 7, 2020 01:45
