-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: New approach for BPF MTU handling and enforcement #179
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Master branch: dca4121 |
Master branch: fd08f94 |
kernel-patches-bot
force-pushed
the
series/360359=>bpf-next
branch
from
October 6, 2020 18:36
d5e532a
to
4317d5c
Compare
Multiple BPF-helpers that can manipulate/increase the size of the SKB uses __bpf_skb_max_len() as the max-length. This function limit size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grow the packet size, then it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Further more, current MTU check in __bpf_skb_max_len uses the MTU from ingress/current net_device, which in case of redirects uses the wrong net_device. Keep a sanity max limit of IP_MAX_MTU which is 64KiB. In later patches we will enforce the MTU limitation when transmitting packets. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> (imported from commit 37f8552786cf46588af52b77829b730dd14524d3)
The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup) can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog don't know the MTU value that caused this rejection. If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it need to know this MTU value for the ICMP packet. Patch change lookup and result struct bpf_fib_lookup, to contain this MTU value as output via a union with 'tot_len' as this is the value used for the MTU lookup. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
FIXME: add description. FIXME: IMHO we can create a better BPF-helper named bpf_mtu_check() instead of bpf_mtu_lookup(), because a flag can be used for requesting GRO segment size checking. The ret value of bpf_mtu_check() says if MTU was violoated, but also return MTU via pointer arg to allow BPF-progs to do own logic. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
This change makes it possible to identify SKBs that have been redirected by TC-BPF (cls_act). This is needed for a number of cases. (1) For collaborating with driver ifb net_devices. (2) For avoiding starting generic-XDP prog on TC ingress redirect. (3) Next MTU check patches need ability to identify redirected SKBs. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
The MTU should only apply to transmitted packets. When TC-ingress redirect packet to egress on another netdev, then the normal netstack MTU checks are skipped (and driver level will not catch any MTU violation, checked ixgbe). This patch choose not to add MTU check in the egress code path of skb_do_redirect() prior to calling dev_queue_xmit(), because it is still possible to run another BPF egress program that will shrink/consume headers, which will make packet comply with netdev MTU. This use-case might already be in production use (if ingress MTU is larger than egress). Instead do the MTU check after sch_handle_egress() step, for the cases that require this. The cases need a bit explaining. Ingress to egress redirected packets could be detected via skb->tc_at_ingress bit, but it is not reliable, because sch_handle_egress() could steal the packet and redirect this (again) to another egress netdev, which will then have the skb->tc_at_ingress cleared. There is also the case of TC-egress prog increase packet size and then redirect it egress. Thus, it is more reliable to do the MTU check for any redirected packet (both ingress and egress), which is available via skb_is_redirected() in earlier patch. Also handle case where egress BPF-prog increased size. One advantage of this approach is that it ingress-to-egress BPF-prog can send information via packet data. With the MTU checks removed in the helpers, and also not done in skb_do_redirect() call, this allows for an ingress BPF-prog to communicate with an egress BPF-prog via packet data, as long as egress BPF-prog remove this prior to transmitting packet. Troubleshooting: MTU violations are recorded in TX dropped counter, and kprobe on dev_queue_xmit() have retval -EMSGSIZE. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
The use-case for dropping the MTU check when TC-BPF ingress redirecting a packet, is described by Eyal Birger in email[0]. The summary is the ability to increase packet size (e.g. with IPv6 headers for NAT64) and ingress redirect packet and let normal netstack fragment packet as needed. [0] https://lore.kernel.org/netdev/CAHsH6Gug-hsLGHQ6N0wtixdOa85LDZ3HNRHVd0opR=19Qo4W4Q@mail.gmail.com/ Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Master branch: 67ed375 |
kernel-patches-bot
force-pushed
the
series/360359=>bpf-next
branch
from
October 6, 2020 18:49
4317d5c
to
ae67b96
Compare
kernel-patches-bot
pushed a commit
that referenced
this pull request
Oct 4, 2022
Add a big batch of selftest to extend test_progs with various tc link, attach ops and old-style tc BPF attachments via libbpf APIs. Also test multi-program attachments including mixing the various attach options: # ./test_progs -t tc_link #179 tc_link_base:OK #180 tc_link_detach:OK #181 tc_link_mix:OK #182 tc_link_opts:OK #183 tc_link_run_base:OK #184 tc_link_run_chain:OK Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED All new and existing test cases pass. Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Pull request for series with
subject: bpf: New approach for BPF MTU handling and enforcement
version: 1
url: https://patchwork.kernel.org/project/bpf/list/?series=360359