Add eBPF connection tracking, without dependencies on kernel headers #2070
Force-pushed from 801dcfc to a84a81e
Perhaps include all the binaries in the scope container image? Yes, it adds bloat, but dynamically downloading binaries (on every launch?) is gross.
@rade It is already in the Scope Docker image. It is done at build-time (implemented in Scope's Makefile). At the moment, the size of the compiled ebpf programs is very small, I think <50KB uncompressed.
Ah, so when you wrote "Scope fetches" you didn't actually mean "Scope" but "The scope build". Also you wrote "Scope fetches the pre-compiled ebpf program" (note the singular). Taking into account your comments, I believe that sentence should have read "The scope build fetches the pre-compiled ebpf programs." Correct?
I closed the two previous iterations of this work in favor of this PR. For the record, these were:
Two graphs comparing the performance between master and this branch (by running our test dialer, #2082):
- With up to 4K connections (40 dialer containers with 100 connections each): [graph]
- With a pool of 100 containers, each holding a random number of connections between 1 and 20: [graph]
https://github.com/koalaman/shellcheck/wiki/SC2048
We get a https://github.com/koalaman/shellcheck/wiki/SC2016 but we do mean literal backquotes.
Based on work from Lorenzo, updated by Iago, Alban, and Alessandro

This PR adds connection tracking using eBPF. This feature is not enabled by default. For now, you can enable it by launching scope with the following command:

```
sudo ./scope launch --probe.ebpf.connections=true
```

Scope Probe also falls back on the old /proc parsing if eBPF is not working (e.g. too old kernel, or pre-compiled eBPF program missing for the current kernel).

This patch allows scope to get notified of every connection event, without relying on the parsing of /proc/$pid/net/tcp{,6} and /proc/$pid/fd/*, and therefore improve performance.

We vendor https://github.com/kinvolk/gobpf-elf-loader in Scope to load and receive information from the ebpf program. Scope fetches the pre-compiled ebpf program from https://hub.docker.com/r/kinvolk/tcptracer-bpf/ (see https://github.com/kinvolk/tcptracer-bpf)

At the moment, the eBPF program is pre-compiled for:
- Arch 4.8.11-1
- fedora-24
- debian-testing
- coreos

The ebpf program uses kprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- inet_csk_accept
- tcp_close

It generates "connect", "accept" and "close" events containing the connection tuple but also pid and netns.

Note: the IPv6 events are not plugged in Scope.

probe/endpoint/ebpf.go maintains the list of connections. Similarly to conntrack, it also keeps the dead connections for one iteration in order to report short-lived connections.

The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still there and still used at start-up because eBPF only brings us the events and not the initial state. However, the /proc parsing for the initial state is now done in foreground instead of background, via newForegroundReader().

NAT resolution on connections from eBPF works in the same way as it did on connections from /proc: by using conntrack. One of the two conntrack instances was removed since eBPF detects short-lived connections.
The Scope Docker image size comparison:
- weaveworks/scope in current master: 22 MB (compressed), 68 MB (uncompressed)
- weaveworks/scope with this patchset: 24 MB (compressed), 70 MB (uncompressed)

Fixes weaveworks#1168 (walking /proc to obtain connections is very expensive)
Fixes weaveworks#1260 (Short-lived connections not tracked for containers in shared networking namespaces)
The variable EBPF_IMAGE in the Makefile determines where to fetch the pre-compiled ebpf programs. They are added in the Scope Docker image in /usr/libexec/scope/. This is so far less than 50KB.

Then, at run-time, Scope will pick the correct ebpf program for the currently running kernel. This is determined by:
- /etc/os-release bind-mounted from the host gives the distribution ID.
- uname finds the architecture and the kernel version.
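The run-time selection described above can be sketched roughly as follows. This is only an illustration: the function names and the directory layout under /usr/libexec/scope/ are assumptions made for the example, not the layout used by this PR, and the architecture and kernel version are passed in as plain strings instead of being read via uname.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// parseOSReleaseID extracts the ID field from the contents of an
// os-release file, e.g. `ID=arch` or `ID="coreos"`.
func parseOSReleaseID(content string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// "ID_LIKE=" and "VERSION_ID=" do not match this prefix.
		if !strings.HasPrefix(line, "ID=") {
			continue
		}
		id := strings.Trim(strings.TrimPrefix(line, "ID="), "\"'")
		if id == "" {
			return "", fmt.Errorf("empty ID field in os-release")
		}
		return id, nil
	}
	return "", fmt.Errorf("no ID field found in os-release")
}

// programPath builds the path of the pre-compiled program for a host.
// The <distro>/<arch>/<kernel>/ebpf.o scheme is hypothetical.
func programPath(dir, distroID, arch, kernelVersion string) string {
	return path.Join(dir, distroID, arch, kernelVersion, "ebpf.o")
}

func main() {
	// In the real probe this content would come from the host's
	// /etc/os-release, bind-mounted into the container.
	osRelease := "NAME=\"Arch Linux\"\nID=arch\n"
	id, err := parseOSReleaseID(osRelease)
	if err != nil {
		fmt.Println("cannot determine distribution:", err)
		return
	}
	fmt.Println(programPath("/usr/libexec/scope", id, "x86_64", "4.8.11-1"))
}
```

If no file exists at the computed path, the probe would be expected to fall back on /proc parsing, as described in the PR text.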
This patch adds a debug message to show the perf event details.
We now use CGO which requires an arm compiler and the CC environment variable set up. Also, the version of go shipped on debian doesn't have some standard library packages for arm and when it tries to install them it fails because it doesn't have permission to write to /usr/lib. Use -pkgdir if we build for arm so packages are installed in $HOME
The cyclomatic complexity was too high and the CI didn't pass.
To handle non-linux systems.
Force-pushed from a84a81e to c8396aa
This goroutine is unnecessary because the `run()` function uses a goroutine itself. Goroutines are asynchronous, and they can introduce a race between connections detected during the first pass and events from eBPF.
This prevents a race between conntrack information and ebpf. When using eBPF, we do a first pass on `/proc` and with conntrack to get the initial state, then eBPF takes over. To do so we "feed" this information to eBPF so it can pair active connections with future close events. It could happen that a connection is closed after the data from `/proc` and conntrack is populated and before eBPF takes over. In this case, we lose the close event, and we will have a "ghost" process that will never disappear from the report.
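To make the race concrete, here is a minimal sketch of one generic way to avoid losing a close event: start receiving events first, buffer them, then load the initial state and replay the buffer. This is an assumption chosen for illustration and is not claimed to be how this commit solves the problem; all type and function names are hypothetical, and connections are reduced to string keys.

```go
package main

import "fmt"

// event is a simplified eBPF event: an open or a close for one connection.
type event struct {
	closed bool
	conn   string // simplified connection key
}

// tracker buffers events until the initial state has been fed.
type tracker struct {
	ready   bool
	pending []event
	open    map[string]bool
}

func newTracker() *tracker {
	return &tracker{open: map[string]bool{}}
}

// handle is called for every event. Before the initial state is loaded,
// events are only buffered so that none of them is lost.
func (t *tracker) handle(e event) {
	if !t.ready {
		t.pending = append(t.pending, e)
		return
	}
	t.apply(e)
}

func (t *tracker) apply(e event) {
	if e.closed {
		delete(t.open, e.conn)
	} else {
		t.open[e.conn] = true
	}
}

// feedInitial loads the connections found in the first pass and then
// replays the events that arrived in the meantime.
func (t *tracker) feedInitial(conns []string) {
	for _, c := range conns {
		t.open[c] = true
	}
	for _, e := range t.pending {
		t.apply(e)
	}
	t.pending = nil
	t.ready = true
}

func main() {
	tr := newTracker()
	// Connection "a" is closed after the first pass saw it, but before
	// the initial state is fed to the tracker.
	tr.handle(event{closed: true, conn: "a"})
	tr.feedInitial([]string{"a", "b"})
	fmt.Println(len(tr.open), tr.open["a"], tr.open["b"])
}
```

With this ordering the close of "a" is replayed after the initial state is loaded, so "a" does not linger as a ghost entry. The sketch is not safe for concurrent use; a real tracker would need locking.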
Force-pushed from f16ea55 to 5f82411
IMAGE_TAG=$(shell ./tools/image-tag)
EBPF_IMAGE=kinvolk/tcptracer-bpf:semaphore-master-475be63
Superseded by #2135
Based on work from Lorenzo, updated by Iago, Alban, and Alessandro
This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command: `sudo ./scope launch --probe.ebpf.connections=true`
Scope Probe also falls back on the old /proc parsing if eBPF is not working
(e.g. too old kernel, or pre-compiled eBPF program missing for the current kernel).
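The fallback behaviour can be pictured with a small sketch. The names below (`chooseSource`, the `load` callback, `errNoProgram`) are hypothetical and only illustrate the decision; they are not the probe's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// source names the mechanism used to obtain connections.
type source string

const (
	sourceEBPF source = "ebpf"
	sourceProc source = "proc"
)

// errNoProgram is an example failure: no pre-compiled program matches
// the running kernel.
var errNoProgram = errors.New("no pre-compiled eBPF program for this kernel")

// chooseSource returns sourceEBPF only if eBPF tracking was requested
// and the program could be loaded; otherwise it falls back to /proc.
func chooseSource(ebpfEnabled bool, load func() error) source {
	if !ebpfEnabled {
		return sourceProc
	}
	if err := load(); err != nil {
		fmt.Printf("eBPF tracker unavailable, falling back to /proc: %v\n", err)
		return sourceProc
	}
	return sourceEBPF
}

func main() {
	failingLoad := func() error { return errNoProgram }
	fmt.Println(chooseSource(true, failingLoad))
}
```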
This patch allows scope to get notified of every connection event, without relying
on the parsing of /proc/$pid/net/tcp{,6} and /proc/$pid/fd/*, and therefore
improves performance.
We vendor https://github.com/kinvolk/gobpf-elf-loader in Scope to load and
receive information from the ebpf programs. The Scope build fetches the pre-compiled
ebpf programs from https://hub.docker.com/r/kinvolk/tcptracer-bpf/
(see https://github.com/kinvolk/tcptracer-bpf)
At the moment, the eBPF programs are pre-compiled for:
- Arch 4.8.11-1
- fedora-24
- debian-testing
- coreos

The ebpf programs use kprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- inet_csk_accept
- tcp_close

They generate "connect", "accept" and "close" events containing the
connection tuple as well as the pid and netns.
Note: the IPv6 events are not plugged into Scope.
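As a rough picture of what such an event carries, the following sketch models it as a Go struct. The type and field names are assumptions made for this example and do not come from tcptracer-bpf or the Scope code.

```go
package main

import (
	"fmt"
	"net"
)

// eventType distinguishes the three kinds of events.
type eventType int

const (
	eventConnect eventType = iota
	eventAccept
	eventClose
)

func (t eventType) String() string {
	switch t {
	case eventConnect:
		return "connect"
	case eventAccept:
		return "accept"
	case eventClose:
		return "close"
	}
	return "unknown"
}

// tcpEvent holds the connection tuple plus the pid and the network
// namespace of the process that triggered the event.
type tcpEvent struct {
	Type             eventType
	PID              uint32
	SrcAddr, DstAddr net.IP
	SrcPort, DstPort uint16
	NetNS            uint32
}

func (e tcpEvent) String() string {
	return fmt.Sprintf("%s pid=%d %s:%d -> %s:%d netns=%d",
		e.Type, e.PID, e.SrcAddr, e.SrcPort, e.DstAddr, e.DstPort, e.NetNS)
}

// exampleEvent returns a sample IPv4 connect event with made-up values.
func exampleEvent() tcpEvent {
	return tcpEvent{
		Type:    eventConnect,
		PID:     1234,
		SrcAddr: net.IPv4(10, 0, 0, 1),
		SrcPort: 43210,
		DstAddr: net.IPv4(10, 0, 0, 2),
		DstPort: 80,
		NetNS:   4026531957,
	}
}

func main() {
	fmt.Println(exampleEvent())
}
```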
probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.
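The idea of keeping dead connections for one iteration can be sketched like this. It is a simplified illustration with hypothetical names, not the code in probe/endpoint/ebpf.go, and it omits locking.

```go
package main

import "fmt"

// fourTuple identifies a TCP connection.
type fourTuple struct {
	srcAddr, dstAddr string
	srcPort, dstPort uint16
}

// tracker keeps open connections and the connections closed since the
// last walk, so that short-lived connections are reported once.
type tracker struct {
	open   map[fourTuple]uint32 // tuple -> pid
	closed map[fourTuple]uint32 // closed since the last walk
}

func newTracker() *tracker {
	return &tracker{
		open:   map[fourTuple]uint32{},
		closed: map[fourTuple]uint32{},
	}
}

// handleOpen records a "connect" or "accept" event.
func (t *tracker) handleOpen(c fourTuple, pid uint32) {
	t.open[c] = pid
}

// handleClose moves a known connection to the closed set instead of
// dropping it immediately.
func (t *tracker) handleClose(c fourTuple) {
	if pid, ok := t.open[c]; ok {
		delete(t.open, c)
		t.closed[c] = pid
	}
}

// walk calls f for every open connection and for every connection
// closed since the previous walk, then forgets the closed ones.
func (t *tracker) walk(f func(fourTuple, uint32)) {
	for c, pid := range t.open {
		f(c, pid)
	}
	for c, pid := range t.closed {
		f(c, pid)
	}
	t.closed = map[fourTuple]uint32{}
}

func main() {
	tr := newTracker()
	c := fourTuple{srcAddr: "10.0.0.1", dstAddr: "10.0.0.2", srcPort: 43210, dstPort: 80}
	// The connection opens and closes between two reports.
	tr.handleOpen(c, 1234)
	tr.handleClose(c)
	for i := 0; i < 2; i++ {
		n := 0
		tr.walk(func(fourTuple, uint32) { n++ })
		fmt.Printf("report %d: %d connection(s)\n", i+1, n)
	}
}
```

The first walk still reports the connection that opened and closed between two reports; the second walk no longer does.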
The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via newForegroundReader().
NAT resolution on connections from eBPF works in the same way as it did on connections
from /proc: by using conntrack. One of the two conntrack instances was removed since
eBPF detects short-lived connections.
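The Scope Docker image size comparison:
- weaveworks/scope in current master: 22 MB (compressed), 68 MB (uncompressed)
- weaveworks/scope with this patchset: 24 MB (compressed), 70 MB (uncompressed)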
Fixes #1168 (walking /proc to obtain connections is very expensive)
Fixes #1260 (Short-lived connections not tracked for containers in
shared networking namespaces)
Fixes #1962 (Port ebpf tracker to Go)
Fixes #1961 (Remove runtime kernel header dependency from ebpf tracker)
Supersedes #2024