Add eBPF connection tracking without dependencies on kernel headers #2135
dd77777
to
020d967
Compare
Makefile
Outdated
$(SCOPE_EXPORT): $(SCOPE_EXE) $(DOCKER_DISTRIB) docker/weave $(RUNSVINIT) docker/Dockerfile docker/demo.json docker/run-app docker/run-probe docker/entrypoint.sh
docker/ebpf.tgz: Makefile
	$(SUDO) docker pull $(EBPF_IMAGE)
	CONTAINER_ID=$(shell $(SUDO) docker run -d $(EBPF_IMAGE) /bin/false 2>/dev/null || true); $(SUDO) docker export -o docker/ebpf.tgz $${CONTAINER_ID}
Makefile
Outdated
NO_CROSS_COMP=unset GOOS GOARCH
GO_HOST=$(NO_CROSS_COMP); $(GO)
WITH_GO_HOST_ENV=$(NO_CROSS_COMP); $(GO_ENV)
GO_ENV_ARM=$(GO_ENV) CC=/usr/bin/arm-linux-gnueabihf-gcc
Makefile
Outdated
endif

ifeq ($(GOARCH),arm)
GO=env $(GO_ENV_ARM) go
Makefile
Outdated
ifeq ($(GOARCH),arm)
GO=env $(GO_ENV_ARM) go
# The version of go shipped on debian doesn't have some standard library
# packages for arm and when it tries to install them it fails because it
circle.yml
Outdated
@@ -41,7 +41,7 @@ test:
        parallel: true
    - cd $SRCDIR; make RM= client-lint static:
        parallel: true
    - cd $SRCDIR; rm -f prog/scope; if [ "$CIRCLE_NODE_INDEX" = "0" ]; then GOARCH=arm make GO_BUILD_INSTALL_DEPS= RM= prog/scope; else GOOS=darwin make GO_BUILD_INSTALL_DEPS= RM= prog/scope; fi:
    - cd $SRCDIR; rm -f prog/scope; if [ "$CIRCLE_NODE_INDEX" = "0" ]; then GOARCH=arm GOOS=linux make GO_BUILD_INSTALL_DEPS= RM= prog/scope; else GOOS=darwin GOOS=linux make GO_BUILD_INSTALL_DEPS= RM= prog/scope; fi:
docker/Dockerfile
Outdated
@@ -6,6 +6,7 @@ RUN echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >>/etc/apk/reposi
    apk add --update bash runit conntrack-tools iproute2 util-linux curl && \
    rm -rf /var/cache/apk/*
ADD ./docker.tgz /
ADD ./ebpf.tgz /usr/libexec/scope/
@@ -463,14 +463,14 @@ func (c *conntrackWalker) handleFlow(f flow, forceAdd bool) {

// walkFlows calls f with all active flows and flows that have come and gone
// since the last call to walkFlows
func (c *conntrackWalker) walkFlows(f func(flow)) {
func (c *conntrackWalker) walkFlows(f func(flow, bool)) {
We have tested (manually) on Debian, Ubuntu, Fedora, Arch, CoreOS (beta), Amazon Linux. Note that Amazon Linux does not have
We have not tested on GCE yet.
I tried on GCE with the gci image, but it does not work because bpf is not supported. Error message from strace:
This is the information about the gci virtual machine image:

Thanks @alepuccetti. Mind creating a ticket like https://code.google.com/p/google-compute-engine/issues/detail?id=499 to ask them to enable eBPF? Might be worth checking https://cloud.google.com/container-optimized-os/docs/release-notes in case it's already done on the dev channel.
@alban Is it worth asking them to mount it by default?
OK, asked in that thread: https://forums.aws.amazon.com/thread.jspa?messageID=762683
probe/endpoint/ebpf.go
Outdated
var ebpfTracker *EbpfTracker

// nilTracker is a tracker that does nothing, and it implements the eventTracker interface.
// It is returned when the useEbpfConn flag is false.
probe/endpoint/ebpf.go
Outdated
func newEbpfTracker(useEbpfConn bool) eventTracker {
	if !useEbpfConn {
		return &nilTracker{}
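The fallback in this excerpt is an instance of the null-object pattern: callers always get a usable tracker and never have to check for nil. A minimal sketch, assuming a hypothetical two-method `eventTracker` interface and a stand-in `fakeTracker` (the interface in the PR may have a different method set):

```go
package main

import "fmt"

// connection is a hypothetical, simplified connection record.
type connection struct {
	src, dst string
}

// eventTracker is an assumed interface; the method set in the PR may differ.
type eventTracker interface {
	walkConnections(f func(connection))
	stop()
}

// nilTracker does nothing, so callers never need a nil check.
type nilTracker struct{}

func (nilTracker) walkConnections(func(connection)) {}
func (nilTracker) stop()                            {}

// fakeTracker stands in for the real eBPF-backed tracker.
type fakeTracker struct {
	conns []connection
}

func (t *fakeTracker) walkConnections(f func(connection)) {
	for _, c := range t.conns {
		f(c)
	}
}

func (t *fakeTracker) stop() {}

// newTracker returns the no-op tracker when eBPF tracking is disabled.
func newTracker(useEbpfConn bool) eventTracker {
	if !useEbpfConn {
		return nilTracker{}
	}
	return &fakeTracker{conns: []connection{{"10.0.0.1:1234", "10.0.0.2:80"}}}
}

// countConnections works with any tracker, including the no-op one.
func countConnections(t eventTracker) int {
	n := 0
	t.walkConnections(func(connection) { n++ })
	return n
}

func main() {
	fmt.Println(countConnections(newTracker(false)), countConnections(newTracker(true)))
}
```

The same constructor could also return the no-op tracker when loading the eBPF program fails, which keeps the call sites unchanged.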
probe/endpoint/ebpf_linux.go
Outdated
@@ -0,0 +1,7 @@
//+build linux
@@ -0,0 +1,256 @@
package endpoint
probe/endpoint/ebpf.go
Outdated
		return &nilTracker{}
	}
	err = bpfPerfEvent.Load()
	if err != nil {
probe/endpoint/ebpf.go
Outdated
		return &nilTracker{}
	}

	bpfPerfEvent.EnableKprobes()
probe/endpoint/ebpf.go
Outdated
// tcpEvent should be in sync with the struct in the ebpf maps.
type tcpEvent struct {
	// Timestamp must be the first field, the sorting depends on it
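A likely reason for this constraint: perf events arrive on per-CPU buffers and have to be merged in time order, which is simplest when the ordering key sits at a fixed offset in the raw bytes. The sketch below assumes a little-endian 64-bit timestamp in the first 8 bytes of each raw event. The helper names are hypothetical and are not part of gobpf's API.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// timestampOf reads the leading 8 bytes of a raw event as a little-endian uint64.
func timestampOf(raw []byte) uint64 {
	return binary.LittleEndian.Uint64(raw[:8])
}

// sortByTimestamp orders raw events collected from several CPUs by their
// leading timestamp, without decoding the rest of the payload.
func sortByTimestamp(events [][]byte) {
	sort.SliceStable(events, func(i, j int) bool {
		return timestampOf(events[i]) < timestampOf(events[j])
	})
}

// rawEvent builds a fake event: a timestamp followed by one opaque payload byte.
func rawEvent(ts uint64, payload byte) []byte {
	buf := make([]byte, 9)
	binary.LittleEndian.PutUint64(buf, ts)
	buf[8] = payload
	return buf
}

func main() {
	events := [][]byte{rawEvent(30, 'c'), rawEvent(10, 'a'), rawEvent(20, 'b')}
	sortByTimestamp(events)
	for _, e := range events {
		fmt.Printf("%d %c\n", timestampOf(e), e[8])
	}
}
```

If the timestamp moved to another position in the struct, a byte-level comparison like this would silently sort on the wrong field, which is why the comment in the struct matters.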
probe/endpoint/ebpf.go
Outdated
	sport := event.SPort
	dport := event.DPort

	if typ.String() == "close" || typ.String() == "unknown" {
probe/endpoint/ebpf.go
Outdated
type eventType uint32

// These constants should be in sync with the equivalent definitions in the ebpf program.
probe/endpoint/ebpf.go
Outdated
}

func (t *EbpfTracker) run() {
	if err := offsetguess.Guess(t.reader); err != nil {
probe/endpoint/ebpf.go
Outdated
		}
	}()

	pmIPv4, err := bpflib.InitPerfMap(t.reader, "tcp_event_ipv4", channel)
probe/endpoint/ebpf.go
Outdated
}

func (t *EbpfTracker) stop() {
	// TODO: stop the go routine in run()
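One common way to resolve a TODO like this is a quit channel that the goroutine selects on, plus a wait so that `stop()` only returns once the goroutine has exited. The sketch below uses hypothetical names and is not the implementation this PR ended up with.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a hypothetical stand-in for EbpfTracker.
type tracker struct {
	events chan string
	quit   chan struct{}
	done   sync.WaitGroup
	mu     sync.Mutex
	seen   []string
}

func newTracker() *tracker {
	t := &tracker{events: make(chan string), quit: make(chan struct{})}
	t.done.Add(1)
	go t.run()
	return t
}

// run consumes events until stop is called.
func (t *tracker) run() {
	defer t.done.Done()
	for {
		select {
		case ev := <-t.events:
			t.mu.Lock()
			t.seen = append(t.seen, ev)
			t.mu.Unlock()
		case <-t.quit:
			return
		}
	}
}

// stop signals run to return and waits until it has exited.
// It must be called at most once, since closing a closed channel panics.
func (t *tracker) stop() {
	close(t.quit)
	t.done.Wait()
}

func main() {
	t := newTracker()
	t.events <- "connect"
	t.events <- "close"
	t.stop()
	fmt.Println(t.seen)
}
```

Waiting in `stop()` matters here: without it, the caller could tear down the perf maps while the goroutine is still reading from them.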
@@ -38,6 +38,28 @@ func newBackgroundReader(walker process.Walker) *backgroundReader {
	return br
}

func newForegroundReader(walker process.Walker) *backgroundReader {
Also, regarding https://circleci.com/gh/kinvolk/scope/741, please don't print full reports in tests, since they flood the output.
I tried to test this branch locally and I get panics: https://gist.github.com/2opremio/8aece6a2fba726fabc6b944ab2d9ec92
0fc6511
to
3cf87dc
Compare
Rebased on master.
		return err
	}

	releaseParts := releaseRegex.FindStringSubmatch(release)
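Parsing a kernel release string into comparable numbers can be sketched as follows. The regular expression and the function name are assumptions for illustration. The actual `releaseRegex` is not visible in this excerpt.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// releasePattern is a hypothetical pattern: major.minor with an optional patch level.
var releasePattern = regexp.MustCompile(`^(\d+)\.(\d+)(?:\.(\d+))?`)

// parseRelease extracts major, minor and patch from strings such as
// "4.4.0-59-generic". A missing patch level is reported as 0.
func parseRelease(release string) (major, minor, patch int, err error) {
	parts := releasePattern.FindStringSubmatch(release)
	if parts == nil {
		return 0, 0, 0, fmt.Errorf("cannot parse kernel release %q", release)
	}
	// The pattern only matches digits, so conversion errors are limited to overflow.
	if major, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, 0, err
	}
	if minor, err = strconv.Atoi(parts[2]); err != nil {
		return 0, 0, 0, err
	}
	if parts[3] != "" {
		if patch, err = strconv.Atoi(parts[3]); err != nil {
			return 0, 0, 0, err
		}
	}
	return major, minor, patch, nil
}

func main() {
	fmt.Println(parseRelease("4.4.0-59-generic"))
	fmt.Println(parseRelease("4.9-rc1"))
}
```

Distribution suffixes such as `-59-generic` are ignored by this pattern, so it only supports comparisons on the upstream version numbers.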
docker/Dockerfile
Outdated
@@ -0,0 +1,14 @@
FROM alpine:3.5
Note that the app CPU consumption varies a bit but the probe consumption is fairly static: ~7% with ebpf enabled and ~3% without it.
I am seeing a lot of In this case, they are DNS requests and requests to weave net (which is not running):
I truly doubt it's #2253; these are simply requests to ports where no one is listening. In fact, if I try to connect to a port where no one is listening:
I get this in the logs:
Please correct this
This is tracked in weaveworks/tcptracer-bpf#19
Thanks. Should we silence those errors in the meantime? (They are not really errors, after all.)
3cf87dc
to
a545d59
Compare
I think we should do that in the meantime, yes.
@2opremio The connection scanner used to feed the ebpf tracker with the initial state was not stopped correctly. I added commit 915eb69 which fixes the perf issue. @schu tested that it fixes the problem:
Good job, I will test this branch again before the day ends.
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and Michael.

This PR adds connection tracking using eBPF. This feature is not enabled by default. For now, you can enable it by launching scope with the following command:

```
sudo ./scope launch --probe.ebpf.connections=true
```

This patch allows scope to get notified of every connection event, without relying on the parsing of /proc/$pid/net/tcp{,6} and /proc/$pid/fd/*, and therefore improve performance.

We vendor https://github.com/iovisor/gobpf in Scope to load the pre-compiled ebpf program and https://github.com/weaveworks/tcptracer-bpf to guess the offsets of the structures we need in the kernel. In this way we don't need a different pre-compiled ebpf object file per kernel. The pre-compiled ebpf program is included in the vendoring of tcptracer-bpf.

The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close

It generates "connect", "accept" and "close" events containing the connection tuple but also pid and netns. Note: the IPv6 events are not supported in Scope and thus not passed on.

probe/endpoint/ebpf.go maintains the list of connections. Similarly to conntrack, it also keeps the dead connections for one iteration in order to report short-lived connections.

The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still there and still used at start-up because eBPF only brings us the events and not the initial state. However, the /proc parsing for the initial state is now done in foreground instead of background, via newForegroundReader().

NAT resolution on connections from eBPF works in the same way as it did on connections from /proc: by using conntrack. One of the two conntrack instances is only started to get the initial state and then it is stopped since eBPF detects short-lived connections.

The Scope Docker image size comparison:
- weaveworks/scope in current master: 22 MB (compressed), 68 MB (uncompressed)
- weaveworks/scope with this patchset: 23 MB (compressed), 69 MB (uncompressed)

Fixes weaveworks#1168 (walking /proc to obtain connections is very expensive)
Fixes weaveworks#1260 (Short-lived connections not tracked for containers in shared networking namespaces)
Fixes weaveworks#1962 (Port ebpf tracker to Go)
Fixes weaveworks#1961 (Remove runtime kernel header dependency from ebpf tracker)
There's no obvious reason why those tests can only be run on us-central-1, remove the check. It was added with 1577b90
915eb69
to
6d55a34
Compare
PR updated:
Let's see if CircleCI is happy.
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
Michael.
This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command:

    sudo ./scope launch --probe.ebpf.connections=true

This patch allows scope to get notified of every connection event,
without relying on the parsing of /proc/$pid/net/tcp{,6} and
/proc/$pid/fd/*, and therefore improve performance.
We vendor https://github.com/iovisor/gobpf in Scope to load the
pre-compiled ebpf program and https://github.com/kinvolk/tcptracer-bpf
to guess the offsets of the structures we need in the kernel. In this
way we don't need a different pre-compiled ebpf object file per kernel.
The Scope build fetches the pre-compiled ebpf program from
https://hub.docker.com/r/kinvolk/tcptracer-bpf/ (see
https://github.com/kinvolk/tcptracer-bpf). To update to a new version
you can modify the EBPF_IMAGE variable in Makefile.
The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close
It generates "connect", "accept" and "close" events containing the
connection tuple but also pid and netns.
Note: the IPv6 events are not supported in Scope and thus not passed on.
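The event kinds and the IPv6 filtering described above might be modelled as in the sketch below. The constant values, field names and the `forward` helper are assumptions for illustration. In the real code the constants and the struct layout have to mirror the definitions in the eBPF program.

```go
package main

import "fmt"

// eventType identifies what happened to a connection.
type eventType uint32

// Hypothetical values; the real ones must match the eBPF program.
const (
	eventConnect eventType = iota + 1
	eventAccept
	eventClose
)

// String returns the event name used in the description above.
func (e eventType) String() string {
	switch e {
	case eventConnect:
		return "connect"
	case eventAccept:
		return "accept"
	case eventClose:
		return "close"
	default:
		return "unknown"
	}
}

// tcpEvent carries the connection tuple plus pid and network namespace.
type tcpEvent struct {
	Type         eventType
	Pid          uint32
	NetNS        uint32
	SAddr, DAddr string
	SPort, DPort uint16
	IPv6         bool
}

// forward reports whether an event should be passed on; IPv6 events are dropped.
func forward(ev tcpEvent) bool {
	return !ev.IPv6
}

func main() {
	ev := tcpEvent{Type: eventAccept, Pid: 42, NetNS: 4026531957,
		SAddr: "10.0.0.1", DAddr: "10.0.0.2", SPort: 80, DPort: 51000}
	fmt.Println(ev.Type, forward(ev))
}
```

Comparing against the typed constants (for example `ev.Type == eventClose`) rather than against the strings returned by `String()` lets the compiler catch a renamed or removed event kind.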
probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.
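Keeping dead connections for one iteration can be sketched as a tracker with an open set and a buffer of recently closed connections that is cleared on every walk. The type and method names below are hypothetical, and the sketch leaves out the locking that concurrent event delivery would require.

```go
package main

import "fmt"

// tuple is a simplified connection tuple used as a map key.
type tuple struct {
	src, dst string
}

// connTracker keeps open connections plus those closed since the last walk.
type connTracker struct {
	open   map[tuple]struct{}
	closed []tuple
}

func newConnTracker() *connTracker {
	return &connTracker{open: map[tuple]struct{}{}}
}

func (t *connTracker) handleConnect(c tuple) {
	t.open[c] = struct{}{}
}

// handleClose moves a connection to the closed buffer instead of forgetting it.
func (t *connTracker) handleClose(c tuple) {
	if _, ok := t.open[c]; ok {
		delete(t.open, c)
		t.closed = append(t.closed, c)
	}
}

// walk reports open connections, then recently closed ones, and clears the
// closed buffer so that each dead connection is reported exactly once.
func (t *connTracker) walk(f func(c tuple, alive bool)) {
	for c := range t.open {
		f(c, true)
	}
	for _, c := range t.closed {
		f(c, false)
	}
	t.closed = nil
}

func main() {
	t := newConnTracker()
	short := tuple{"10.0.0.1:40000", "10.0.0.2:80"}
	t.handleConnect(short)
	t.handleClose(short) // opened and closed between two reports

	for i := 1; i <= 2; i++ {
		n := 0
		t.walk(func(tuple, bool) { n++ })
		fmt.Printf("walk %d: %d connection(s)\n", i, n)
	}
}
```

A connection that opens and closes between two reports still shows up in the first walk after it closed, and is gone in the next one.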
The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via
newForegroundReader().
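To give an idea of what the initial-state pass has to decode, here is a sketch that parses one IPv4 address field from /proc/$pid/net/tcp. It is an illustration only, not Scope's parser: `parseProcAddr` is a hypothetical helper, and it assumes a little-endian host, where the four address bytes appear reversed in the hex string.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseProcAddr decodes an IPv4 "address:port" field such as "0100007F:1F90".
// Both parts are hexadecimal; the address bytes are reversed on little-endian hosts.
func parseProcAddr(field string) (net.IP, uint16, error) {
	parts := strings.Split(field, ":")
	if len(parts) != 2 {
		return nil, 0, fmt.Errorf("malformed address field %q", field)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil || len(raw) != net.IPv4len {
		return nil, 0, fmt.Errorf("malformed IPv4 address %q", parts[0])
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return nil, 0, fmt.Errorf("malformed port %q", parts[1])
	}
	return net.IPv4(raw[3], raw[2], raw[1], raw[0]), uint16(port), nil
}

func main() {
	ip, port, err := parseProcAddr("0100007F:1F90")
	fmt.Println(ip, port, err)
}
```

Doing this for every socket of every process is the per-report cost that the event-driven approach avoids after start-up.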
NAT resolution on connections from eBPF works in the same way as it did
on connections from /proc: by using conntrack. One of the two conntrack
instances is only started to get the initial state and then it is
stopped since eBPF detects short-lived connections.
The Scope Docker image size comparison:
(uncompressed)
(uncompressed)
Fixes #1168 (walking /proc to obtain connections is very expensive)
Fixes #1260 (Short-lived connections not tracked for containers in
shared networking namespaces)
Fixes #1962 (Port ebpf tracker to Go)
Fixes #1961 (Remove runtime kernel header dependency from ebpf tracker)
This PR is kernel version/configuration independent so it supersedes #2070 which needed a different object file per kernel/kernel config.
Depends on #2068
TODO:
- … --probe.ebpf.connections=true and the ebpf loading fails for whatever reason. We can fix this in a followup PR
- accept() at start time: does not get the "accept" event for accept() syscall started before tcptracer-bpf (tcptracer-bpf#10)