Add eBPF connection tracking without dependencies on kernel headers
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
Michael.

This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command:

```
sudo ./scope launch --probe.ebpf.connections=true
```

This patch allows Scope to be notified of every connection event
without parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/*, which
improves performance.

We vendor https://github.com/iovisor/gobpf in Scope to load the
pre-compiled ebpf program and https://github.com/kinvolk/tcptracer-bpf
to guess the offsets of the structures we need in the kernel. In this
way we don't need a different pre-compiled ebpf object file per kernel.
The Scope build fetches the pre-compiled ebpf program from
https://hub.docker.com/r/kinvolk/tcptracer-bpf/ (see
https://github.com/kinvolk/tcptracer-bpf). To update to a new version
you can modify the EBPF_IMAGE variable in Makefile.

The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close

It generates "connect", "accept" and "close" events containing the
connection tuple as well as the pid and network namespace (netns).
Note: IPv6 events are not supported in Scope and are therefore not passed on.

probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.
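
The "keep dead connections for one iteration" idea can be sketched as follows.
This is a simplified, single-goroutine model rather than the tracker from this
patch: the `conn` and `tracker` types and their methods are hypothetical, the
tuple is reduced to a string key, and locking is omitted.

```go
package main

import "fmt"

// conn is a simplified stand-in for a tracked connection.
type conn struct {
	tuple string
	pid   int
}

// tracker keeps open connections plus the ones closed since the last walk.
type tracker struct {
	open   map[string]conn
	closed []conn
}

func newTracker() *tracker {
	return &tracker{open: map[string]conn{}}
}

// handle applies one event to the tracker state.
func (t *tracker) handle(event string, c conn) {
	switch event {
	case "connect", "accept":
		t.open[c.tuple] = c
	case "close":
		if dead, ok := t.open[c.tuple]; ok {
			delete(t.open, c.tuple)
			// Remember the connection until the next walk reports it.
			t.closed = append(t.closed, dead)
		}
	}
}

// walk reports open connections and recently closed ones, then forgets
// the closed ones so they are reported exactly once.
func (t *tracker) walk(f func(c conn, alive bool)) {
	for _, c := range t.open {
		f(c, true)
	}
	for _, c := range t.closed {
		f(c, false)
	}
	t.closed = t.closed[:0]
}

func main() {
	t := newTracker()
	// A connection that opens and closes between two reports.
	t.handle("connect", conn{tuple: "10.0.0.1:40000-10.0.0.2:80", pid: 1})
	t.handle("close", conn{tuple: "10.0.0.1:40000-10.0.0.2:80", pid: 1})

	count := func() int {
		n := 0
		t.walk(func(conn, bool) { n++ })
		return n
	}
	fmt.Println(count()) // 1: the short-lived connection is still reported
	fmt.Println(count()) // 0: it is gone on the following iteration
}
```

Without the `closed` buffer, a connection that opens and closes between two
reports would never be seen by the walker.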

The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via
newForegroundReader().
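
For context on what the /proc-based path has to parse, the sketch below decodes
a single address field as it appears in /proc/net/tcp. It is an illustration
only, not Scope's parser: `parseProcAddr` is a hypothetical helper, and it
assumes IPv4 and a little-endian host, where the address is printed as the
hexadecimal form of a little-endian 32-bit word.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseProcAddr decodes an "ADDR:PORT" field from /proc/net/tcp,
// for example "0100007F:1F90".
// Assumptions: IPv4 only, little-endian host.
func parseProcAddr(field string) (net.IP, uint16, error) {
	parts := strings.Split(field, ":")
	if len(parts) != 2 {
		return nil, 0, fmt.Errorf("malformed address %q", field)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil {
		return nil, 0, err
	}
	if len(raw) != 4 {
		return nil, 0, fmt.Errorf("not an IPv4 address: %q", parts[0])
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return nil, 0, err
	}
	// The address bytes appear reversed on a little-endian host.
	return net.IPv4(raw[3], raw[2], raw[1], raw[0]), uint16(port), nil
}

func main() {
	ip, port, err := parseProcAddr("0100007F:1F90")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s:%d\n", ip, port) // prints 127.0.0.1:8080
}
```

Doing this for every socket of every process on each iteration is the cost the
eBPF events avoid after start-up.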

NAT resolution on connections from eBPF works in the same way as it did
on connections from /proc: by using conntrack. One of the two conntrack
instances is only started to get the initial state and then it is
stopped since eBPF detects short-lived connections.

The Scope Docker image size comparison:
- weaveworks/scope in current master: 22 MB (compressed), 68 MB (uncompressed)
- weaveworks/scope with this patchset: 23 MB (compressed), 69 MB (uncompressed)

Fixes weaveworks#1168 (walking /proc to obtain connections is very expensive)

Fixes weaveworks#1260 (Short-lived connections not tracked for containers in
shared networking namespaces)

Fixes weaveworks#1962 (Port ebpf tracker to Go)

Fixes weaveworks#1961 (Remove runtime kernel header dependency from ebpf tracker)
iaguis authored and alban committed Mar 5, 2017
1 parent 6abeb5f commit 3f2f4da
Showing 19 changed files with 492 additions and 47 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -45,6 +45,7 @@ prog/scope
docker/scope
docker/docker.tgz
docker/docker
docker/ebpf.tgz
docker/weave
docker/weaveutil
docker/runsvinit
31 changes: 26 additions & 5 deletions Makefile
@@ -24,14 +24,31 @@ RM=--rm
RUN_FLAGS=-ti
BUILD_IN_CONTAINER=true
GO_ENV=GOGC=off
GO=env $(GO_ENV) go
NO_CROSS_COMP=unset GOOS GOARCH
GO_HOST=$(NO_CROSS_COMP); $(GO)
WITH_GO_HOST_ENV=$(NO_CROSS_COMP); $(GO_ENV)
GO_ENV_ARM=$(GO_ENV) CC=/usr/bin/arm-linux-gnueabihf-gcc
GO_BUILD_INSTALL_DEPS=-i
GO_BUILD_TAGS='netgo unsafe'
GO_BUILD_FLAGS=$(GO_BUILD_INSTALL_DEPS) -ldflags "-extldflags \"-static\" -X main.version=$(SCOPE_VERSION) -s -w" -tags $(GO_BUILD_TAGS)

ifeq ($(GOOS),linux)
GO_ENV+=CGO_ENABLED=1
endif

ifeq ($(GOARCH),arm)
GO=env $(GO_ENV_ARM) go
# The version of go shipped on debian doesn't have some standard library
# packages for arm and when it tries to install them it fails because it
# doesn't have permission to write to /usr/lib
# Use -pkgdir if we build for arm so packages are installed in $HOME
GO_BUILD_FLAGS+=-pkgdir ~
else
GO=env $(GO_ENV) go
endif

NO_CROSS_COMP=unset GOOS GOARCH
GO_HOST=$(NO_CROSS_COMP); env $(GO_ENV) go
WITH_GO_HOST_ENV=$(NO_CROSS_COMP); $(GO_ENV)
IMAGE_TAG=$(shell ./tools/image-tag)
EBPF_IMAGE=kinvolk/tcptracer-bpf:master-769adde

all: $(SCOPE_EXPORT)

@@ -59,7 +76,11 @@ docker/docker: $(DOCKER_DISTRIB)

$(CLOUD_AGENT_EXPORT): docker/Dockerfile.cloud-agent docker/$(SCOPE_EXE) docker/docker docker/weave docker/weaveutil

$(SCOPE_EXPORT): docker/Dockerfile.scope $(CLOUD_AGENT_EXPORT) docker/$(RUNSVINIT) docker/demo.json docker/run-app docker/run-probe docker/entrypoint.sh
docker/ebpf.tgz: Makefile
$(SUDO) docker pull $(EBPF_IMAGE)
CONTAINER_ID=$(shell $(SUDO) docker run -d $(EBPF_IMAGE) /bin/false 2>/dev/null || true); $(SUDO) docker export -o docker/ebpf.tgz $${CONTAINER_ID}

$(SCOPE_EXPORT): docker/Dockerfile.scope $(CLOUD_AGENT_EXPORT) docker/$(RUNSVINIT) docker/demo.json docker/run-app docker/run-probe docker/entrypoint.sh docker/ebpf.tgz

$(RUNSVINIT): vendor/runsvinit/*.go

6 changes: 4 additions & 2 deletions backend/Dockerfile
@@ -1,6 +1,8 @@
FROM golang:1.7.4
FROM ubuntu:yakkety
ENV GOPATH /go
ENV PATH /go/bin:/usr/lib/go-1.7/bin:/usr/bin:/bin:/usr/sbin:/sbin
RUN apt-get update && \
apt-get install -y libpcap-dev python-requests time file shellcheck && \
apt-get install -y libpcap-dev python-requests time file shellcheck golang-1.7 git gcc-arm-linux-gnueabihf && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN go clean -i net && \
go install -tags netgo std && \
2 changes: 1 addition & 1 deletion circle.yml
@@ -48,7 +48,7 @@ test:
parallel: true
- cd $SRCDIR; make RM= client-lint static:
parallel: true
- cd $SRCDIR; rm -f prog/scope; if [ "$CIRCLE_NODE_INDEX" = "0" ]; then GOARCH=arm make GO_BUILD_INSTALL_DEPS= RM= prog/scope; else GOOS=darwin make GO_BUILD_INSTALL_DEPS= RM= prog/scope; fi:
- cd $SRCDIR; rm -f prog/scope; if [ "$CIRCLE_NODE_INDEX" = "0" ]; then GOARCH=arm GOOS=linux make GO_BUILD_INSTALL_DEPS= RM= prog/scope; else GOOS=darwin make GO_BUILD_INSTALL_DEPS= RM= prog/scope; fi:
parallel: true
- cd $SRCDIR; rm -f prog/scope; make RM=:
parallel: true
15 changes: 15 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,15 @@
FROM alpine:3.5
MAINTAINER Weaveworks Inc <help@weave.works>
LABEL works.weave.role=system
WORKDIR /home/weave
RUN apk add --update bash runit conntrack-tools iproute2 util-linux curl && \
rm -rf /var/cache/apk/*
ADD ./docker.tgz /
ADD ./ebpf.tgz /usr/libexec/scope/
ADD ./demo.json /
ADD ./weave /usr/bin/
COPY ./scope ./runsvinit ./entrypoint.sh /home/weave/
COPY ./run-app /etc/service/app/run
COPY ./run-probe /etc/service/probe/run
EXPOSE 4040
ENTRYPOINT ["/home/weave/entrypoint.sh"]
12 changes: 6 additions & 6 deletions probe/endpoint/conntrack.go
@@ -65,14 +65,14 @@ type conntrack struct {
// flowWalker is something that maintains flows, and provides an accessor
// method to walk them.
type flowWalker interface {
walkFlows(f func(flow))
walkFlows(f func(flow, bool))
stop()
}

type nilFlowWalker struct{}

func (n nilFlowWalker) stop() {}
func (n nilFlowWalker) walkFlows(f func(flow)) {}
func (n nilFlowWalker) stop() {}
func (n nilFlowWalker) walkFlows(f func(flow, bool)) {}

// conntrackWalker uses the conntrack command to track network connections and
// implement flowWalker.
@@ -463,14 +463,14 @@ func (c *conntrackWalker) handleFlow(f flow, forceAdd bool) {

// walkFlows calls f with all active flows and flows that have come and gone
// since the last call to walkFlows
func (c *conntrackWalker) walkFlows(f func(flow)) {
func (c *conntrackWalker) walkFlows(f func(flow, bool)) {
c.Lock()
defer c.Unlock()
for _, flow := range c.activeFlows {
f(flow)
f(flow, true)
}
for _, flow := range c.bufferedFlows {
f(flow)
f(flow, false)
}
c.bufferedFlows = c.bufferedFlows[:0]
}
256 changes: 256 additions & 0 deletions probe/endpoint/ebpf.go
@@ -0,0 +1,256 @@
package endpoint

import (
"bytes"
"encoding/binary"
"net"
"strconv"
"sync"

log "github.com/Sirupsen/logrus"
bpflib "github.com/iovisor/gobpf/elf"
"github.com/kinvolk/tcptracer-bpf/pkg/byteorder"
"github.com/kinvolk/tcptracer-bpf/pkg/offsetguess"
)

type eventType uint32

// These constants should be in sync with the equivalent definitions in the ebpf program.
const (
_ eventType = iota
EventConnect
EventAccept
EventClose
)

func (e eventType) String() string {
switch e {
case EventConnect:
return "connect"
case EventAccept:
return "accept"
case EventClose:
return "close"
default:
return "unknown"
}
}

// tcpEvent should be in sync with the struct in the ebpf maps.
type tcpEvent struct {
// Timestamp must be the first field, the sorting depends on it
Timestamp uint64

CPU uint64
Type uint32
Pid uint32
Comm [16]byte
SAddr uint32
DAddr uint32
SPort uint16
DPort uint16
NetNS uint32
}

// An ebpfConnection represents a TCP connection
type ebpfConnection struct {
tuple fourTuple
networkNamespace string
incoming bool
pid int
}

type eventTracker interface {
handleConnection(eventType string, tuple fourTuple, pid int, networkNamespace string)
hasDied() bool
run()
walkConnections(f func(ebpfConnection))
initialize()
isInitialized() bool
stop()
}

var ebpfTracker *EbpfTracker

// nilTracker is a tracker that does nothing, and it implements the eventTracker interface.
// It is returned when the useEbpfConn flag is false.
type nilTracker struct{}

func (n nilTracker) handleConnection(_ string, _ fourTuple, _ int, _ string) {}
func (n nilTracker) hasDied() bool { return true }
func (n nilTracker) run() {}
func (n nilTracker) walkConnections(f func(ebpfConnection)) {}
func (n nilTracker) initialize() {}
func (n nilTracker) isInitialized() bool { return false }
func (n nilTracker) stop() {}

// EbpfTracker contains the sets of open and closed TCP connections.
// Closed connections are kept in the `closedConnections` slice for one iteration of `walkConnections`.
type EbpfTracker struct {
sync.Mutex
reader *bpflib.Module
initialized bool
dead bool

openConnections map[string]ebpfConnection
closedConnections []ebpfConnection
}

func newEbpfTracker(useEbpfConn bool) eventTracker {
if !useEbpfConn {
return &nilTracker{}
}

bpfObjectFile, err := findBpfObjectFile()
if err != nil {
log.Errorf("Cannot find BPF object file: %v", err)
return &nilTracker{}
}

bpfPerfEvent := bpflib.NewModule(bpfObjectFile)
if bpfPerfEvent == nil {
return &nilTracker{}
}
err = bpfPerfEvent.Load()
if err != nil {
log.Errorf("Error loading BPF program: %v", err)
return &nilTracker{}
}

bpfPerfEvent.EnableKprobes()

tracker := &EbpfTracker{
openConnections: map[string]ebpfConnection{},
reader: bpfPerfEvent,
}
tracker.run()

ebpfTracker = tracker
return tracker
}

func (t *EbpfTracker) handleConnection(eventType string, tuple fourTuple, pid int, networkNamespace string) {
t.Lock()
defer t.Unlock()
log.Debugf("handleConnection(%v, [%v:%v --> %v:%v], pid=%v, netNS=%v)",
eventType, tuple.fromAddr, tuple.fromPort, tuple.toAddr, tuple.toPort, pid, networkNamespace)

switch eventType {
case "connect":
conn := ebpfConnection{
incoming: false,
tuple: tuple,
pid: pid,
networkNamespace: networkNamespace,
}
t.openConnections[tuple.String()] = conn
case "accept":
conn := ebpfConnection{
incoming: true,
tuple: tuple,
pid: pid,
networkNamespace: networkNamespace,
}
t.openConnections[tuple.String()] = conn
case "close":
if deadConn, ok := t.openConnections[tuple.String()]; ok {
delete(t.openConnections, tuple.String())
t.closedConnections = append(t.closedConnections, deadConn)
} else {
log.Errorf("EbpfTracker error: unmatched close event: %s pid=%d netns=%s", tuple.String(), pid, networkNamespace)
}
}
}

func tcpEventCallback(event tcpEvent) {
var alive bool
typ := eventType(event.Type)
pid := event.Pid & 0xffffffff

saddrbuf := make([]byte, 4)
daddrbuf := make([]byte, 4)

byteorder.Host.PutUint32(saddrbuf, uint32(event.SAddr))
byteorder.Host.PutUint32(daddrbuf, uint32(event.DAddr))

sIP := net.IPv4(saddrbuf[0], saddrbuf[1], saddrbuf[2], saddrbuf[3])
dIP := net.IPv4(daddrbuf[0], daddrbuf[1], daddrbuf[2], daddrbuf[3])

sport := event.SPort
dport := event.DPort

// A connection is alive only after a connect or accept event;
// close and unknown events mark it as not alive.
alive = typ == EventConnect || typ == EventAccept
tuple := fourTuple{sIP.String(), dIP.String(), uint16(sport), uint16(dport), alive}

log.Debugf("tcpEventCallback(%v, [%v:%v --> %v:%v], pid=%v, netNS=%v, cpu=%v, ts=%v)",
typ.String(), tuple.fromAddr, tuple.fromPort, tuple.toAddr, tuple.toPort, pid, event.NetNS, event.CPU, event.Timestamp)
ebpfTracker.handleConnection(typ.String(), tuple, int(pid), strconv.FormatUint(uint64(event.NetNS), 10))
}

// walkConnections calls f with all open connections and connections that have come and gone
// since the last call to walkConnections
func (t *EbpfTracker) walkConnections(f func(ebpfConnection)) {
t.Lock()
defer t.Unlock()

for _, connection := range t.openConnections {
f(connection)
}
for _, connection := range t.closedConnections {
f(connection)
}
t.closedConnections = t.closedConnections[:0]
}

func (t *EbpfTracker) run() {
if err := offsetguess.Guess(t.reader); err != nil {
log.Errorf("%v\n", err)
return
}

channel := make(chan []byte)

go func() {
var event tcpEvent
for {
data := <-channel
err := binary.Read(bytes.NewBuffer(data), byteorder.Host, &event)
if err != nil {
log.Errorf("Failed to decode received data: %s\n", err)
continue
}
tcpEventCallback(event)
}
}()

pmIPv4, err := bpflib.InitPerfMap(t.reader, "tcp_event_ipv4", channel)
if err != nil {
log.Errorf("%v\n", err)
return
}

pmIPv4.PollStart()
}

func (t *EbpfTracker) hasDied() bool {
t.Lock()
defer t.Unlock()

return t.dead
}

func (t *EbpfTracker) initialize() {
t.initialized = true
}

func (t *EbpfTracker) isInitialized() bool {
return t.initialized
}

func (t *EbpfTracker) stop() {
// TODO: stop the go routine in run()
}
7 changes: 7 additions & 0 deletions probe/endpoint/ebpf_linux.go
@@ -0,0 +1,7 @@
//+build linux

package endpoint

func findBpfObjectFile() (string, error) {
return "/usr/libexec/scope/ebpf/ebpf.o", nil
}