Skip to content
This repository has been archived by the owner on Apr 20, 2021. It is now read-only.

Add eBPF connection tracking #2

Closed
wants to merge 7 commits into from
Closed

Add eBPF connection tracking #2

wants to merge 7 commits into from

Conversation

asymmetric
Copy link

@asymmetric asymmetric commented Oct 28, 2016

Introduces connection tracking via eBPF. This allows scope to get notified of every connection event, without relying on the parsing of /proc/net, and therefore:

  • improve performance
  • increase reliability

How it works

...

Limitations:

  • Does not support IPv6
  • Sets procspied: true on connections coming from eBPF
  • Reads 5000 bytes of /proc/net/tcp, whereas the actual file could be larger

@alban
Copy link

alban commented Oct 28, 2016

You should have a better commit message of course, explain the purpose, how it works, limitations ;)

and for the final PR, mentions which bug it fixes: 1168, possibly 1260, etc.

@alban
Copy link

alban commented Oct 28, 2016

Limitations:

  • IPv6
  • add yours here...

#
# tcpv4tracer Trace TCP IPv4 connections.
# For Linux, uses BCC, eBPF. Embedded C.
#
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a reference of similar tools on the bcc repo: iovisor/bcc#762

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of reference do you mean exactly? BTW this whole file, including this section, is going to change with @iaguis's new tracer.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

# tcpv4tracer is similar to https://github.com/iovisor/bcc/pull/762

)

func main() {
tr := endpoint.NewEbpfTracker("/home/asymmetric/code/kinvolk/bcc/examples/tracing/tcpv4tracer.py")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hardcoded path.

This file should be converted to Go tests and add some real tests.

parser = argparse.ArgumentParser(
description="Trace TCP IPv4 connections",
formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument("-p", "--pid",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the --pid option is not used in Scope. Do you want to keep it to keep close to the bcc or to remove it?

#define TCP_EVENT_TYPE_ACCEPT 2
#define TCP_EVENT_TYPE_CLOSE 3

struct tcp_event_t {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we diverge from bcc here. So you could remove --pid too I guess..

return 0;
}


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double empty line

func (n nilTracker) initialize() {}
func (n nilTracker) isInitialized() bool { return false }

// EbpfTracker contains the list of eBPF events, and the eBPF script's command
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the list of eBPF events

It is no longer the list of eBPF events but the set of IPv4 TCP connections.

the eBPF script's command

That comment comes from when the script path was in the struct. It is not true anymore.

}()

reader := bufio.NewReader(stdout)
// skip fist line
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/fist/first

either explain why you skip the first line or just remove that, add a --no-header option to the python script so that it does not print the header.

}

// key is a sortable direction-independent key for tuples, used to look up a
// fourTuple, when you are unsure of it's direction.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/it's/its

initialized bool
dead bool

activeFlows map[string]ebpfConnection
Copy link
Author

@asymmetric asymmetric Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alban I get the point of trying to mirror the conntrack.go module, but would it make more sense to just say activeConnections, walkConnections,... ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds ok to me.

// Walker, and then allows other concurrent readers to Walk that copy.
// CachingWalker is a walker that wraps a platform-specific walker (source),
// triggers it at every Tick() and caches a copy of the output, to allow
// other concurrent readers to Walk that copy.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file belong to another PR: they are unrelated to the ebpf work.

dead bool

activeFlows map[string]ebpfConnection
bufferedFlows []ebpfConnection
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also call this closedConnections, much more descriptive.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alban, what do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Can you add a comment saying that they are recently closed connections and they are not staying forever here but only until the next walkConnections

p.AddTicker(processCache)
// if eBPF tracking is enabled, scan /proc synchronously, and just once
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if eBPF tracking is enabled but does not work (old kernel, or no kernel header or whatever), we need to fall back correctly to the old connection scanner (the asynchronous one)...

@@ -158,6 +164,7 @@ func probeMain(flags probeFlags, targets []appclient.Target) {
SpyProcs: flags.spyProcs,
UseConntrack: flags.useConntrack,
WalkProc: flags.procEnabled,
EbpfEnabled: flags.ebpfEnabled,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could track several things with ebpf: processes, connection, ios...

but this PR is only about tracking connections. We should have a better name for the option than a generic "ebpf", we should mention "connections" somewhere.

Ditto for the "EbpfTracker"

t.fromAddr, t.fromPort, t.toAddr, t.toPort = t.toAddr, t.toPort, t.fromAddr, t.fromPort
}

// reverse flips the direction of a tuple, without side effects
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alban do you think it makes sense to have both the function and the method?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they are used, yes. One is more demanding on the GC than the other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I could use just the method (which I suppose is less demanding on the GC). I only created the function because I think it looks better :)

}

func (t *EbpfTracker) initialize() {
t.initialized = true
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alban do you think it's good practice to wrap this in a mutex?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not seem necessary: t.initialized is only read & written from the reporter thread and not from the ebpf tracker "run()" thread.

@asymmetric asymmetric changed the title eBPF Add eBPF connection tracking Oct 28, 2016
@alban
Copy link

alban commented Oct 28, 2016

Can you add in the PR the list of tests you did:

  • ✅ get initial connections
  • ✅ get connections from NAT (tested with Kubernetes Services)
  • ❌ falling back to the old parsing method when ebpf does not work at all (old kernel)
  • ❌ falling back to the old parsing method when ebpf stops working (python script crashed)
  • ❌ IPv6
  • ❓ connection between 2 containers using the same network namespace
  • ✅ displaying connection in the container view when both the source and the destination are pods with multiple containers

@alepuccetti
Copy link

We are making the official one on scope, don't need this anymore

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants