We need a way to capture short lived connections. #356

Closed
tomwilkie opened this issue Aug 13, 2015 · 14 comments · Fixed by #381

@tomwilkie
Contributor

Conntrack will do it, but we can't map it back to a pid (at least on the client side; for servers we could look at the output of lsof)

@tomwilkie
Contributor Author

There may be a way:

https://github.com/mk-fg/conntrack-logger

@tomwilkie
Contributor Author

Nope, that Python project is just consulting /proc/net/tcp; their approach won't work with containers, and will only capture slightly more connections than we already do.
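
For context, here is a minimal sketch of what reading /proc/net/tcp yields. It is not taken from conntrack-logger or from this project: the function names are made up for illustration, it handles IPv4 only, and it assumes a little-endian host, where the address bytes appear reversed in the hex field. Each read is a snapshot, so a connection that opens and closes between two reads never shows up.

```python
import socket
import struct


def parse_hex_endpoint(field):
    """Decode an 'ADDR:PORT' hex field from /proc/net/tcp (IPv4 only)."""
    addr_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so the address bytes are reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Return (local, remote, state, inode) for each socket row."""
    rows = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        parts = line.split()
        if len(parts) < 10:
            continue  # skip blank or truncated rows
        rows.append((
            parse_hex_endpoint(parts[1]),  # local address and port
            parse_hex_endpoint(parts[2]),  # remote address and port
            parts[3],                      # TCP state as a hex code
            int(parts[9]),                 # socket inode
        ))
    return rows
```

The inode in the last column is what ties a row back to a process, but only through a separate walk of each process's fd directory.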

@tomwilkie tomwilkie reopened this Aug 13, 2015
@tomwilkie
Contributor Author

netstat seems to do it by walking all the entries in proc:

http://sourceforge.net/p/net-tools/code/ci/master/tree/netstat.c
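
A rough sketch of that walk, in Python rather than netstat's C: every /proc/<pid>/fd/<n> symlink that points at socket:[inode] ties a socket inode to a pid. The function name and the proc_root parameter are assumptions, added so the example can also be run against a test directory.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inodes_by_pid(proc_root="/proc"):
    """Walk <proc_root>/<pid>/fd and map each socket inode to its pid."""
    inode_to_pid = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed while we were walking
            match = SOCKET_LINK.match(target)
            if match:
                inode_to_pid[int(match.group(1))] = int(entry)
    return inode_to_pid
```

The cost grows with the number of processes times the number of open fds, which is why a walk like this cannot be repeated very often.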

@tomwilkie
Contributor Author

I think we're going to end up using ftrace on the connect system call: https://www.kernel.org/doc/Documentation/trace/ftrace.txt

@tomwilkie tomwilkie self-assigned this Aug 19, 2015
@tomwilkie
Contributor Author

More info:

Summary: I'm quite confident ftrace + eBPF will work well, but I think support for this was only merged in kernel 4.1. Kernel 4.1 will most likely ship in Ubuntu 15.10, so it seems like a dead end for at least the next ~6 months.

2 ideas left (1 new one):

  • combine ftrace & conntrack - when ftrace sees a connect syscall it will know pid and fd; look this up in proc to get src ip and port, then use the asynchronously delivered conntrack info for the dst ip and port
  • forget about pid, and do a foreign key join through ip address for the containers view
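
The second idea can be sketched as a plain lookup. This is illustrative only: the tuple layout and the container_by_ip map are hypothetical, not the project's data model. It also assumes each container has its own IP address; containers sharing the host's network namespace could not be told apart this way.

```python
def container_edges(flows, container_by_ip):
    """Join conntrack-style flows to containers on IP address alone.

    flows: iterable of (src_ip, src_port, dst_ip, dst_port) tuples.
    container_by_ip: dict mapping a container's IP to its name (hypothetical).
    Returns the set of (source, destination) pairs; an endpoint with no
    matching container is reported by its bare IP.
    """
    edges = set()
    for src_ip, _src_port, dst_ip, _dst_port in flows:
        src = container_by_ip.get(src_ip, src_ip)
        dst = container_by_ip.get(dst_ip, dst_ip)
        edges.add((src, dst))
    return edges
```

No pid is involved, so the result can only feed a per-container view, not a per-process one.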

@tomwilkie
Contributor Author

Status update: got an ftrace prototype (#381), but the short answer is it isn't going to work. It can't get the local addr and port number quickly enough from procfs:

Failed to get local addr for pid=708 fd=6: Fd 6 not found for proc 708
Failed to get local addr for pid=764 fd=6: Fd 6 not found for proc 764
Failed to get local addr for pid=708 fd=6: Fd 6 not found for proc 708
Failed to get local addr for pid=745 fd=6: readlink /proc/745/fd/6: no such file or directory
Failed to get local addr for pid=745 fd=8: Fd 8 not found for proc 745
Failed to get local addr for pid=764 fd=6: readlink /proc/764/fd/6: no such file or directory
Failed to get local addr for pid=708 fd=6: Fd 6 not found for proc 708
Failed to get local addr for pid=745 fd=6: Fd 6 not found for proc 745
Failed to get local addr for pid=764 fd=6: readlink /proc/764/fd/6: no such file or directory
Failed to get local addr for pid=745 fd=6: Fd 6 not found for proc 745
Failed to get local addr for pid=745 fd=6: readlink /proc/745/fd/6: no such file or directory
Failed to get local addr for pid=764 fd=6: readlink /proc/764/fd/6: no such file or directory
Failed to get local addr for pid=708 fd=6: readlink /proc/708/fd/6: no such file or directory
Failed to get local addr for pid=708 fd=8: Fd 8 not found for proc 708
Failed to get local addr for pid=746 fd=6: Fd 6 not found for proc 746
Failed to get local addr for pid=746 fd=6: Fd 6 not found for proc 746
Failed to get local addr for pid=1179 fd=6: readlink /proc/1179/fd/6: no such file or directory
Failed to get local addr for pid=1179 fd=6: readlink /proc/1179/fd/6: no such file or directory
Failed to get local addr for pid=764 fd=6: Fd 6 not found for proc 764
Failed to get local addr for pid=708 fd=6: Fd 6 not found for proc 708
Failed to get local addr for pid=745 fd=6: Fd 6 not found for proc 745
Failed to get local addr for pid=746 fd=6: Fd 6 not found for proc 746
Failed to get local addr for pid=746 fd=6: Fd 6 not found for proc 746
Failed to get local addr for pid=764 fd=6: readlink /proc/764/fd/6: no such file or directory
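
The kind of lookup that is failing here is roughly the following, sketched in Python with hypothetical names (the prototype in #381 is not reproduced). Presumably, by the time the trace event is handled, a short-lived connection's fd has already been closed, so the readlink finds nothing.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode_for_fd(pid, fd, proc_root="/proc"):
    """Return the socket inode behind <proc_root>/<pid>/fd/<fd>, or None."""
    path = os.path.join(proc_root, str(pid), "fd", str(fd))
    try:
        target = os.readlink(path)
    except OSError:
        return None  # fd already closed, or the process has exited
    match = SOCKET_LINK.match(target)
    # A non-socket fd (file, pipe, ...) has no socket inode to report.
    return int(match.group(1)) if match else None
```

Even when the lookup succeeds, the inode still has to be matched against /proc/net/tcp to get the local address and port, which widens the window for the race.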

@tomwilkie
Contributor Author

Looks like option (2) is currently the only candidate left - @peterbourgon?

@inercia
Contributor

inercia commented Aug 24, 2015

@tomwilkie What about using some conntrack library in Go like this one?

@tomwilkie
Contributor Author

@inercia check out PR #386 - we are using exactly that library.

The problem is that the library doesn't tell us the pid of the process taking part in the connection, so we're having to do some extra work.

@inercia
Contributor

inercia commented Aug 24, 2015

@tomwilkie oh, I see.. and can't we obtain the PID from somewhere in /proc? /proc/<pid>/fd/<n> maps <pid:fd> to ports, so we could walk the whole /proc/<pid>s, build a map and correlate it with the conntrack info, right? Probably you already know all of this, or you even have a better solution in mind...

@tomwilkie
Contributor Author

@inercia That's exactly what procspy does; the problem is that the walk is quite expensive, so we can't do it more often than every few seconds, and we miss particularly short connections. This ticket is/was about finding a different way of doing it, to capture those short-lived connections.
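
To make the trade-off concrete, here is a toy model of a poller that samples every few seconds: a connection is seen only if it happens to be open at a poll instant. The function name and all the numbers are made up for illustration.

```python
def observed_by_polling(connections, interval, duration):
    """Return the connections that are open at one or more poll instants.

    connections: list of (start, end) times in seconds.
    Polls happen at t = 0, interval, 2 * interval, ... up to duration.
    """
    polls = [i * interval for i in range(int(duration // interval) + 1)]
    return [
        (start, end)
        for (start, end) in connections
        if any(start <= p <= end for p in polls)
    ]


# Three connections, polled every 3 seconds over a 10 second window.
connections = [(0.5, 0.6), (2.0, 4.0), (7.0, 7.2)]
seen = observed_by_polling(connections, interval=3, duration=10)
```

With these numbers only the 2 second connection is seen; the two that last a fraction of a second fall between polls.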

@inercia
Contributor

inercia commented Aug 24, 2015

oh, it seems I'm quite good at stating the obvious... do you need any help with any of this?
