Add systemd-resolved listener #582

Merged


calesanz
Contributor

Thanks for the great software. I recently rediscovered opensnitch and I am very happy with it.

For a while now I have been using systemd-resolved combined with DNS over TLS (DoT).
When using DoT or DoH, opensnitch cannot intercept the DNS packets. Therefore the UI always shows IP addresses instead of hostnames.

To fix this issue I created a watch on the systemd-resolved debug logs. The hostnames are parsed from this log file.
It is definitely not the most elegant solution to this problem, but it worked for me.

The debug log must be enabled manually using the following command:
sudo resolvectl log-level debug

I would be very happy for feedback or alternative solutions to the problem.

Probably this functionality should be optional because it might use more resources and it adds a delay before the popup message.
Also the systemd library used requires cgo.

Implementation Details
Systemd-resolved writes the dns resolution messages into its debug log. The messages can be found using the following command.
journalctl -u systemd-resolved.service | grep 'Sending message'

The messages are in the following format:

Dec 29 16:03:31 notebook systemd-resolved[1234]: varlink-21: Sending message: {"parameters":{"addresses":[{"ifindex":3,"family":2,"address":[185,199,109,133]},{"ifindex":3,"family":2,"address":[185,199,110,133]},{"ifindex":3,"family":2,"address":[185,199,111,133]},{"ifindex":3,"family":2,"address":[185,199,108,133]},{"ifindex":3,"family":10,"address":[38,6,80,192,128,3,0,0,0,0,0,0,0,0,1,84]},{"ifindex":3,"family":10,"address":[38,6,80,192,128,0,0,0,0,0,0,0,0,0,1,84]},{"ifindex":3,"family":10,"address":[38,6,80,192,128,1,0,0,0,0,0,0,0,0,1,84]},{"ifindex":3,"family":10,"address":[38,6,80,192,128,2,0,0,0,0,0,0,0,0,1,84]}],"name":"avatars.githubusercontent.com","flags":123123}}
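A minimal sketch of how such a line could be turned into a hostname plus IP list. It assumes the JSON payload always follows the text "Sending message: " and only models the fields visible in the sample above; the type and function names (resolvedReply, parseResolvedLine) are made up for illustration and are not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strings"
)

// resolvedReply mirrors only the fields visible in the sample log line.
// Hypothetical type, not part of opensnitch.
type resolvedReply struct {
	Parameters struct {
		Addresses []struct {
			Ifindex int   `json:"ifindex"`
			Family  int   `json:"family"`
			Address []int `json:"address"` // one number per address byte
		} `json:"addresses"`
		Name string `json:"name"`
	} `json:"parameters"`
}

// Assumption: the JSON payload starts right after this text.
const marker = "Sending message: "

// parseResolvedLine extracts the hostname and its IPs from one journal line.
func parseResolvedLine(line string) (string, []net.IP, error) {
	i := strings.Index(line, marker)
	if i < 0 {
		return "", nil, fmt.Errorf("no %q marker in line", marker)
	}
	var msg resolvedReply
	if err := json.Unmarshal([]byte(line[i+len(marker):]), &msg); err != nil {
		return "", nil, err
	}
	var ips []net.IP
	for _, a := range msg.Parameters.Addresses {
		// 4 bytes for family 2 (AF_INET), 16 bytes for family 10 (AF_INET6).
		if len(a.Address) != net.IPv4len && len(a.Address) != net.IPv6len {
			continue
		}
		ip := make(net.IP, len(a.Address))
		for j, b := range a.Address {
			ip[j] = byte(b)
		}
		ips = append(ips, ip)
	}
	return msg.Parameters.Name, ips, nil
}

func main() {
	line := `systemd-resolved[1234]: varlink-21: Sending message: {"parameters":{"addresses":[{"ifindex":3,"family":2,"address":[185,199,109,133]}],"name":"avatars.githubusercontent.com","flags":123123}}`
	name, ips, err := parseResolvedLine(line)
	fmt.Println(name, ips, err)
}
```

The address bytes are decoded as a slice of ints because encoding/json expects a base64 string, not a number array, when the target is a byte slice.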

The delay in the acceptOrDeny function is necessary because the actual packet arrives earlier than the dns message.

Possible Alternatives
I considered the following alternatives.

At first I was thinking about hooking into nsswitch. As far as I understand it this is not possible without hooking into glibc (using LD_PRELOAD).

The other option would be to use the systemd-resolved stub resolver. Using this option systemd-resolved listens on 127.0.0.53 and serves answers from this address.
This does not work for processes which use the resolved library directly, which on my system is the majority. I was able to test it using nslookup.
With nslookup, an observable DNS request is made to 127.0.0.53. This type of request already works with the current opensnitch behaviour, as it is an unencrypted DNS message.

As far as I could see systemd-resolved does not provide an interface to intercept the DNS responses.
Maybe some messages could be intercepted by using a DBUS inspector.

When sending a USR1 signal to systemd-resolved it dumps the last resolved messages into the log.
The problem with this approach is that the following actions must be run between the arrival of the packet and the user popup.

sudo kill -USR1 $(pidof systemd-resolved)
journalctl -u systemd-resolved --output=json | grep 'IN A'

@gustavo-iniguez-goya
Collaborator

Hi @calesanz !

Thank you for investigating this problem and actually providing a solution!

The delay in the acceptOrDeny function is necessary because the actual packet arrives earlier than the dns message.

This is a problem. The delay will be valid depending on the load of the system, amount of network traffic, etc. Sometimes 400ms will be enough, sometimes not. And it'll affect all connections, instead of only domain name resolutions.

I've tried monitoring all dbus events with this example, but it hasn't intercepted systemd-resolved calls.

At first I was thinking about hooking into nsswitch. As far as I understand it this is not possible without hooking into glibc (using LD_PRELOAD).

What if instead of using LD_PRELOAD, we hook gethost* using uprobes? https://github.com/iovisor/bpftrace/blob/master/tools/gethostlatency.bt

However, I've briefly analyzed firefox, and it connects to systemd-resolved via dbus/unix sockets, so this approach wouldn't be valid. But maybe we can find a function that we can hook using uprobes.

@calesanz
Contributor Author

calesanz commented Jan 7, 2022

I'll look into the uprobes hooks and update the PR.

@gustavo-iniguez-goya
Collaborator

Here are some examples that use uprobes: https://github.com/redcanaryco/redcanary-ebpf-sensor/tree/main/src

@calesanz
Contributor Author

calesanz commented Jan 9, 2022

Hi @gustavo-iniguez-goya

I have created a poc using ebpf. It hooks getaddrinfo, gethostbyname and gethostbyname2.

It works for firefox and chromium based browsers. It also works for local hostnames based on nsswitch.conf (/etc/hosts)

Some feedback on the code would be greatly appreciated. I will have to do more testing (using it for a few days).

There is still the race condition between the ebpf event and the actual first network packet. Maybe the delay must be introduced again.
Possibly it suffices to update the structure shortly before showing the UI.

@gustavo-iniguez-goya
Collaborator

gustavo-iniguez-goya commented Jan 10, 2022

Tested and at first glance it works well! now domain names are tracked, even configuring systemd-resolved with DnsOverTLS=yes .

Some notes:

  • I'd appreciate some enums to identify some numbers, like:
    ebpfhook.go -> data.type = 1;

    On the other hand, that field doesn't seem to be used in the golang code (field Mtype), what was it for?

  • If you enable in Firefox -> about:Preferences -> General -> Network settings -> [x] Enable DNS over HTTPS, then it doesn't intercept the domains.

The only problem is that embedding the ebpf code in go makes us depend on libbcc, which is not available in some distributions that users still use. That's why we compile and distribute the ebpf code as a separate module, to load it dynamically. See: https://github.com/evilsocket/opensnitch/tree/master/ebpf_prog

Maybe this is the leak you mentioned in the code? (just after launching the daemon, compiled with -race)

getaddrinfo fatal error: checkptr: pointer arithmetic computed bad pointer value

goroutine 93 [running]:
runtime.throw({0xc98b6d, 0x47f179})
	/usr/lib/go-1.17/src/runtime/panic.go:1198 +0x71 fp=0xc001868b08 sp=0xc001868ad8 pc=0x44a851
runtime.checkptrArithmetic(0xa, {0x0, 0x3, 0x2})
	/usr/lib/go-1.17/src/runtime/checkptr.go:52 +0xbb fp=0xc001868b38 sp=0xc001868b08 pc=0x41c27b
github.com/iovisor/gobpf/bcc.bpfOpenPerfBuffer.func1(0x1, 0x4807c5, 0x0)
	/home/ga/go/pkg/mod/github.com/iovisor/gobpf@v0.2.0/bcc/perf.go:216 +0x85 fp=0xc001868bc0 sp=0xc001868b38 pc=0x640f25
github.com/iovisor/gobpf/bcc.bpfOpenPerfBuffer(0x0, 0xc0018fced0, 0x8)
	/home/ga/go/pkg/mod/github.com/iovisor/gobpf@v0.2.0/bcc/perf.go:217 +0x58 fp=0xc001868c38 sp=0xc001868bc0 pc=0x640d38
github.com/iovisor/gobpf/bcc.InitPerfMapWithPageCnt(0xc0018fa840, 0xc000095020, 0x0, 0xc7d410)
	/home/ga/go/pkg/mod/github.com/iovisor/gobpf@v0.2.0/bcc/perf.go:155 +0x485 fp=0xc001868d90 sp=0xc001868c38 pc=0x640305
github.com/iovisor/gobpf/bcc.InitPerfMap(...)
	/home/ga/go/pkg/mod/github.com/iovisor/gobpf@v0.2.0/bcc/perf.go:124
github.com/evilsocket/opensnitch/daemon/dns.DnsListenerEbpf()
	/tmp/xxx/5822/opensnitch/daemon/dns/ebpfhook.go:261 +0x8b7 fp=0xc001868fa0 sp=0xc001868d90 pc=0x6f26f7

@calesanz
Contributor Author

Thanks for the feedback:

The data.type field was used during debugging; I am happy to remove it.

It appears that Firefox has implemented DoH themselves. It might be necessary to hook a function in Firefox to capture the query.
https://hg.mozilla.org/mozilla-central/file/tip/netwerk/dns/ODoH.h

I have not yet found the best location to capture the messages.

I did not know about the -race flag. What a great feature! I'll have to test that out.
I have to further debug the code. I noticed that after hibernation sometimes the opensnitch popup shows addresses instead of hostnames.

@gustavo-iniguez-goya
Collaborator

Thank you @calesanz , yes please, remove all the debug messages (like that fprint over there). log.Debug() is fine.

All in all, despite not working with Firefox or other quirks, I think it is really good to have this feature. We can expand it in the future for other tasks.

The important thing is to deploy it as a module, to not depend on libbcc.

(I'll be a bit away from the computer the following days)

@calesanz
Contributor Author

calesanz commented Jan 25, 2022

I have struggled quite a lot converting the bcc code to plain ebpf.

The communication with usermode and the "verification" of the ebpf programs works so far.
I still have some issues to fix; I updated the PR to inform you about the progress.

  • IPv6 addresses not resolved correctly
  • gethostbyname dumps some memory instead of the actual hostname (ip works fine)
  • squash commits

@gustavo-iniguez-goya
Collaborator

Thank you @calesanz !! ❤️

It looks really nice. I'll test it soon.

@gustavo-iniguez-goya
Collaborator

Now that v1.5.0 is out, I'll test this PR in a couple of days. While testing v1.5.0 on OpenSuse, I realized that if you allow the nscd daemon to resolve domains, then some apps access the internet without asking:

opensuse-nscd-issue.mp4

As far as I can tell, ping connects to a local unix socket at /var/run/nscd/socket, delegating the domain resolution to nscd:

sfd = connect(AF_UNIX, "/var/run/nscd/socket");
sendto(sfd, "www.duckduckgo.com");

Apparently nscd uses gethostbyname, so maybe this PR will also help in this case 🤞

Anyway, I cannot explain why we don't detect that ping is trying to access the internet and instead detect that the app making the request is nscd. I guess that we should detect it regardless of who resolves the domain 🤔

@gustavo-iniguez-goya
Collaborator

tested @calesanz , everything working as expected: now, domain names resolved via systemd-resolved configured with DNSSEC=yes and DNSOverTLS=yes are detected and reported, instead of showing only the IP.

One minor thing regarding the location of libc (var libcFile = "/lib/libc.so.6"):

  • On rpm based systems (OpenSuse/SuSe, CentOS, Oracle, Fedora) it's installed at /lib64/libc.so.6
  • On Debian and Ubuntu (and others I guess) x86_64: /lib/x86_64-linux-gnu/libc.so.6
  • On Debian i386: /lib/i386-linux-gnu/libc.so.6

Could you test these locations in turn until one is found, and fail otherwise?

@calesanz
Contributor Author

calesanz commented Feb 8, 2022

I think I found a better way to handle the libc.so path resolution.
The new code looks at the currently imported libs and searches for libc.so and uses this path.

What do you think of it?

@gustavo-iniguez-goya
Collaborator

much better, it works fine. I compiled it for arm64 and it detected the libc.so.6 path correctly (/lib/aarch64-linux-gnu/libc.so.6).
On this system though (and armhf I guess), it doesn't get the symbols:

found /lib/aarch64-linux-gnu/libc.so.6
[2022-02-09 10:23:16]  WAR  EBPF-DNS: Failed to find symbol for uprobe uretprobe/gethostbyname : no symbol section
[2022-02-09 10:23:16]  WAR  EBPF-DNS: Failed to find symbol for uprobe uprobe/getaddrinfo : no symbol section
[2022-02-09 10:23:16]  WAR  EBPF-DNS: Failed to find symbol for uprobe uretprobe/getaddrinfo : no symbol section
$ nm -gD /lib/aarch64-linux-gnu/libc.so.6 | grep -E "(gethostbyname|getaddrinfo)"
00000000000c2a50 T getaddrinfo@@GLIBC_2.17
00000000000e72a0 T gethostbyname@@GLIBC_2.17
00000000000e74e0 T gethostbyname2@@GLIBC_2.17
00000000000e7720 T gethostbyname2_r@@GLIBC_2.17
00000000000e7c00 T gethostbyname_r@@GLIBC_2.17

$ uname -r
5.8.0-50-generic
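The "no symbol section" text matches the message of ErrNoSymbols in Go's debug/elf package, which File.Symbols returns when a library has no .symtab. That suggests, though it is only a guess about the loader used here, that the lookup reads .symtab while nm -D reads .dynsym. A minimal sketch of reading the dynamic table instead (dynSymValue is a hypothetical helper):

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// dynSymValue returns the value (the address column printed by nm -D) of a
// defined symbol from the ELF dynamic symbol table (.dynsym).
func dynSymValue(path, name string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	// Symbols() reads .symtab, which stripped distro libraries often lack;
	// DynamicSymbols() reads .dynsym, which a shared library must keep.
	syms, err := f.DynamicSymbols()
	if err != nil {
		return 0, err
	}
	for _, s := range syms {
		// Skip undefined entries, i.e. symbols imported from elsewhere.
		if s.Name == name && s.Section != elf.SHN_UNDEF {
			return s.Value, nil
		}
	}
	return 0, fmt.Errorf("symbol %q not found in %s", name, path)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: dynsym <path-to-libc> <symbol>")
		return
	}
	v, err := dynSymValue(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("%s = %#x\n", os.Args[2], v)
}
```

Symbol names in .dynsym carry no version suffix, so the lookup uses "getaddrinfo" rather than "getaddrinfo@@GLIBC_2.17"; if several versions of a symbol exist, this sketch returns the first one. Turning the value into a uprobe file offset is a separate step that is not shown.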

👉 In this situation, if the uprobes can't be added we should exit from the goroutine before entering the for loop, since we're not going to obtain DNS data from the channel.

On the other hand, it would be better to handle this error so the daemon exits successfully (in case systemctl or other tools detect it as a failure when stopping the service and try to respawn it):

panic: runtime error: slice bounds out of range [:4] with capacity 0

goroutine 113 [running]:
github.com/evilsocket/opensnitch/daemon/dns.DnsListenerEbpf.func1(0x40014756e0)
	/opensnitch/daemon/dns/ebpfhook.go:149 +0x440
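The panic message indicates that a slice with capacity 0 was sliced to [:4]. Without seeing ebpfhook.go:149 this is only a guess, but an empty event (for example one received while the channel is being torn down on shutdown) would produce exactly that. A length check before slicing turns it into an ordinary error; ipFromEvent and errShortEvent below are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

var errShortEvent = errors.New("event payload shorter than the address length")

// ipFromEvent copies an address out of raw event bytes, checking the length
// first so that a short or empty payload yields an error instead of a panic.
func ipFromEvent(raw []byte, ipv6 bool) (net.IP, error) {
	n := net.IPv4len
	if ipv6 {
		n = net.IPv6len
	}
	if len(raw) < n {
		return nil, errShortEvent
	}
	ip := make(net.IP, n)
	copy(ip, raw[:n])
	return ip, nil
}

func main() {
	// An empty payload is reported as an error rather than crashing.
	fmt.Println(ipFromEvent(nil, false))
	fmt.Println(ipFromEvent([]byte{185, 199, 109, 133}, false))
}
```

The caller can then log the error at debug level and leave the loop cleanly.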

@gustavo-iniguez-goya
Collaborator

@calesanz I don't know if you have more things to add or to fix, but for me it looks good.

When using DoT or DoH opensnitch cannot intercept the dns packets.
Therefore the UI always shows IP addresses instead of hostnames. To fix
this issue an ebpf (uprobe) filter was created to hook getaddrinfo and
gethostbyname calls.

In order to be independent of libbcc an additional module was added to
ebpf_prog. Without libbcc the libc function offsets must be resolved
manually. In order to find the loaded glibc version some cgo code was
added.
@calesanz
Contributor Author

I have nothing to add. I just squashed the commits to have a cleaner commit log.

Unless you have additional requirements regarding tests & documentation I think it is ready.

@gustavo-iniguez-goya gustavo-iniguez-goya merged commit a4b7f57 into evilsocket:master Feb 15, 2022
@gustavo-iniguez-goya
Collaborator

Thank you @calesanz ❤️
