
eBPF tracking causes panic on Ubuntu kernel 4.4.0-119 #3131

Closed
leth opened this issue Apr 5, 2018 · 15 comments
Labels
bug Broken end user or developer functionality; not working as the developers intended it

Comments

@leth
Contributor

leth commented Apr 5, 2018

Full kernel version: 4.4.0-119-generic #143-Ubuntu SMP
We were previously running 4.4.0-116-generic #140-Ubuntu SMP with no problems

Workaround:

Disable eBPF connection tracking with --probe.ebpf.connections=false

Panic details

panic.txt

@leth leth added the bug Broken end user or developer functionality; not working as the developers intended it label Apr 5, 2018
@dlespiau
Contributor

dlespiau commented Apr 5, 2018

It'd be interesting to also have the last known-working version, so one can look at the list of commits that went in between.

@rade
Member

rade commented Apr 5, 2018

@alban @iaguis any clues what might be going wrong here?

@alban
Contributor

alban commented Apr 5, 2018

Summary of the stack:

- bpf_map_lookup_elem
- bpf_prog_run
- fd_install
- do_sys_open

So the BPF program tries to read an eBPF map because of an open() syscall in the Scope process:
https://github.com/weaveworks/tcptracer-bpf/blob/fecaba5/tcptracer-bpf.c#L873

	exists = bpf_map_lookup_elem(&fdinstall_pids, &tgid);

The message "BUG: unable to handle kernel paging request at 0000000063222428" might indicate an invalid pointer in the bpf_map_lookup_elem() function:
https://github.com/torvalds/linux/blob/v4.4/kernel/bpf/helpers.c#L41

	value = map->ops->map_lookup_elem(map, key);

Maybe map or map->ops is an invalid pointer? No idea why that would happen right now... is it 100% reproducible?

Given the timestamps, it seems to happen very soon after the machine boots, so probably during the initialization of Scope.

@rade
Member

rade commented Apr 5, 2018

Given the timestamps, it seems to happen very soon after the machine boots, so probably during the initialization of Scope.

Yes. We've had a couple of machines in a k8s cluster in reboot loops since updating to the 4.4.0-119-generic #143-Ubuntu SMP kernel.

@alban
Contributor

alban commented Apr 5, 2018

I see one patch on bpf maps in that new kernel (found from here):
https://pastebin.ubuntu.com/p/y58KDXwTMn/

Maybe it's a kernel patch that was badly backported to the Ubuntu 4.4 kernel?

@JuneZhao

JuneZhao commented Apr 9, 2018

[ 263.736006] Modules linked in: xt_nat xt_recent ipt_REJECT nf_reject_ipv4 xt_mark binfmt_misc xt_comment ebtable_nat ebtables xt_REDIRECT nf_nat_redirect xt_tcpudp iptable_security ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc overlay input_leds i2c_piix4 hv_balloon 8250_fintek joydev serio_raw mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic hv_netvsc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel hid_hyperv hid hv_storvsc hv_utils ptp pps_core hyperv_keyboard scsi_transport_fc aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd hyperv_fb psmouse pata_acpi floppy hv_vmbus fjes
[ 263.736006] CPU: 0 PID: 6309 Comm: scope Not tainted 4.4.0-119-generic #143-Ubuntu
[ 263.736006] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[ 263.736006] task: ffff88011cef5400 ti: ffff88000a0e4000 task.ti: ffff88000a0e4000
[ 263.736006] RIP: 0010:[] [] bpf_map_lookup_elem+0x6/0x20
[ 263.736006] RSP: 0018:ffff88000a0e7a70 EFLAGS: 00010082
[ 263.736006] RAX: ffffffff8117cd70 RBX: ffffc90000762068 RCX: 0000000000000000
[ 263.736006] RDX: 0000000000000000 RSI: ffff88000a0e7cd8 RDI: 000000001cdee380
[ 263.736006] RBP: ffff88000a0e7cf8 R08: 0000000005080021 R09: 0000000000000000
[ 263.736006] R10: 0000000000000020 R11: ffff880159e1c700 R12: 0000000000000000
[ 263.736006] R13: ffff88011cfaf400 R14: ffff88000a0e7e38 R15: ffff88000a0f8800
[ 263.736006] FS: 00007f5b0cd79700(0000) GS:ffff88015b600000(0000) knlGS:0000000000000000
[ 263.736006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 263.736006] CR2: 000000001cdee3a8 CR3: 000000011ce04000 CR4: 0000000000040670
[ 263.736006] Stack:
[ 263.736006] ffff88000a0e7cf8 ffffffff81177411 0000000000000000 00001887000018a5
[ 263.736006] 000000001cdee380 ffff88000a0e7cd8 0000000000000000 0000000000000000
[ 263.736006] 0000000005080021 ffff88000a0e7e38 0000000000000000 0000000000000046
[ 263.736006] Call Trace:
[ 263.736006] [] ? __bpf_prog_run+0x7a1/0x1360
[ 263.736006] [] ? update_curr+0x79/0x170
[ 263.736006] [] ? update_cfs_shares+0xbc/0x100
[ 263.736006] [] ? update_curr+0x79/0x170
[ 263.736006] [] ? dput+0xb8/0x230
[ 263.736006] [] ? follow_managed+0x265/0x300
[ 263.736006] [] ? kmem_cache_alloc_trace+0x1d4/0x1f0
[ 263.736006] [] ? seq_open+0x5a/0xa0
[ 263.736006] [] ? probes_open+0x33/0x100
[ 263.736006] [] ? dput+0x34/0x230
[ 263.736006] [] ? mntput+0x24/0x40
[ 263.736006] [] trace_call_bpf+0x37/0x50
[ 263.736006] [] kretprobe_perf_func+0x3d/0x250
[ 263.736006] [] ? pre_handler_kretprobe+0x135/0x1b0
[ 263.736006] [] kretprobe_dispatcher+0x3d/0x60
[ 263.736006] [] ? do_sys_open+0x1b2/0x2a0
[ 263.736006] [] ? kretprobe_trampoline_holder+0x9/0x9
[ 263.736006] [] trampoline_handler+0x133/0x210
[ 263.736006] [] ? do_sys_open+0x1b2/0x2a0
[ 263.736006] [] kretprobe_trampoline+0x25/0x57
[ 263.736006] [] ? kretprobe_trampoline_holder+0x9/0x9
[ 263.736006] [] SyS_openat+0x14/0x20
[ 263.736006] [] entry_SYSCALL_64_fastpath+0x1c/0xbb
[ 263.736006] Code: 41 be 01 00 00 00 e8 fa bd ff ff 49 89 c5 eb 94 e8 f0 14 0a 00 4c 89 eb e9 e2 fe ff ff e8 a3 60 f0 ff 0f 1f 00 66 66 66 66 90 55 <48> 8b 47 28 48 89 e5 48 8b 40 18 e8 8a 83 6d 00 5d c3 0f 1f 84
[ 263.736006] RIP [] bpf_map_lookup_elem+0x6/0x20
[ 263.736006] RSP
[ 263.736006] CR2: 000000001cdee3a8
[ 263.736006] ---[ end trace 751aaf991017019f ]---
[ 263.736006] Kernel panic - not syncing: Fatal exception
[ 263.736006] Kernel Offset: disabled
[ 263.736006] Rebooting in 10 seconds..
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.4.0-119-generic (buildd@lcy01-amd64-013) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9) ) #143-Ubuntu SMP Mon Apr 2 16:08:24 UTC 2018 (Ubuntu 4.4.0-119.143-generic 4.4.114)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-119-generic root=UUID=567ab888-a3b5-43d4-a92a-f594e8653924 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 rootdelay=300
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel

I experienced this with my k8s cluster: every time I restarted etcd, the SSH connection would hang for a while, and sometimes it hung even when I didn't restart etcd.
@leth May I know which file I should edit to set --probe.ebpf.connections=false?

@leth
Contributor Author

leth commented Apr 9, 2018

To apply the --probe.ebpf.connections=false argument to Weave Scope, you will need to edit the Kubernetes DaemonSet manifest for Weave Scope. You can find it in either the weave or the kube-system namespace.

As this change will affect all nodes, you can make it from another, working node.
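For anyone unsure what that edit looks like, here is a minimal sketch of the relevant part of the DaemonSet manifest. The namespace, DaemonSet name, container name, and any other args are illustrative; match them to your actual install:

```yaml
# Illustrative only: open the manifest with something like
#   kubectl -n weave edit daemonset weave-scope-agent
# (the namespace and DaemonSet name may differ in your install)
# and add the flag to the probe container's args:
spec:
  template:
    spec:
      containers:
        - name: scope-agent            # hypothetical container name
          args:
            - '--probe.ebpf.connections=false'   # disable eBPF connection tracking
```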

@JuneZhao

JuneZhao commented Apr 9, 2018

@leth thank you. One thing I am not clear on: where does this Weave Scope come from? I didn't create it myself, and I don't think k8s ever created it previously. Does Weave Scope come with this kernel only? And does it only apply to k8s?

@leth
Contributor Author

leth commented Apr 9, 2018

Weave Scope is not part of the kernel; it uses the kernel's eBPF feature.
Weave Scope can be either manually installed on your system, or installed as part of connecting your system to Weave Cloud.
Weave Scope can run natively on your system, within Docker, or within Kubernetes.

We have only seen the crash I posted under kernel 4.4.0-119-generic #143-Ubuntu SMP so far.

Your report says "scope Not tainted". I'm not a kernel debugging expert, but might this mean that Weave Scope is not at fault?

@bboreham
Collaborator

I tried it with just Scope, no Kubernetes or other stuff:
Before:

$ uname -a
Linux bryan-dev2 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:29:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ scope launch
ebba7e65f5940c0eca410ca8f3d89f4839963ea5d1f0d52f79f5325a8762b50a
Scope probe started
Weave Scope is listening at the following URL(s):
  * http://10.240.0.2:4040/

After:

$ uname -a
Linux bryan-dev2 4.4.0-119-generic #143-Ubuntu SMP Mon Apr 2 16:08:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ scope launch
0a3b2a7a3580f2100f2374e334eea104a1f6f3371659de905db99c98942b1257
Scope probe started
ERRO[0002] error getting events from daemon: unexpected EOF 
sh: ../sysdeps/nptl/fork.c:156: __libc_fork: Assertion `THREAD_GETMEM (self, tid) != ppid' failed.
Aborted
[machine is unresponsive]

schu added a commit to kinvolk-archives/scope that referenced this issue Apr 13, 2018
The Ubuntu Xenial update to kernel 4.4.0-119.143 from 4.4.0-116.140
introduced a regression in the eBPF code. A basic `bpf_map_lookup_elem`
call, as found in the tcptracer-bpf library used by Scope, leads to a
kernel panic. As a result, Scope / the system crashes during startup
when the tcptracer is initialized. The Scope bug report can be found
here:

weaveworks#3131

To avoid crashes and gracefully fall back to procfs (as Scope already
does for systems not supporting eBPF), update `isKernelSupported()` and
explicitly check for Ubuntu kernel versions with the problem.

Once the bug is fixed and an update published, the `abiNumber` check in
`isKernelSupported()` can and should be updated with an upper limit.

The Ubuntu bug report can be found here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454
@schu
Contributor

schu commented Apr 13, 2018

Pull request for a workaround (fall back to procfs on affected kernels): #3141

Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454

@rade rade closed this as completed in 4b85be0 Apr 16, 2018
@putz612

putz612 commented Apr 23, 2018

If you are getting bitten by this, you can block this specific kernel version using apt preferences until Ubuntu releases a new version. That way you can apt-get update without worrying about accidentally installing it.

# cat /etc/apt/preferences

Package: linux*
Pin: version 4.4.0-119*
Pin-Priority: -1

@schu
Contributor

schu commented May 2, 2018

Update: Ubuntu has a fix in Ubuntu 4.4.0-123.147-generic 4.4.128 (xenial-proposed) that should be included in the next update cycle (as far as I can tell).
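With the fixed kernel known, the version gate described in this thread can be sketched roughly as follows. This is a minimal illustration, not Scope's actual `isKernelSupported()` code: the function name `ebpfSafe`, the release-string parsing, and the exact 119-to-123 ABI range are assumptions drawn from the versions reported here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Matches Ubuntu-style kernel release strings such as "4.4.0-119-generic".
var ubuntuRelease = regexp.MustCompile(`^(\d+)\.(\d+)\.(\d+)-(\d+)-\w+$`)

// ebpfSafe reports whether eBPF connection tracking should be enabled for
// the given kernel release string. Hypothetical stand-in for the check
// added to isKernelSupported(): Ubuntu 4.4.0 kernels with ABI number 119
// up to (but not including) 123 carry the regression, so Scope falls back
// to procfs scanning on those.
func ebpfSafe(release string) bool {
	m := ubuntuRelease.FindStringSubmatch(release)
	if m == nil {
		return true // not an Ubuntu-style release string; no blacklist match
	}
	major, _ := strconv.Atoi(m[1])
	minor, _ := strconv.Atoi(m[2])
	patch, _ := strconv.Atoi(m[3])
	abi, _ := strconv.Atoi(m[4])
	if major == 4 && minor == 4 && patch == 0 && abi >= 119 && abi < 123 {
		return false // affected range: disable eBPF, use procfs
	}
	return true
}

func main() {
	fmt.Println(ebpfSafe("4.4.0-119-generic")) // affected kernel
	fmt.Println(ebpfSafe("4.4.0-116-generic")) // known-good kernel
	fmt.Println(ebpfSafe("4.4.0-123-generic")) // fixed kernel
}
```

The upper bound is what the commits referenced above add once the fix shipped; before 4.4.0-123 was released, the check necessarily had no upper limit.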

@michaelajr

I wonder if this issue is causing my problems. I went from a Debian K8s cluster in AWS to an Ubuntu K8s cluster in AWS. Fresh cluster, newest Ubuntu image. Applied the Weave Scope Helm chart, and BAM! All my hosts went out of service in the ELB (instance status checks in AWS were failing too). When they came back in service and I was able to do a kubectl get pods, I saw all the Weave Scope pods in a crash loop backoff. The only change is Ubuntu. I know this comment has no log or any real info; I tore down the cluster, but will post back with more info while I debug this today.

@michaelajr

Upgrading to 1.9.0 fixed my issue.

schu added a commit to kinvolk-archives/scope that referenced this issue May 23, 2018
With c75700f we added code to detect
Ubuntu Xenial kernels with a regression in the eBPF subsystem, in order
to gracefully fall back to procfs scanning on such systems (and not
crash the host system by running eBPF code).

With the latest kernel update for Ubuntu Xenial, the bug was fixed:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454

Therefore we can update the added check with an upper limit and make
sure that eBPF connection tracking is only disabled on kernels within
the range that has the bug.
xref: weaveworks#3131
lilic pushed a commit to lilic/scope that referenced this issue Jul 25, 2018