Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws: kola podman.base tests failing with kernel warning unbind_from_irqhandler+0x4e/0x60 #606

Closed
dustymabe opened this issue Aug 19, 2020 · 12 comments

Comments

@dustymabe
Copy link
Member

We're seeing an unbind_from_irqhandler+0x4e/0x60 kernel warning. This has been happening for a while and was confused with the other kernel issue in #507.

An example of the error is in 32.20200818.20.0 (testing-devel) with kernel-5.7.15-200.fc32.x86_64.

jenkins run

Here is the full stack trace:

[    1.105786] ------------[ cut here ]------------^M
[    1.106753] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1112 unbind_from_irqhandler+0x4e/0x60^M
[    1.106753] Modules linked in:^M
[    1.106753] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.15-200.fc32.x86_64 #1^M
[    1.106753] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006^M
[    1.106753] RIP: 0010:unbind_from_irqhandler+0x4e/0x60^M
[    1.106753] Code: e6 89 ef e8 24 0c af ff 48 c7 c7 80 0e 97 b6 e8 38 1e 42 00 89 ef e8 51 fc ff ff 48 c7 c7 80 0e 97 b6 5d 41 5c e9 32 10 42 00 <0f> 0b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00^M
[    1.106753] RSP: 0000:ffffb284c0023d60 EFLAGS: 00010246^M
[    1.106753] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000^M
[    1.106753] RDX: 0000000000000028 RSI: 00000000ffffffff RDI: 00000000ffffffff^M
[    1.106753] RBP: 00000000ffffffff R08: ffffffffb685e660 R09: 0000000000000000^M
[    1.106753] R10: 0000000000000000 R11: ffffffffb685e668 R12: 0000000000000000^M
[    1.106753] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000^M
[    1.106753] FS:  0000000000000000(0000) GS:ffff9d8747a00000(0000) knlGS:0000000000000000^M
[    1.106753] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
[    1.106753] CR2: ffff9d874ffff000 CR3: 00000000b580a001 CR4: 00000000001606f0^M
[    1.106753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000^M
[    1.106753] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400^M
[    1.106753] Call Trace:^M
[    1.106753]  xen_uninit_lock_cpu+0x28/0x62^M
[    1.106753]  xen_hvm_cpu_die+0x21/0x30^M
[    1.106753]  takedown_cpu+0x9c/0xe0^M
[    1.106753]  ? trace_suspend_resume+0x60/0x60^M
[    1.106753]  cpuhp_invoke_callback+0x9a/0x530^M
[    1.106753]  _cpu_up+0x11a/0x130^M
[    1.106753]  cpu_up+0x7e/0xc0^M
[    1.106753]  bringup_nonboot_cpus+0x48/0x50^M
[    1.106753]  smp_init+0x26/0x79^M
[    1.106753]  kernel_init_freeable+0xea/0x229^M
[    1.106753]  ? rest_init+0xaa/0xaa^M
[    1.106753]  kernel_init+0xa/0x106^M
[    1.106753]  ret_from_fork+0x35/0x40^M
[    1.106753] ---[ end trace 251e144bc968609c ]---^M
@dustymabe
Copy link
Member Author

@dustymabe
Copy link
Member Author

Some more info here. This error happens very early on boot before systemd even starts. It seems as if there is nothing special that needs to happen in order for this "error" to occur. Our CI seems to catch it most in tests that consist of a reboot, which means that it's more likely the console output will contain the kernel trace text. I just booted a m4.large instance and it happened on first boot (without even a reboot).

[    0.832296] SELinux:  Initializing.
[    0.836341] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.843300] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.850649] Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
[    0.855281] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[    0.860283] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.868281] Spectre V2 : Mitigation: Full generic retpoline
[    0.873280] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.880280] Speculative Store Bypass: Vulnerable
[    0.883280] SMT: disabled
[    0.886281] SMT: disabled
[    0.888280] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[    0.893496] Freeing SMP alternatives memory: 40K
[    0.900538] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.908287] Xen: using vcpuop timer interface
[    0.908294] installing Xen timer for CPU 0
[    0.911351] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (family: 0x6, model: 0x4f, stepping: 0x1)
[    0.912313] cpu 0 spinlock event irq 53
[    0.913435] Performance Events: unsupported p6 CPU model 79 no PMU driver, software events only.
[    0.914322] rcu: Hierarchical SRCU implementation.
[    0.915762] NMI watchdog: Perf NMI watchdog permanently disabled
[    0.916492] smp: Bringing up secondary CPUs ...
[    0.917395] installing Xen timer for CPU 1
[    0.918344] x86: Booting SMP configuration:
[    0.919285] .... node  #0, CPUs:        #1
[    0.921312] ------------[ cut here ]------------
[    0.923289] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1126 unbind_from_irqhandler+0x69/0x80
[    0.924283] Modules linked in:
[    0.925284] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.15-201.fc32.x86_64 #1
[    0.926283] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    0.927285] RIP: 0010:unbind_from_irqhandler+0x69/0x80
[    0.928285] Code: ff 48 c7 c7 00 0b b8 bb 48 83 c4 08 5d e9 0f 4b 43 00 48 89 34 24 e8 16 37 a1 ff 48 8b 34 24 48 85 c0 74 06 48 8b 40 30 eb b8 <0f> 0b 48 83 c4 08 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40
[    0.929283] RSP: 0000:ffff9cd900023d60 EFLAGS: 00010246
[    0.930283] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[    0.931283] RDX: 0000000000000028 RSI: 0000000000000000 RDI: 00000000ffffffff
[    0.932283] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffffffbba5dce8
[    0.933284] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8fc987a40000
[    0.934283] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[    0.935284] FS:  0000000000000000(0000) GS:ffff8fc987a00000(0000) knlGS:0000000000000000
[    0.936285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.937284] CR2: ffff8fc98ffff000 CR3: 000000009fa0a001 CR4: 00000000001606f0
[    0.938284] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.939283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.940283] Call Trace:
[    0.941291]  xen_uninit_lock_cpu+0x28/0x62
[    0.942286]  xen_hvm_cpu_die+0x21/0x30
[    0.943286]  takedown_cpu+0x9c/0xe0
[    0.944286]  ? trace_suspend_resume+0x60/0x60
[    0.945284]  cpuhp_invoke_callback+0x9a/0x520
[    0.946285]  _cpu_up+0x11e/0x140
[    0.947284]  cpu_up+0x7e/0xc0
[    0.948284]  bringup_nonboot_cpus+0x48/0x50
[    0.949286]  smp_init+0x26/0x79
[    0.950286]  kernel_init_freeable+0xea/0x230
[    0.951286]  ? rest_init+0xaa/0xaa
[    0.952284]  kernel_init+0xa/0x116
[    0.953287]  ret_from_fork+0x22/0x30
[    0.954286] ---[ end trace dbf070c94fc1287e ]---
[    0.955338] smp: Brought up 1 node, 1 CPU
[    0.956286] smpboot: Max logical packages: 15
[    0.957285] smpboot: Total of 1 processors activated (4600.09 BogoMIPS)

@dustymabe
Copy link
Member Author

Note that m4.large (the instance type we're using) has 2 vCPUs. I get the kernel trace on every boot pretty reliably. If I remove the mitigations=auto,nosmt then the trace goes away. If I move to an instance with 4 vCPUs (m5.xlarge) the trace goes away.

@dustymabe
Copy link
Member Author

Though I do see the problem on m4.xlarge (with 4vCPUs and mitigations=auto,nosmt).

[    0.738153] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.743152] Spectre V2 : Mitigation: Full generic retpoline
[    0.747150] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.753151] Speculative Store Bypass: Vulnerable
[    0.756150] SMT: disabled
[    0.758153] SMT: disabled
[    0.760151] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[    0.764369] Freeing SMP alternatives memory: 40K
[    0.769975] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.775159] Xen: using vcpuop timer interface
[    0.775166] installing Xen timer for CPU 0
[    0.778228] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (family: 0x6, model: 0x4f, stepping: 0x1)
[    0.779188] cpu 0 spinlock event irq 53
[    0.780331] Performance Events: unsupported p6 CPU model 79 no PMU driver, software events only.
[    0.781206] rcu: Hierarchical SRCU implementation.
[    0.782626] NMI watchdog: Perf NMI watchdog permanently disabled
[    0.783376] smp: Bringing up secondary CPUs ...
[    0.784271] installing Xen timer for CPU 1
[    0.785221] x86: Booting SMP configuration:
[    0.786156] .... node  #0, CPUs:        #1
[    0.787213] cpu 1 spinlock event irq 59
[    0.792261] installing Xen timer for CPU 2
[    0.793213]   #2
[    0.795186] ------------[ cut here ]------------
[    0.797160] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1126 unbind_from_irqhandler+0x69/0x80
[    0.798154] Modules linked in:
[    0.799155] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.15-201.fc32.x86_64 #1
[    0.800154] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    0.801157] RIP: 0010:unbind_from_irqhandler+0x69/0x80
[    0.802155] Code: ff 48 c7 c7 00 0b b8 bb 48 83 c4 08 5d e9 0f 4b 43 00 48 89 34 24 e8 16 37 a1 ff 48 8b 34 24 48 85 c0 74 06 48 8b 40 30 eb b8 <0f> 0b 48 83 c4 08 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40
[    0.803154] RSP: 0000:ffffb2a680023d60 EFLAGS: 00010246
[    0.804154] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
[    0.805154] RDX: 0000000000000028 RSI: 0000000000000000 RDI: 00000000ffffffff
[    0.806154] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffffffbba5dce8
[    0.807154] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9dbc7fa80000
[    0.808154] R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
[    0.809155] FS:  0000000000000000(0000) GS:ffff9dbc7fa00000(0000) knlGS:0000000000000000
[    0.810154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.811154] CR2: ffff9dbbba801000 CR3: 0000000339a0a001 CR4: 00000000001606f0
[    0.812156] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.813154] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.814155] Call Trace:
[    0.815161]  xen_uninit_lock_cpu+0x28/0x62
[    0.816155]  xen_hvm_cpu_die+0x21/0x30
[    0.817157]  takedown_cpu+0x9c/0xe0
[    0.818156]  ? trace_suspend_resume+0x60/0x60
[    0.819155]  cpuhp_invoke_callback+0x9a/0x520
[    0.820157]  _cpu_up+0x11e/0x140
[    0.821156]  cpu_up+0x7e/0xc0
[    0.822156]  bringup_nonboot_cpus+0x48/0x50
[    0.823158]  smp_init+0x26/0x79
[    0.824157]  kernel_init_freeable+0xea/0x230
[    0.825158]  ? rest_init+0xaa/0xaa
[    0.826155]  kernel_init+0xa/0x116
[    0.827158]  ret_from_fork+0x22/0x30
[    0.828157] ---[ end trace 89c23546888a1c40 ]---
[    0.829338] installing Xen timer for CPU 3
[    0.830222]   #3
[    0.832185] ------------[ cut here ]------------
[    0.834159] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1126 unbind_from_irqhandler+0x69/0x80
[    0.835154] Modules linked in:
[    0.836156] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.8.15-201.fc32.x86_64 #1
[    0.837154] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    0.838158] RIP: 0010:unbind_from_irqhandler+0x69/0x80
[    0.839155] Code: ff 48 c7 c7 00 0b b8 bb 48 83 c4 08 5d e9 0f 4b 43 00 48 89 34 24 e8 16 37 a1 ff 48 8b 34 24 48 85 c0 74 06 48 8b 40 30 eb b8 <0f> 0b 48 83 c4 08 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40
[    0.840155] RSP: 0000:ffffb2a680023d60 EFLAGS: 00010246
[    0.841154] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
[    0.842154] RDX: 0000000000000028 RSI: 0000000000000000 RDI: 00000000ffffffff
[    0.843154] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffffffbba5dce8
[    0.844153] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9dbc7fac0000
[    0.845154] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
[    0.846154] FS:  0000000000000000(0000) GS:ffff9dbc7fa00000(0000) knlGS:0000000000000000
[    0.847155] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.848154] CR2: ffff9dbbba801000 CR3: 0000000339a0a001 CR4: 00000000001606f0
[    0.849155] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.850153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.851154] Call Trace:
[    0.852161]  xen_uninit_lock_cpu+0x28/0x62
[    0.853155]  xen_hvm_cpu_die+0x21/0x30
[    0.854157]  takedown_cpu+0x9c/0xe0
[    0.855155]  ? trace_suspend_resume+0x60/0x60
[    0.856155]  cpuhp_invoke_callback+0x9a/0x520
[    0.857155]  _cpu_up+0x11e/0x140
[    0.858155]  cpu_up+0x7e/0xc0

</details>
[    0.859154]  bringup_nonboot_cpus+0x48/0x50
[    0.860157]  smp_init+0x26/0x79
[    0.861156]  kernel_init_freeable+0xea/0x230
[    0.862157]  ? rest_init+0xaa/0xaa
[    0.863154]  kernel_init+0xa/0x116
[    0.864157]  ret_from_fork+0x22/0x30
[    0.865155] ---[ end trace 89c23546888a1c41 ]---

@dustymabe
Copy link
Member Author

And even on m4.2xlarge. So it continues to happen even as the number of CPUs increases.

@dustymabe
Copy link
Member Author

yep. m5.large doesn't have the problem. So it does seem to be a combination of:

  • old xen instance types (i.e. m4*)
  • mitigations=auto,nosmt

@dustymabe
Copy link
Member Author

I dusted off rpm-ostree-bisect and found that the first time this problem started showing up for us was in 31.20200108.2.0 when we went from kernel-5.3.16-300.fc31 -> kernel-5.4.7-200.fc31:

-- Logs begin at Mon 2020-11-02 14:02:24 UTC, end at Tue 2020-11-03 01:04:43 UTC. --
Nov 03 01:04:07 localhost systemd[1]: Starting RPM-OSTree Bisect Testing...
Nov 03 01:04:08 localhost python3[764]: Using data file at: /var/lib/rpm-ostree-bisect.json
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: [    1.462501] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1111 unbind_from_irqhandler+0x36/0x40
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: [    1.462501] RIP: 0010:unbind_from_irqhandler+0x36/0x40
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: [    1.466501] WARNING: CPU: 0 PID: 1 at drivers/xen/events/events_base.c:1111 unbind_from_irqhandler+0x36/0x40
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: [    1.466501] RIP: 0010:unbind_from_irqhandler+0x36/0x40
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: Found unbind_from_irqhandler kernel message. Test Fails
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: Removed /etc/systemd/system/multi-user.target.wants/rpm-ostree-bisect.service.
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: BISECT TEST RESULTS:
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: Last known good commit:
Nov 03 01:04:18 ip-10-0-1-222 python3[764]:   fd3a3a1 : 31.20191217.2.0 : 2019-12-18T14:11:27Z
Nov 03 01:04:18 ip-10-0-1-222 python3[764]: First known bad commit:
Nov 03 01:04:18 ip-10-0-1-222 python3[764]:   5fd16ee : 31.20200108.2.0 : 2020-01-09T20:22:38Z
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: libostree pull from 'fedora' for fedora/x86_64/coreos/testing complete
                                            security: GPG: commit http: TLS
                                            non-delta: meta: 79 content: 0
                                            transfer: secs: 14 size: 604.1 kB
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: ostree diff commit from: fd3a3a1549de2bb9031f0767d10d2302c178dec09203a1db146e0ad28f38c498
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: ostree diff commit to:   5fd16eedcb7e7f1c0935480469b0eb4fbe6cf7f94b3ad119f90058663e04adf9
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: Upgraded:
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   coreos-installer 0-10.git2fcf441.fc31 -> 0.1.2-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   crypto-policies 20191002-1.gitc93dc99.fc31 -> 20191128-2.gitcd267a5.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   git-core 2.23.0-1.fc31 -> 2.24.1-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   glib2 2.62.3-1.fc31 -> 2.62.4-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   ignition 2.0.1-9.gita8f91fa.fc31 -> 2.1.1-3.git40c0b57.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   iptables 1.8.3-5.fc31 -> 1.8.3-7.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   iptables-libs 1.8.3-5.fc31 -> 1.8.3-7.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   iptables-nft 1.8.3-5.fc31 -> 1.8.3-7.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   iptables-services 1.8.3-5.fc31 -> 1.8.3-7.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   kernel 5.3.16-300.fc31 -> 5.4.7-200.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   kernel-core 5.3.16-300.fc31 -> 5.4.7-200.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   kernel-modules 5.3.16-300.fc31 -> 5.4.7-200.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libblkid 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libedit 3.1-29.20191025cvs.fc31 -> 3.1-30.20191211cvs.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libfdisk 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libldb 2.0.7-1.fc31 -> 2.0.8-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libmount 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libnfsidmap 1:2.4.1-2.rc1.fc31 -> 1:2.4.2-3.rc3.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libseccomp 2.4.2-1.fc31 -> 2.4.2-2.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libsmartcols 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libsmbclient 2:4.11.3-0.fc31 -> 2:4.11.4-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libsolv 0.7.8-1.fc31 -> 0.7.10-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libssh 0.9.2-1.fc31 -> 0.9.3-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libssh-config 0.9.2-1.fc31 -> 0.9.3-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libtirpc 1.1.4-2.rc3.fc31 -> 1.2.5-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libuuid 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libwbclient 2:4.11.3-0.fc31 -> 2:4.11.4-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libxcrypt 4.4.10-1.fc31 -> 4.4.10-2.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   libxml2 2.9.10-1.fc31 -> 2.9.10-2.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   nfs-utils-coreos 1:2.4.1-2.rc1.fc31 -> 1:2.4.2-3.rc3.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   pam 1.3.1-18.fc31 -> 1.3.1-21.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   python-pip-wheel 19.1.1-6.fc31 -> 19.1.1-7.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   python3 3.7.5-1.fc31 -> 3.7.5-2.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   python3-libs 3.7.5-1.fc31 -> 3.7.5-2.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   rpm-ostree 2019.6-1.fc31 -> 2019.7-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   rpm-ostree-libs 2019.6-1.fc31 -> 2019.7-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   samba-client-libs 2:4.11.3-0.fc31 -> 2:4.11.4-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   samba-common 2:4.11.3-0.fc31 -> 2:4.11.4-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   samba-common-libs 2:4.11.3-0.fc31 -> 2:4.11.4-0.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd-container 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd-libs 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd-pam 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd-rpm-macros 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   systemd-udev 243.4-1.fc31 -> 243.5-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   util-linux 2.34-3.fc31 -> 2.34-4.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   vim-minimal 2:8.1.2424-1.fc31 -> 2:8.2.076-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   whois-nls 5.5.3-1.fc31 -> 5.5.4-1.fc31
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: Removed:
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   bc-1.07.1-9.fc31.x86_64
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   coreos-installer-dracut-0-10.git2fcf441.fc31.noarch
Nov 03 01:04:33 ip-10-0-1-222 python3[764]: Added:
Nov 03 01:04:33 ip-10-0-1-222 python3[764]:   coreos-installer-systemd-0.1.2-1.fc31.x86_64
Nov 03 01:04:33 ip-10-0-1-222 systemd[1]: Started RPM-OSTree Bisect Testing.

@masneyb
Copy link

masneyb commented Nov 6, 2020

I posted a patch that fixes the issue: https://lore.kernel.org/lkml/20201106003529.391649-1-bmasney@redhat.com/T/#u. Thanks to @dustymabe for setting up a Fedora system at AWS for me that reproduced the error.

Note for future reference: when compiling torvalds/master or linux-next on a Fedora 33 system, the following error is thrown:

BTFIDS  vmlinux
FAILED unresolved symbol vfs_getattr

A workaround for this is posted at https://lore.kernel.org/bpf/20201016213835.GJ1461394@krava/T/.

@dustymabe
Copy link
Member Author

Thanks so much @masneyb!

@masneyb
Copy link

masneyb commented Nov 7, 2020

@masneyb
Copy link

masneyb commented Nov 10, 2020

v2 fix has been merged into the xen subsystem tree for 5.10: https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git/commit/?h=for-linus-5.10b&id=65cae18882f943215d0505ddc7e70495877308e6

@dustymabe dustymabe added the status/pending-upstream-release Fixed upstream. Waiting on an upstream component source code release. label Nov 10, 2020
@dustymabe
Copy link
Member Author

dustymabe commented Jun 4, 2021

v2 fix has been merged into the xen subsystem tree for 5.10: https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git/commit/?h=for-linus-5.10b&id=65cae18882f943215d0505ddc7e70495877308e6

Looks like this is indeed fixed. Tested on m4.large instances on latest testing (34.20210529.2.0 with kernel-5.12.7-300.fc34.x86_64)

[dustymabe@media fcos]$
[dustymabe@media fcos]$ cosa kola run -p aws --aws-credentials-file=creds --aws-ami ami-0874588a8f0420e41 --aws-type m4.large --aws-region us-east-1 -b fcos -j 5
COREOS_ASSEMBLER_GIT=/var/b/shared/code/github.com/coreos/coreos-assembler/
COREOS_ASSEMBLER_CONFIG_GIT=/var/b/shared/code/github.com/coreos/fedora-coreos-config/
+ podman run --rm -ti --security-opt label=disable --privileged --uidmap=1000:0:1 --uidmap=0:1:1000 --uidmap 1001:1001:64536 -v /var/b/shared/assembler/fcos:/srv/ --device /dev/kvm --device /dev/fuse --tmpfs /tmp -v /var/tmp:/var/tmp --name cosa -v /var/b/shared/code/github.com/coreos/fedora-coreos-config/:/srv/src/config/:ro -v /var/b/shared/code/github.com/coreos/coreos-assembler//src/:/usr/lib/coreos-assembler/:ro quay.io/coreos-assembler/coreos-assembler:latest kola run -p aws --aws-credentials-file=creds --aws-ami ami-0874588a8f0420e41 --aws-type m4.large --aws-region us-east-1 -b fcos -j 5
⚠️  Skipping kola test pattern "fcos.internet":
⚠️  https://github.com/coreos/coreos-assembler/pull/1478
⚠️  Skipping kola test pattern "podman.workflow":
⚠️  https://github.com/coreos/coreos-assembler/pull/1478
kola --denylist-test fcos.internet --denylist-test podman.workflow --output-dir tmp/kola run -p aws --aws-credentials-file=creds --aws-ami ami-0874588a8f0420e41 --aws-type m4.large --aws-region us-east-1 -b fcos -j 5
=== RUN   coreos.selinux.boolean
=== RUN   rootfs.uuid
=== RUN   coreos.ignition.resource.s3
=== RUN   fcos.filesystem
=== RUN   ext.config.ignition.kargs
=== RUN   rpmostree.upgrade-rollback
=== RUN   ext.config.networking.no-default-initramfs-net-propagation
=== RUN   coreos.ignition.journald-log
=== RUN   coreos.ignition.ssh.key
=== RUN   fcos.ignition.misc.empty
=== RUN   coreos.ignition.groups
=== RUN   coreos.ignition.resource.local
=== RUN   podman.base
=== RUN   rpmostree.install-uninstall
=== RUN   ext.config.misc-ro
=== RUN   ext.config.misc-ign-ro
=== RUN   rpmostree.status
=== RUN   ostree.hotfix
=== RUN   ostree.unlock
=== RUN   coreos.ignition.security.tls
=== RUN   coreos.auth.verify
=== RUN   coreos.tls.fetch-urls
=== RUN   rhcos.selinux.boolean.persist
=== RUN   podman.network-single
=== RUN   ostree.remote
=== RUN   fcos.metadata.aws
=== RUN   ext.config.ignition.systemd-unmasking
=== RUN   fcos.users.shells
=== RUN   coreos.ignition.resource.remote
=== RUN   coreos.selinux.enforce
=== RUN   basic
=== RUN   coreos.ignition.once
=== RUN   fcos.network.listeners
=== RUN   ext.config.gshadow
=== RUN   coreos.ignition.v2.users
=== RUN   coreos.ignition.instantiated.enable-unit
=== RUN   coreos.ignition.sethostname
--- PASS: rpmostree.status (72.01s)
--- PASS: coreos.ignition.ssh.key (76.97s)
--- PASS: coreos.ignition.instantiated.enable-unit (80.45s)
=== RUN   podman.base/info
--- PASS: coreos.selinux.boolean (132.90s)
=== RUN   podman.base/resources
--- PASS: coreos.ignition.once (153.97s)
--- PASS: podman.base (87.67s)
    --- PASS: podman.base/info (8.57s)
    --- PASS: podman.base/resources (11.70s)
            cluster.go:141: Getting image source signatures
            cluster.go:141: Copying blob sha256:bfea1659690e25eed531f2ae429bcb92e1f518ee8b9a7283ec7d830caa335c60
            cluster.go:141: Copying config sha256:dae807f46f2dfe85a7524ed7cd5e81609cb39591b7d70e6719ef6e9244475e12
            cluster.go:141: Writing manifest to image destination
            cluster.go:141: Storing signatures
=== RUN   rpmostree.install-uninstall/install
--- PASS: ext.config.misc-ro (131.76s)
--- PASS: ext.config.misc-ign-ro (136.47s)
--- PASS: coreos.ignition.journald-log (71.28s)
=== RUN   rpmostree.upgrade-rollback/upgrade
--- PASS: ext.config.ignition.kargs (130.53s)
--- PASS: ext.config.networking.no-default-initramfs-net-propagation (85.29s)
=== RUN   ostree.remote/add
=== RUN   ostree.remote/list
=== RUN   ostree.remote/show-url
=== RUN   ostree.remote/refs
=== RUN   ostree.remote/summary
=== RUN   ostree.remote/delete
--- PASS: ostree.remote (89.00s)
    --- PASS: ostree.remote/add (0.94s)
    --- PASS: ostree.remote/list (0.49s)
    --- PASS: ostree.remote/show-url (0.43s)
    --- PASS: ostree.remote/refs (0.59s)
    --- PASS: ostree.remote/summary (1.04s)
    --- PASS: ostree.remote/delete (1.36s)
=== RUN   rpmostree.install-uninstall/uninstall
=== RUN   rpmostree.upgrade-rollback/rollback
=== RUN   basic/DbusPerms
=== RUN   basic/NetworkScripts
=== RUN   basic/Useradd
=== RUN   basic/MachineID
=== RUN   basic/PortSSH
=== RUN   basic/ServicesActive
=== RUN   basic/ReadOnly
=== RUN   basic/FCOSGrowpart
--- PASS: basic (97.55s)
    --- PASS: basic/DbusPerms (0.60s)
    --- PASS: basic/NetworkScripts (0.39s)
    --- PASS: basic/Useradd (0.80s)
    --- PASS: basic/MachineID (0.41s)
    --- PASS: basic/PortSSH (0.40s)
    --- PASS: basic/ServicesActive (0.43s)
    --- PASS: basic/ReadOnly (0.44s)
    --- PASS: basic/FCOSGrowpart (0.45s)
--- PASS: coreos.ignition.resource.remote (67.88s)
--- PASS: coreos.selinux.enforce (140.38s)
--- PASS: rpmostree.install-uninstall (312.48s)
    --- PASS: rpmostree.install-uninstall/install (138.99s)
    --- PASS: rpmostree.install-uninstall/uninstall (94.56s)
--- PASS: fcos.users.shells (70.53s)
--- PASS: rpmostree.upgrade-rollback (239.35s)
    --- PASS: rpmostree.upgrade-rollback/upgrade (77.64s)
    --- PASS: rpmostree.upgrade-rollback/rollback (83.78s)
--- PASS: ext.config.ignition.systemd-unmasking (86.11s)
--- PASS: fcos.metadata.aws (66.11s)
--- PASS: fcos.ignition.misc.empty (65.76s)
--- PASS: ext.config.gshadow (77.34s)
--- PASS: coreos.ignition.v2.users (74.97s)
--- PASS: coreos.ignition.sethostname (60.86s)
--- PASS: fcos.network.listeners (73.60s)
--- PASS: coreos.auth.verify (69.95s)
--- PASS: coreos.ignition.resource.local (151.67s)
--- PASS: coreos.tls.fetch-urls (71.89s)
=== RUN   ostree.unlock/unlock
=== RUN   ostree.unlock/install
=== RUN   ostree.unlock/uninstall
=== RUN   ostree.unlock/discard
--- PASS: rhcos.selinux.boolean.persist (129.84s)
--- PASS: coreos.ignition.resource.s3 (65.68s)
--- PASS: ostree.unlock (140.96s)
    --- PASS: ostree.unlock/unlock (2.18s)
    --- PASS: ostree.unlock/install (2.41s)
    --- PASS: ostree.unlock/uninstall (2.16s)
    --- PASS: ostree.unlock/discard (62.92s)
--- PASS: coreos.ignition.security.tls (158.26s)
=== RUN   fcos.filesystem/suid
=== RUN   fcos.filesystem/sgid
=== RUN   fcos.filesystem/writablefiles
=== RUN   fcos.filesystem/writabledirs
=== RUN   fcos.filesystem/stickydirs
=== RUN   fcos.filesystem/denylist
--- PASS: fcos.filesystem (79.18s)
    --- PASS: fcos.filesystem/suid (1.34s)
    --- PASS: fcos.filesystem/sgid (0.93s)
    --- PASS: fcos.filesystem/writablefiles (0.61s)
    --- PASS: fcos.filesystem/writabledirs (0.56s)
    --- PASS: fcos.filesystem/stickydirs (0.56s)
    --- PASS: fcos.filesystem/denylist (0.58s)
--- PASS: coreos.ignition.groups (73.96s)
--- PASS: podman.network-single (244.87s)
        cluster.go:141: Getting image source signatures
        cluster.go:141: Copying blob sha256:d24855c9aa746e4cd75697d3b3fd33713fc97e102b18977ee7d8c5f04043143a
        cluster.go:141: Copying config sha256:a05201a10f8d29aea51eb9cf97e5f0dce5491035af5dbefd21d7f944ec295342
        cluster.go:141: Writing manifest to image destination
        cluster.go:141: Storing signatures
=== RUN   rootfs.uuid/RandomUUID
--- PASS: rootfs.uuid (81.43s)
    --- PASS: rootfs.uuid/RandomUUID (0.58s)
=== RUN   ostree.hotfix/unlock
=== RUN   ostree.hotfix/install
=== RUN   ostree.hotfix/uninstall
=== RUN   ostree.hotfix/persist
=== RUN   ostree.hotfix/rollback
--- PASS: ostree.hotfix (232.42s)
    --- PASS: ostree.hotfix/unlock (13.76s)
    --- PASS: ostree.hotfix/install (1.88s)
    --- PASS: ostree.hotfix/uninstall (1.52s)
    --- PASS: ostree.hotfix/persist (61.71s)
    --- PASS: ostree.hotfix/rollback (80.15s)
PASS, output in tmp/kola
+ rc=0
+ set +x

@dustymabe dustymabe removed the status/pending-upstream-release Fixed upstream. Waiting on an upstream component source code release. label Jun 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants