Kernel panic - not syncing: Hard LOCKUP - v3.4 #164

Closed

kllk2320 opened this issue May 18, 2023 · 5 comments · Fixed by #172

Comments

@kllk2320

Hi, we recently experienced two crashes caused by a hard lockup in SCST. Here is the crash log:

[1448765.536893] Kernel panic - not syncing: Hard LOCKUP
[1448765.536911] CPU: 8 PID: 62464 Comm: kworker/8:0 ve: 0 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1160.80.1.vz7.191.4 #1 191.4
[1448765.536950] Hardware name: Supermicro SSG-5049P-E1CTR36L/X11SPH-nCTF, BIOS 3.1 05/21/2019
[1448765.536984] Workqueue: events scst_ext_blocking_done_fn [scst]
[1448765.537003] Call Trace:
[1448765.537013]  <NMI>  [<ffffffffbd9f058a>] dump_stack+0x19/0x1f
[1448765.537036]  [<ffffffffbd9e9380>] panic+0xe8/0x21f
[1448765.537054]  [<ffffffffbd231998>] ? show_regs+0x58/0x290
[1448765.537073]  [<ffffffffbd2a1a13>] nmi_panic+0x43/0x50
[1448765.537091]  [<ffffffffbd36e299>] watchdog_overflow_callback+0x119/0x140
[1448765.537115]  [<ffffffffbd3ca577>] __perf_event_overflow+0x57/0x100
[1448765.537136]  [<ffffffffbd3d4074>] perf_event_overflow+0x14/0x20
[1448765.537156]  [<ffffffffbd20b220>] handle_pmi_common+0x1a0/0x260
[1448765.537178]  [<ffffffffbd5d02a8>] ? ioremap_page_range+0x2e8/0x490
[1448765.537201]  [<ffffffffbd431204>] ? vunmap_page_range+0x224/0x450
[1448765.537223]  [<ffffffffbd6a1646>] ? ghes_copy_tofrom_phys+0x116/0x220
[1448765.537244]  [<ffffffffbd20b51f>] intel_pmu_handle_irq+0xcf/0x1c0
[1448765.537266]  [<ffffffffbd9fa039>] perf_event_nmi_handler+0x39/0x60
[1448765.537287]  [<ffffffffbd9fba9c>] nmi_handle.isra.0+0x8c/0x150
[1448765.537308]  [<ffffffffbd9fbd78>] do_nmi+0x218/0x460
[1448765.537326]  [<ffffffffbd9fadf7>] end_repeat_nmi+0x1e/0x81
[1448765.537350]  [<ffffffffbd333042>] ? native_queued_spin_lock_slowpath+0x122/0x200
[1448765.537374]  [<ffffffffbd333042>] ? native_queued_spin_lock_slowpath+0x122/0x200
[1448765.537399]  [<ffffffffbd333042>] ? native_queued_spin_lock_slowpath+0x122/0x200
[1448765.537422]  <EOE>  [<ffffffffbd9ea3a2>] queued_spin_lock_slowpath+0xb/0x13
[1448765.537448]  [<ffffffffbd9f94fb>] _raw_spin_lock_irqsave+0x3b/0x40
[1448765.537469]  [<ffffffffbd2dd498>] __wake_up_common_lock+0x68/0xc0
[1448765.537490]  [<ffffffffbd2dcf20>] ? task_rq_unlock+0x20/0x20
[1448765.538076]  [<ffffffffbd2dd503>] __wake_up+0x13/0x20
[1448765.538669]  [<ffffffffc0f6775d>] scst_sync_ext_blocking_done+0x1d/0x30 [scst]
[1448765.539267]  [<ffffffffc0f776a3>] __scst_ext_blocking_done+0x63/0xc0 [scst]
[1448765.539871]  [<ffffffffc0f77748>] scst_ext_blocking_done_fn+0x48/0xa0 [scst]
[1448765.540473]  [<ffffffffbd2c6f05>] process_one_work+0x185/0x450
[1448765.541063]  [<ffffffffbd2c7bf6>] worker_thread+0x126/0x3c0
[1448765.541647]  [<ffffffffbd2c7ad0>] ? manage_workers.isra.26+0x2b0/0x2b0
[1448765.542236]  [<ffffffffbd2cf601>] kthread+0xd1/0xe0
[1448765.542827]  [<ffffffffbd2cf530>] ? create_kthread+0x60/0x60
[1448765.543403]  [<ffffffffbda0429d>] ret_from_fork_nospec_begin+0x7/0x21
[1448765.543974]  [<ffffffffbd2cf530>] ? create_kthread+0x60/0x60

Here is what I found from the crash dump:

  1. The wait_queue that scst_sync_ext_blocking_done() tries to wake up has been corrupted:
*(wait_queue_head_t *)0xffff8ce2da637dd8 = {
	.lock = (spinlock_t){
		.rlock = (struct raw_spinlock){
			.raw_lock = (arch_spinlock_t){
				.val = (atomic_t){
					.counter = (int)-1164386304,
				},
			},
		},
	},
	.task_list = (struct list_head){
		.next = (struct list_head *)0xffff8ce2da637e40,
		.prev = (struct list_head *)0x82,
	},
} 
  2. The waiting task, on whose stack the wait_queue was declared, is in the wrong context:
#0  schedule (kernel/sched/core.c:3896)
#1  do_wait (kernel/exit.c:1752)
#2  SYSC_waitid (kernel/exit.c:1802)
#3  SyS_waitid (kernel/exit.c:1762)
#4  system_call+0x16a/0x4fa (arch/x86/kernel/entry_64.S:508)
#5  0x40b5ae

whereas the correct context should look like this:

#0  schedule (kernel/sched/core.c:3896)
#1  scst_ext_block_dev (/usr/src/debug/scst-3.4.0.7477.1/scst/src/scst_lib.c:14976)
#2  scst_dev_block_store (/usr/src/debug/scst-3.4.0.7477.1/scst/src/scst_sysfs.c:3837)
#3  scst_store (/usr/src/debug/scst-3.4.0.7477.1/scst/src/scst_sysfs.c:817)
#4  sysfs_kf_write (fs/sysfs/file.c:115)
#5  kernfs_fop_write (fs/kernfs/file.c:308)
#6  vfs_write (fs/read_write.c:529)
#7  SYSC_write (fs/read_write.c:578)
#8  SyS_write (fs/read_write.c:570)
#9  system_call+0x16a/0x4fa (arch/x86/kernel/entry_64.S:508)
#10 0x40b5ae

It seems that the waiting task was woken up by a signal, returned from the waiting loop in wait_event_locked(), and exited kernel space before the wake_up call in scst_sync_ext_blocking_done() was made.
Below is a possible scenario:

  1. The waiting task sleeps on the wait queue declared on its own kernel stack.
  2. The waiting condition dev->on_dev_cmd_count == 0 becomes true, and dev->ext_blockers_work is scheduled but not yet executed.
  3. The waiting task receives a signal and is woken up by it. It finds that the waiting condition is true, so it breaks out of the wait loop and returns from kernel space. At this point, the data on its kernel stack is no longer valid, and the task may re-enter the kernel via a system call, overwriting the kernel stack left over from the previous context.
  4. The ext_blockers_work is executed and tries to wake up the wait queue that was declared on the waiting task's kernel stack. Although the wait queue pointer still points to a valid address, the contents at that address are no longer valid. wake_up tries to grab the corrupted spinlock, causing the hard lockup (a simplified sketch of this pattern follows).
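
To make steps 1-4 concrete, below is a minimal, self-contained sketch of this racy pattern. It is not the actual SCST code: my_dev, my_block_dev() and my_blocking_done_fn() are hypothetical names, and the mainline wait_event_interruptible_lock_irq() helper stands in for SCST's wait_event_locked().

#include <linux/kernel.h>
#include <linux/spinlock.h>
#include <linux/wait.h>
#include <linux/workqueue.h>

struct my_dev {
        spinlock_t lock;
        int on_dev_cmd_count;
        wait_queue_head_t *blocking_done_wq;    /* points into the waiter's stack */
        struct work_struct blockers_work;
};

/* Waiter: process context, e.g. a sysfs "block" store handler. */
static int my_block_dev(struct my_dev *dev)
{
        DECLARE_WAIT_QUEUE_HEAD_ONSTACK(done_wq);       /* lives in this stack frame */
        int ret;

        spin_lock_irq(&dev->lock);
        dev->blocking_done_wq = &done_wq;
        /*
         * Interruptible wait: a signal can end the sleep, and if the
         * condition already holds, this returns even though the work item
         * that will later call wake_up(&done_wq) is still queued.  From
         * that point on, &done_wq is a dangling pointer into a dead stack
         * frame.
         */
        ret = wait_event_interruptible_lock_irq(done_wq,
                        dev->on_dev_cmd_count == 0, dev->lock);
        spin_unlock_irq(&dev->lock);
        return ret;
}

/* Work item: may run after the waiter has already left the kernel. */
static void my_blocking_done_fn(struct work_struct *work)
{
        struct my_dev *dev = container_of(work, struct my_dev, blockers_work);

        /*
         * If the waiter is already gone, the waitqueue head (including the
         * spinlock embedded in it) now holds whatever the reused stack
         * contains, so this wake_up() can spin forever on a corrupted lock,
         * which is the hard lockup seen in the report.
         */
        wake_up(dev->blocking_done_wq);
}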

After analysing the code, it appears to me that an uninterruptible wait would be more appropriate in scst_ext_block_dev(). Is there any particular reason that an interruptible wait is used here?
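
For comparison, here is roughly what the uninterruptible variant of the hypothetical my_block_dev() above could look like, using the mainline wait_event_lock_irq() helper. It removes the early-return path of step 3: a signal can no longer end the wait, so only the wake_up() from the work item finishes it. The trade-off is that a stuck blocker would leave the task in uninterruptible (D) state.

/* Uninterruptible variant of the hypothetical my_block_dev() above. */
static int my_block_dev_uninterruptible(struct my_dev *dev)
{
        DECLARE_WAIT_QUEUE_HEAD_ONSTACK(done_wq);

        spin_lock_irq(&dev->lock);
        dev->blocking_done_wq = &done_wq;
        /*
         * No signal can abort this sleep; assuming the condition is false
         * when the wait starts, the waiter stays on this stack frame until
         * the work item's wake_up() has been delivered.
         */
        wait_event_lock_irq(done_wq, dev->on_dev_cmd_count == 0, dev->lock);
        spin_unlock_irq(&dev->lock);
        return 0;
}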

@kllk2320
Author

Though the issue happened in v3.4, I think it affects all versions.

@lnocturno
Contributor

Hi,

Thank you for the report and investigation!

I will try to look at the problem next week.

Thanks,
Gleb

lnocturno added a commit that referenced this issue Jun 16, 2023
This patch modifies scst_sync_ext_block_dev() to support INTERRUPTIBLE
waiting and handle signal-induced waiting cancellation. To achieve this,
the waitqueue head is moved from the stack and allocated with the blocker.
Additionally, reference counting and its management are added to the
blocker to handle memory freeing from multiple contexts.

Fixes: #164
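
For reference, here is a rough sketch of the general approach this commit message describes; it is not the actual patch, the names (my_blocker, my_blocker_release(), and so on) are hypothetical, and the real implementation is in PR #172. The waitqueue head is embedded in a heap-allocated, reference-counted blocker rather than the waiter's stack, so a late wake_up() from the work item touches memory that is still alive, and whichever context drops the last reference frees it.

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/slab.h>
#include <linux/wait.h>
#include <linux/workqueue.h>

struct my_blocker {
        struct kref refcount;
        wait_queue_head_t done_wq;      /* no longer on the waiter's stack */
        bool done;
        struct work_struct work;
};

static void my_blocker_release(struct kref *kref)
{
        kfree(container_of(kref, struct my_blocker, refcount));
}

/* Work item: holds its own reference, so it may safely run at any time. */
static void my_blocking_done_fn(struct work_struct *work)
{
        struct my_blocker *b = container_of(work, struct my_blocker, work);

        b->done = true;
        wake_up(&b->done_wq);
        kref_put(&b->refcount, my_blocker_release);     /* drop the work's reference */
}

static struct my_blocker *my_blocker_alloc(void)
{
        struct my_blocker *b = kzalloc(sizeof(*b), GFP_KERNEL);

        if (!b)
                return NULL;
        kref_init(&b->refcount);                /* reference owned by the waiter */
        init_waitqueue_head(&b->done_wq);
        INIT_WORK(&b->work, my_blocking_done_fn);
        return b;
}

/* Waiter: an interruptible wait is now safe because the blocker outlives it. */
static int my_block_dev(struct my_blocker *b)
{
        int ret;

        kref_get(&b->refcount);                 /* reference handed to the work item */
        /* ... queue b->work once the device is quiesced ... */
        ret = wait_event_interruptible(b->done_wq, b->done);
        kref_put(&b->refcount, my_blocker_release);     /* drop the waiter's reference */
        return ret;
}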
@lnocturno
Contributor

Hi,

I've created a PR (#172).
Could you retest the issue with these patches? You can just use the following branch: https://github.com/SCST-project/scst/tree/3.8/gleb/fix_hard_lockup

Thanks,
Gleb

lnocturno added a commit that referenced this issue Jun 20, 2023

lnocturno added a commit that referenced this issue Jun 20, 2023
@lnocturno
Contributor

Hi,

The PR (#172) has been merged into the master branch.
If you reproduce this problem with the master branch again, feel free to reopen the issue.

Thanks,
Gleb

@kllk2320
Author

Thanks for your work.
