Kernel panic in shrinker_to_text -> strlen #731

Closed
g2p opened this issue Aug 22, 2024 · 3 comments
Comments

@g2p
Contributor

g2p commented Aug 22, 2024

Could shrinker_to_text end up being called with a NULL shrinker? I don't see exactly how, though.

CONFIG_SHRINKER_DEBUG was off.

From 1d875e4 (bcachefs-testing).

<4>[ 3422.835918] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.835934] ? strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835944] ? seq_buf_puts (lib/seq_buf.c:186) 
<4>[ 3422.835954] shrinker_to_text (mm/shrinker.c:829) 
<4>[ 3422.835967] shrinkers_to_text (mm/shrinker.c:897 (discriminator 1)) 
<4>[ 3422.835976] ? prb_read_valid (kernel/printk/printk_ringbuffer.c:2183) 
<4>[ 3422.835985] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.835995] ? console_unlock (kernel/printk/printk.c:3137 (discriminator 1)) 
<4>[ 3422.836018] __show_mem (./include/linux/seq_buf.h:100 mm/show_mem.c:490) 
<4>[ 3422.836030] dump_header (mm/oom_kill.c:445 (discriminator 1)) 
<4>[ 3422.836040] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.836049] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836061] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.833539] gcc invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
<4>[ 3422.833570] CPU: 9 UID: 1000 PID: 81403 Comm: gcc Tainted: G        W   E      6.11.0-rc4-g2p #57
<4>[ 3422.833589] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
<4>[ 3422.833600] Hardware name: To Be Filled By O.E.M. X570 Phantom Gaming 4/X570 Phantom Gaming 4, BIOS P5.61 02/22/2024
<4>[ 3422.833619] Call Trace:
<4>[ 3422.833627]  <TASK>
<4>[ 3422.833637] dump_stack_lvl (lib/dump_stack.c:122) 
<4>[ 3422.833652] dump_stack (lib/dump_stack.c:129) 
<4>[ 3422.833662] dump_header (mm/oom_kill.c:74 mm/oom_kill.c:442) 
<4>[ 3422.833676] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.833687] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833702] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.833717] __alloc_pages_noprof (mm/page_alloc.c:3609 mm/page_alloc.c:4371 mm/page_alloc.c:4708) 
<4>[ 3422.833734] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.833760] alloc_pages_mpol_noprof (mm/mempolicy.c:2265 (discriminator 1)) 
<4>[ 3422.833773] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.833787] alloc_pages_noprof (mm/mempolicy.c:2345) 
<4>[ 3422.833800] pte_alloc_one (./include/asm-generic/pgalloc.h:71 arch/x86/mm/pgtable.c:33) 
<4>[ 3422.833812] __do_fault (mm/memory.c:4650 (discriminator 1)) 
<4>[ 3422.833825] do_fault (mm/memory.c:5091 mm/memory.c:5193) 
<4>[ 3422.833836] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833850] __handle_mm_fault (mm/memory.c:3947 mm/memory.c:5521 mm/memory.c:5664) 
<4>[ 3422.833859] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833869] ? lock_release (./include/trace/events/lock.h:69 (discriminator 2) kernel/locking/lockdep.c:5770 (discriminator 2)) 
<4>[ 3422.833890] handle_mm_fault (mm/memory.c:5832) 
<4>[ 3422.833903] do_user_addr_fault (arch/x86/mm/fault.c:1389) 
<4>[ 3422.833919] exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.833933] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.833945] RIP: 0010:rep_stos_alternative (arch/x86/lib/clear_page_64.S:96) 
<4>[ 3422.833958] Code: ff c7 48 ff c9 75 f6 e9 ce fd 0c 00 48 89 07 48 83 c7 08 83 e9 08 74 ef 83 f9 08 73 ef eb de 66 66 2e 0f 1f 84 00 00 00 00 00 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
All code
========
   0:	ff c7                	inc    %edi
   2:	48 ff c9             	dec    %rcx
   5:	75 f6                	jne    0xfffffffffffffffd
   7:	e9 ce fd 0c 00       	jmp    0xcfdda
   c:	48 89 07             	mov    %rax,(%rdi)
   f:	48 83 c7 08          	add    $0x8,%rdi
  13:	83 e9 08             	sub    $0x8,%ecx
  16:	74 ef                	je     0x7
  18:	83 f9 08             	cmp    $0x8,%ecx
  1b:	73 ef                	jae    0xc
  1d:	eb de                	jmp    0xfffffffffffffffd
  1f:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  26:	00 00 00 00 
  2a:*	48 89 07             	mov    %rax,(%rdi)		<-- trapping instruction
  2d:	48 89 47 08          	mov    %rax,0x8(%rdi)
  31:	48 89 47 10          	mov    %rax,0x10(%rdi)
  35:	48 89 47 18          	mov    %rax,0x18(%rdi)
  39:	48 89 47 20          	mov    %rax,0x20(%rdi)
  3d:	48                   	rex.W
  3e:	89                   	.byte 0x89
  3f:	47                   	rex.RXB

Code starting with the faulting instruction
===========================================
   0:	48 89 07             	mov    %rax,(%rdi)
   3:	48 89 47 08          	mov    %rax,0x8(%rdi)
   7:	48 89 47 10          	mov    %rax,0x10(%rdi)
   b:	48 89 47 18          	mov    %rax,0x18(%rdi)
   f:	48 89 47 20          	mov    %rax,0x20(%rdi)
  13:	48                   	rex.W
  14:	89                   	.byte 0x89
  15:	47                   	rex.RXB
<4>[ 3422.833991] RSP: 0018:ffffb4c842f47d00 EFLAGS: 00050202
<4>[ 3422.834004] RAX: 0000000000000000 RBX: 00007f33d0ab8104 RCX: 0000000000000efc
<4>[ 3422.834019] RDX: 00007f33d0ab5360 RSI: 00000000000000a5 RDI: 00007f33d0ab8104
<4>[ 3422.834033] RBP: ffffb4c842f47d40 R08: 00007f33d0ab5000 R09: 0000000000000000
<4>[ 3422.834047] R10: 0000000000000000 R11: 0000000000000000 R12: ffff95d3d87764a8
<4>[ 3422.834061] R13: 0000000000000003 R14: 00007f33d0ab82d8 R15: 0000000000000104
<4>[ 3422.834085] ? elf_load (./arch/x86/include/asm/smap.h:33 ./arch/x86/include/asm/uaccess_64.h:181 ./arch/x86/include/asm/uaccess_64.h:189 fs/binfmt_elf.c:125 fs/binfmt_elf.c:421) 
<4>[ 3422.834100] load_elf_binary (fs/binfmt_elf.c:679 (discriminator 2) fs/binfmt_elf.c:1235 (discriminator 2)) 
<4>[ 3422.834123] bprm_execve (fs/exec.c:1829 fs/exec.c:1869 fs/exec.c:1920 fs/exec.c:1896) 
<4>[ 3422.834140] do_execveat_common.isra.0 (fs/exec.c:2027) 
<4>[ 3422.834156] __x64_sys_execve (fs/exec.c:2172) 
<4>[ 3422.834168] x64_sys_call (arch/x86/entry/syscall_64.c:36) 
<4>[ 3422.834181] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
<4>[ 3422.834191] ? exc_page_fault (arch/x86/mm/fault.c:1543) 
<4>[ 3422.834205] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
<4>[ 3422.834217] RIP: 0033:0x7fb0476eef3b
<4>[ 3422.834231] Code: Unable to access opcode bytes at 0x7fb0476eef11.

Code starting with the faulting instruction
===========================================
<4>[ 3422.834243] RSP: 002b:00007fff1d690918 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
<4>[ 3422.834260] RAX: ffffffffffffffda RBX: 00005592d2d6b550 RCX: 00007fb0476eef3b
<4>[ 3422.834274] RDX: 00005592d2d6ae18 RSI: 00005592d2d6aa30 RDI: 00005592d2d6b550
<4>[ 3422.834288] RBP: 00007fff1d6909f0 R08: 00000000000007f0 R09: 00000000000003f0
<4>[ 3422.834302] R10: 00007fb047803ac0 R11: 0000000000000246 R12: 00005592d2d6aa30
<4>[ 3422.834316] R13: 00005592a5e90004 R14: 0000000000000002 R15: 00005592d2d69120
<4>[ 3422.834341]  </TASK>
<4>[ 3422.834348] Mem-Info:
<4>[ 3422.834356] active_anon:595 inactive_anon:1731 isolated_anon:0
<4>[ 3422.834356]  active_file:1302 inactive_file:1550 isolated_file:0
<4>[ 3422.834356]  unevictable:52 dirty:0 writeback:0
<4>[ 3422.834356]  slab_reclaimable:16140 slab_unreclaimable:3065596
<4>[ 3422.834356]  mapped:1388 shmem:150 pagetables:4070
<4>[ 3422.834356]  sec_pagetables:0 bounce:0
<4>[ 3422.834356]  kernel_misc_reclaimable:0
<4>[ 3422.834356]  free:61244 free_pcp:252 free_cma:0
<4>[ 3422.834424] Node 0 active_anon:2380kB inactive_anon:6924kB active_file:5208kB inactive_file:6200kB unevictable:208kB isolated(anon):0kB isolated(file):0kB mapped:5552kB dirty:0kB writeback:0kB shmem:600kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:10352kB pagetables:16280kB sec_pagetables:0kB all_unreclaimable? no
<4>[ 3422.834478] Node 0 DMA free:9612kB boost:0kB min:60kB low:72kB high:84kB reserved_highatomic:0KB active_anon:100kB inactive_anon:24kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15368kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834527] lowmem_reserve[]: 0 2904 15838 0 0
<4>[ 3422.834548] Node 0 DMA32 free:78296kB boost:0kB min:12380kB low:15472kB high:18564kB reserved_highatomic:30720KB active_anon:88kB inactive_anon:60kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3064752kB managed:2998772kB mlocked:0kB bounce:0kB free_pcp:984kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834599] lowmem_reserve[]: 0 0 12933 0 0
<4>[ 3422.834618] Node 0 Normal free:157068kB boost:73728kB min:128864kB low:142648kB high:156432kB reserved_highatomic:49152KB active_anon:2364kB inactive_anon:6880kB active_file:5232kB inactive_file:6896kB unevictable:208kB writepending:0kB present:13618688kB managed:13251192kB mlocked:68kB bounce:0kB free_pcp:92kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834672] lowmem_reserve[]: 0 0 0 0 0
<4>[ 3422.834691] Node 0 DMA: 16*4kB (M) 10*8kB (UM) 8*16kB (M) 10*32kB (M) 5*64kB (M) 6*128kB (M) 5*256kB (UM) 1*512kB (M) 2*1024kB (UM) 2*2048kB (M) 0*4096kB = 9616kB
<4>[ 3422.834758] Node 0 DMA32: 75*4kB (U) 87*8kB (UM) 84*16kB (UM) 487*32kB (UM) 401*64kB (UM) 272*128kB (UM) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 78660kB
<4>[ 3422.834820] Node 0 Normal: 3039*4kB (UME) 2320*8kB (UME) 1449*16kB (UME) 1152*32kB (UME) 574*64kB (UME) 232*128kB (UME) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 157196kB
<4>[ 3422.834881] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
<4>[ 3422.834899] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
<4>[ 3422.834916] 4205 total pagecache pages
<4>[ 3422.834925] 863 pages in swap cache
<4>[ 3422.834934] Free swap  = 14962428kB
<4>[ 3422.834943] Total swap = 16777212kB
<4>[ 3422.834952] 4174859 pages RAM
<4>[ 3422.834960] 0 pages HighMem/MovableOnly
<4>[ 3422.834969] 108526 pages reserved
<4>[ 3422.834977] 0 pages hwpoisoned
<4>[ 3422.834986] Unreclaimable slab info:
<5>[ 3422.835469] kmalloc-rnd-06-8k total: 456 MiB active: 456 MiB
<5>[ 3422.835471] kmalloc-rnd-11-1k total: 82.1 MiB active: 82.1 MiB
<5>[ 3422.835472] kernfs_node_cache total: 11.6 MiB active: 11.6 MiB
<5>[ 3422.835474] page->ptl         total: 7.29 MiB active: 2.68 MiB
<5>[ 3422.835475] kmalloc-rnd-10-512 total: 6.84 MiB active: 6.84 MiB
<5>[ 3422.835476] task_struct       total: 6.84 MiB active: 6.84 MiB
<5>[ 3422.835478] shmem_inode_cache total: 6.29 MiB active: 6.29 MiB
<5>[ 3422.835479] vm_area_struct    total: 3.10 MiB active: 2.94 MiB
<5>[ 3422.835480] vma_lock          total: 2.86 MiB active: 2.71 MiB
<5>[ 3422.835482] filp              total: 2.63 MiB active: 2.46 MiB
<5>[ 3422.835483]
<4>[ 3422.835571] Shrinkers:
<1>[ 3422.835608] BUG: kernel NULL pointer dereference, address: 0000000000000000
<1>[ 3422.835619] #PF: supervisor read access in kernel mode
<1>[ 3422.835629] #PF: error_code(0x0000) - not-present page
<6>[ 3422.835638] PGD 12c909067 P4D 12c909067 PUD 12c90a067 PMD 0
<4>[ 3422.835654] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 3422.835664] CPU: 9 UID: 1000 PID: 81403 Comm: gcc Tainted: G        W   E      6.11.0-rc4-g2p #57
<4>[ 3422.835680] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
<4>[ 3422.835689] Hardware name: To Be Filled By O.E.M. X570 Phantom Gaming 4/X570 Phantom Gaming 4, BIOS P5.61 02/22/2024
<4>[ 3422.835705] RIP: 0010:strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835714] Code: f7 75 ec 31 c0 31 d2 31 f6 31 ff e9 56 e2 0d 00 48 89 f8 31 d2 31 f6 31 ff e9 48 e2 0d 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <80> 3f 00 74 16 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff
All code
========
   0:	f7 75 ec             	divl   -0x14(%rbp)
   3:	31 c0                	xor    %eax,%eax
   5:	31 d2                	xor    %edx,%edx
   7:	31 f6                	xor    %esi,%esi
   9:	31 ff                	xor    %edi,%edi
   b:	e9 56 e2 0d 00       	jmp    0xde266
  10:	48 89 f8             	mov    %rdi,%rax
  13:	31 d2                	xor    %edx,%edx
  15:	31 f6                	xor    %esi,%esi
  17:	31 ff                	xor    %edi,%edi
  19:	e9 48 e2 0d 00       	jmp    0xde266
  1e:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
  25:	00 
  26:	f3 0f 1e fa          	endbr64
  2a:*	80 3f 00             	cmpb   $0x0,(%rdi)		<-- trapping instruction
  2d:	74 16                	je     0x45
  2f:	48 89 f8             	mov    %rdi,%rax
  32:	48 83 c0 01          	add    $0x1,%rax
  36:	80 38 00             	cmpb   $0x0,(%rax)
  39:	75 f7                	jne    0x32
  3b:	48 29 f8             	sub    %rdi,%rax
  3e:	31 ff                	xor    %edi,%edi

Code starting with the faulting instruction
===========================================
   0:	80 3f 00             	cmpb   $0x0,(%rdi)
   3:	74 16                	je     0x1b
   5:	48 89 f8             	mov    %rdi,%rax
   8:	48 83 c0 01          	add    $0x1,%rax
   c:	80 38 00             	cmpb   $0x0,(%rax)
   f:	75 f7                	jne    0x8
  11:	48 29 f8             	sub    %rdi,%rax
  14:	31 ff                	xor    %edi,%edi
<4>[ 3422.835741] RSP: 0018:ffffb4c842f475a8 EFLAGS: 00010246
<4>[ 3422.835751] RAX: 0000000000000000 RBX: ffffb4c842f47770 RCX: fffffffffffffffe
<4>[ 3422.835763] RDX: ffffb4c842f47680 RSI: 0000000000000000 RDI: 0000000000000000
<4>[ 3422.835775] RBP: ffffb4c842f475c8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 3422.835786] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb4c842f47770
<4>[ 3422.835798] R13: 0000000000000000 R14: 20c49ba5e353f7cf R15: 0000000000000008
<4>[ 3422.835809] FS:  0000000000000000(0000) GS:ffff95d6ee600000(0000) knlGS:0000000000000000
<4>[ 3422.835823] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3422.835833] CR2: 0000000000000000 CR3: 000000021ba66000 CR4: 0000000000350ef0
<4>[ 3422.835845] Call Trace:
<4>[ 3422.835851]  <TASK>
<4>[ 3422.835857] ? show_regs (arch/x86/kernel/dumpstack.c:479) 
<4>[ 3422.835867] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
<4>[ 3422.835876] ? page_fault_oops (arch/x86/mm/fault.c:715) 
<4>[ 3422.835893] ? do_user_addr_fault (arch/x86/mm/fault.c:1236) 
<4>[ 3422.835906] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.835918] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.835934] ? strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835944] ? seq_buf_puts (lib/seq_buf.c:186) 
<4>[ 3422.835954] shrinker_to_text (mm/shrinker.c:829) 
<4>[ 3422.835967] shrinkers_to_text (mm/shrinker.c:897 (discriminator 1)) 
<4>[ 3422.835976] ? prb_read_valid (kernel/printk/printk_ringbuffer.c:2183) 
<4>[ 3422.835985] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.835995] ? console_unlock (kernel/printk/printk.c:3137 (discriminator 1)) 
<4>[ 3422.836018] __show_mem (./include/linux/seq_buf.h:100 mm/show_mem.c:490) 
<4>[ 3422.836030] dump_header (mm/oom_kill.c:445 (discriminator 1)) 
<4>[ 3422.836040] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.836049] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836061] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.836073] __alloc_pages_noprof (mm/page_alloc.c:3609 mm/page_alloc.c:4371 mm/page_alloc.c:4708) 
<4>[ 3422.836086] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.836105] alloc_pages_mpol_noprof (mm/mempolicy.c:2265 (discriminator 1)) 
<4>[ 3422.836116] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.836127] alloc_pages_noprof (mm/mempolicy.c:2345) 
<4>[ 3422.836138] pte_alloc_one (./include/asm-generic/pgalloc.h:71 arch/x86/mm/pgtable.c:33) 
<4>[ 3422.836147] __do_fault (mm/memory.c:4650 (discriminator 1)) 
<4>[ 3422.836158] do_fault (mm/memory.c:5091 mm/memory.c:5193) 
<4>[ 3422.836166] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836178] __handle_mm_fault (mm/memory.c:3947 mm/memory.c:5521 mm/memory.c:5664) 
<4>[ 3422.836187] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836196] ? lock_release (./include/trace/events/lock.h:69 (discriminator 2) kernel/locking/lockdep.c:5770 (discriminator 2)) 
<4>[ 3422.836216] handle_mm_fault (mm/memory.c:5832) 
<4>[ 3422.836227] do_user_addr_fault (arch/x86/mm/fault.c:1389) 
<4>[ 3422.836241] exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.836252] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.836261] RIP: 0010:rep_stos_alternative (arch/x86/lib/clear_page_64.S:96) 
<4>[ 3422.836271] Code: ff c7 48 ff c9 75 f6 e9 ce fd 0c 00 48 89 07 48 83 c7 08 83 e9 08 74 ef 83 f9 08 73 ef eb de 66 66 2e 0f 1f 84 00 00 00 00 00 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
All code
========
   0:	ff c7                	inc    %edi
   2:	48 ff c9             	dec    %rcx
   5:	75 f6                	jne    0xfffffffffffffffd
   7:	e9 ce fd 0c 00       	jmp    0xcfdda
   c:	48 89 07             	mov    %rax,(%rdi)
   f:	48 83 c7 08          	add    $0x8,%rdi
  13:	83 e9 08             	sub    $0x8,%ecx
  16:	74 ef                	je     0x7
  18:	83 f9 08             	cmp    $0x8,%ecx
  1b:	73 ef                	jae    0xc
  1d:	eb de                	jmp    0xfffffffffffffffd
  1f:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  26:	00 00 00 00 
  2a:*	48 89 07             	mov    %rax,(%rdi)		<-- trapping instruction
  2d:	48 89 47 08          	mov    %rax,0x8(%rdi)
  31:	48 89 47 10          	mov    %rax,0x10(%rdi)
  35:	48 89 47 18          	mov    %rax,0x18(%rdi)
  39:	48 89 47 20          	mov    %rax,0x20(%rdi)
  3d:	48                   	rex.W
  3e:	89                   	.byte 0x89
  3f:	47                   	rex.RXB

Code starting with the faulting instruction
===========================================
   0:	48 89 07             	mov    %rax,(%rdi)
   3:	48 89 47 08          	mov    %rax,0x8(%rdi)
   7:	48 89 47 10          	mov    %rax,0x10(%rdi)
   b:	48 89 47 18          	mov    %rax,0x18(%rdi)
   f:	48 89 47 20          	mov    %rax,0x20(%rdi)
  13:	48                   	rex.W
  14:	89                   	.byte 0x89
  15:	47                   	rex.RXB
<4>[ 3422.836298] RSP: 0018:ffffb4c842f47d00 EFLAGS: 00050202
<4>[ 3422.836308] RAX: 0000000000000000 RBX: 00007f33d0ab8104 RCX: 0000000000000efc
<4>[ 3422.836320] RDX: 00007f33d0ab5360 RSI: 00000000000000a5 RDI: 00007f33d0ab8104
<4>[ 3422.836331] RBP: ffffb4c842f47d40 R08: 00007f33d0ab5000 R09: 0000000000000000
<4>[ 3422.836343] R10: 0000000000000000 R11: 0000000000000000 R12: ffff95d3d87764a8
<4>[ 3422.836354] R13: 0000000000000003 R14: 00007f33d0ab82d8 R15: 0000000000000104
<4>[ 3422.836374] ? elf_load (./arch/x86/include/asm/smap.h:33 ./arch/x86/include/asm/uaccess_64.h:181 ./arch/x86/include/asm/uaccess_64.h:189 fs/binfmt_elf.c:125 fs/binfmt_elf.c:421) 
<4>[ 3422.836386] load_elf_binary (fs/binfmt_elf.c:679 (discriminator 2) fs/binfmt_elf.c:1235 (discriminator 2)) 
<4>[ 3422.836405] bprm_execve (fs/exec.c:1829 fs/exec.c:1869 fs/exec.c:1920 fs/exec.c:1896) 
<4>[ 3422.836418] do_execveat_common.isra.0 (fs/exec.c:2027) 
<4>[ 3422.836431] __x64_sys_execve (fs/exec.c:2172) 
<4>[ 3422.836441] x64_sys_call (arch/x86/entry/syscall_64.c:36) 
<4>[ 3422.836451] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
<4>[ 3422.836459] ? exc_page_fault (arch/x86/mm/fault.c:1543) 
<4>[ 3422.836471] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
<4>[ 3422.836481] RIP: 0033:0x7fb0476eef3b
<4>[ 3422.836490] Code: Unable to access opcode bytes at 0x7fb0476eef11.

Code starting with the faulting instruction
===========================================
<4>[ 3422.836501] RSP: 002b:00007fff1d690918 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
<4>[ 3422.836515] RAX: ffffffffffffffda RBX: 00005592d2d6b550 RCX: 00007fb0476eef3b
<4>[ 3422.836526] RDX: 00005592d2d6ae18 RSI: 00005592d2d6aa30 RDI: 00005592d2d6b550
<4>[ 3422.836538] RBP: 00007fff1d6909f0 R08: 00000000000007f0 R09: 00000000000003f0
<4>[ 3422.836549] R10: 00007fb047803ac0 R11: 0000000000000246 R12: 00005592d2d6aa30
<4>[ 3422.836561] R13: 00005592a5e90004 R14: 0000000000000002 R15: 00005592d2d69120
<4>[ 3422.836581]  </TASK>
<4>[ 3422.836586] Modules linked in: cpuid(E) simpledrm(E) drm_shmem_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm_kms_helper(E) fb(E) input_leds(E) snd_seq_dummy(E) snd_hrtimer(E) xfs(E) dm_crypt(E) cmac(E) ccm(E) kyber_iosched(E) nls_utf8(E) wireguard(E) curve25519_x86_64(E) libcurve25519_generic(E) libchacha20poly1305(E) chacha_x86_64(E) poly1305_x86_64(E) ip6_udp_tunnel(E) udp_tunnel(E) ip6t_REJECT(E) nf_reject_ipv6(E) xt_hl(E) ip6t_rt(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_multiport(E) xt_recent(E) nft_limit(E) xt_limit(E) xt_addrtype(E) xt_tcpudp(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_compat(E) nf_tables(E) binfmt_misc(E) btrfs(E) blake2b_generic(E) nls_iso8859_1(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_scodec_component(E) intel_rapl_msr(E) snd_hda_intel(E) snd_intel_dspcfg(E) intel_rapl_common(E) snd_hda_codec(E) kvm_amd(E) snd_hwdep(E) snd_hda_core(E) kvm(E) snd_pcm(E) iwlmvm(E) snd_seq(E) snd_seq_device(E) rapl(E)
<4>[ 3422.836679]  wmi_bmof(E) snd_timer(E) mac80211(E) snd(E) soundcore(E) libarc4(E) i2c_piix4(E) k10temp(E) i2c_smbus(E) iwlwifi(E) cfg80211(E) wmi(E) mac_hid(E) auth_rpcgss(E) drm(E) sunrpc(E) drm_panel_orientation_quirks(E) efi_pstore(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) bcache(E) crct10dif_pclmul(E) crc32_pclmul(E) polyval_clmulni(E) bridge(E) polyval_generic(E) ghash_clmulni_intel(E) pata_acpi(E) stp(E) llc(E) sha512_ssse3(E) nvme(E) sha256_ssse3(E) ahci(E) igb(E) sha1_ssse3(E) xhci_pci(E) i2c_algo_bit(E) xhci_pci_renesas(E) nvme_core(E) ccp(E) libahci(E) dca(E) pata_jmicron(E) nvme_auth(E) dm_mirror(E) dm_region_hash(E) dm_log(E) msr(E) autofs4(E) aesni_intel(E) crypto_simd(E) cryptd(E)
<4>[ 3422.836948] CR2: 0000000000000000
<4>[ 3422.836956] ---[ end trace 0000000000000000 ]---
<4>[ 3422.845269] workqueue: cache_lookup [bcache] hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
<4>[ 3422.990630] RIP: 0010:strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.990651] Code: f7 75 ec 31 c0 31 d2 31 f6 31 ff e9 56 e2 0d 00 48 89 f8 31 d2 31 f6 31 ff e9 48 e2 0d 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <80> 3f 00 74 16 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff
All code
========
   0:	f7 75 ec             	divl   -0x14(%rbp)
   3:	31 c0                	xor    %eax,%eax
   5:	31 d2                	xor    %edx,%edx
   7:	31 f6                	xor    %esi,%esi
   9:	31 ff                	xor    %edi,%edi
   b:	e9 56 e2 0d 00       	jmp    0xde266
  10:	48 89 f8             	mov    %rdi,%rax
  13:	31 d2                	xor    %edx,%edx
  15:	31 f6                	xor    %esi,%esi
  17:	31 ff                	xor    %edi,%edi
  19:	e9 48 e2 0d 00       	jmp    0xde266
  1e:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
  25:	00 
  26:	f3 0f 1e fa          	endbr64
  2a:*	80 3f 00             	cmpb   $0x0,(%rdi)		<-- trapping instruction
  2d:	74 16                	je     0x45
  2f:	48 89 f8             	mov    %rdi,%rax
  32:	48 83 c0 01          	add    $0x1,%rax
  36:	80 38 00             	cmpb   $0x0,(%rax)
  39:	75 f7                	jne    0x32
  3b:	48 29 f8             	sub    %rdi,%rax
  3e:	31 ff                	xor    %edi,%edi

Code starting with the faulting instruction
===========================================
   0:	80 3f 00             	cmpb   $0x0,(%rdi)
   3:	74 16                	je     0x1b
   5:	48 89 f8             	mov    %rdi,%rax
   8:	48 83 c0 01          	add    $0x1,%rax
   c:	80 38 00             	cmpb   $0x0,(%rax)
   f:	75 f7                	jne    0x8
  11:	48 29 f8             	sub    %rdi,%rax
  14:	31 ff                	xor    %edi,%edi
<4>[ 3422.990678] RSP: 0018:ffffb4c842f475a8 EFLAGS: 00010246
<4>[ 3422.990690] RAX: 0000000000000000 RBX: ffffb4c842f47770 RCX: fffffffffffffffe
<4>[ 3422.990702] RDX: ffffb4c842f47680 RSI: 0000000000000000 RDI: 0000000000000000
<4>[ 3422.990714] RBP: ffffb4c842f475c8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 3422.990725] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb4c842f47770
<4>[ 3422.990737] R13: 0000000000000000 R14: 20c49ba5e353f7cf R15: 0000000000000008
<4>[ 3422.990749] FS:  0000000000000000(0000) GS:ffff95d6ee600000(0000) knlGS:0000000000000000
<4>[ 3422.990763] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3422.990773] CR2: 00007fb0476eef11 CR3: 000000021ba66000 CR4: 0000000000350ef0
<0>[ 3422.990785] Kernel panic - not syncing: Fatal exception
<0>[ 3422.991792] Kernel Offset: 0x16400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[ 3423.147299] Rebooting in 45 seconds..
@g2p g2p changed the title from "Crash in shrinker_to_text -> strlen" to "Kernel panic in shrinker_to_text -> strlen" on Aug 24, 2024
@jpsollie
Contributor

I fail to see what the link with bcachefs is. The mm developers may be interested, though, since rc4 is a testing kernel.
Try bugzilla.kernel.org.

@g2p
Contributor Author

g2p commented Aug 25, 2024

This is on bcachefs-testing, which added changes to the OOM hook to display the top ten shrinkers. I suspect that with some configs some of the top-ten slots are empty, causing the crash.

This does not happen on rc4 without those commits near the top of bcachefs-testing.
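
For illustration only, here is a minimal user-space sketch of the failure mode suspected above. It is not the code from the bcachefs-testing patches. The oops shows strlen faulting on address 0 (RDI and CR2 are both zero), so the string pointer that reached seq_buf_puts was NULL. The sketch assumes a fixed-size top-ten array whose unused slots stay NULL and whose entries may lack a name; `demo_shrinker`, `demo_top_to_text` and `TOP_N` are hypothetical names, not kernel identifiers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOP_N 10	/* hypothetical size of the "top ten" table */

/* Hypothetical stand-in for a shrinker; only the fields this sketch needs. */
struct demo_shrinker {
	const char *name;	/* assumed to be possibly NULL */
	unsigned long objects;
};

/*
 * Write one line per populated slot into buf and return how many slots
 * were printed. Empty slots are skipped, and a missing name is replaced
 * by a placeholder rather than being passed on to a string function.
 */
static int demo_top_to_text(char *buf, size_t len,
			    struct demo_shrinker *const top[TOP_N])
{
	size_t used = 0;
	int printed = 0;

	if (len)
		buf[0] = '\0';

	for (int i = 0; i < TOP_N; i++) {
		const struct demo_shrinker *s = top[i];
		int n;

		if (!s)		/* fewer than TOP_N entries were collected */
			continue;

		n = snprintf(buf + used, len - used, "%s: %lu objects\n",
			     s->name ? s->name : "(unnamed)", s->objects);
		if (n < 0 || (size_t)n >= len - used) {
			if (used < len)
				buf[used] = '\0';	/* drop the partial line */
			break;
		}
		used += (size_t)n;
		printed++;
	}
	return printed;
}
```

Without the two checks in the loop, an empty slot or an entry with no name would hand a NULL pointer to the formatting code, which matches the shape of the fault in the trace. Whether the real patches hit the empty-slot case or the missing-name case is not established here.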

@g2p
Contributor Author

g2p commented Oct 14, 2024

Closing since those patches aren't in bcachefs-testing at the moment. There was some review feedback that might be relevant to the crash: https://lore.kernel.org/all/Zs6aRZrjqPXQue6r@dread.disaster.area/

@g2p g2p closed this as completed Oct 14, 2024