I was compiling clang in a virtual machine with 2 GB of RAM and with swap on a 2 GB zvol when the system deadlocked. I was testing the patches from openzfs/zfs#726, but I do not believe this is an issue with those patches. My terminal to the VM lagged as memory_throttle_count in /proc/spl/kstat/zfs/arcstats rose rapidly. This issue occurs consistently, and I was able to capture the following excerpt from the dmesg log as the system started to lock up:
z_wr_int/10: page allocation failure: order:0, mode:0x20
Pid: 15298, comm: z_wr_int/10 Tainted: P O 3.2.12-gentoo #8
Call Trace:
[] warn_alloc_failed+0x108/0x11d
[] __alloc_pages_nodemask+0x615/0x669
[] cache_alloc_refill+0x274/0x4b7
[] kmem_cache_alloc+0x59/0x8b
[] scsi_pool_alloc_command+0x27/0x67
[] scsi_host_alloc_command+0x1c/0x68
[] __scsi_get_command+0x15/0x91
[] scsi_get_command+0x37/0xa3
[] scsi_setup_fs_cmnd+0x69/0xbd
[] sd_prep_fn+0x2bc/0x8e4
[] ? get_request+0x263/0x2ef
[] blk_peek_request+0xb8/0x1a6
[] ? part_round_stats+0x4b/0x52
[] scsi_request_fn+0x67/0x3e4
[] __blk_run_queue+0x16/0x18
[] blk_queue_bio+0x257/0x27b
[] generic_make_request+0x97/0xda
[] submit_bio+0xd2/0xdd
[] vdev_cache_stat_fini+0x3da/0x8ba [zfs]
[] vdev_cache_stat_fini+0x50f/0x8ba [zfs]
[] zio_interrupt+0x229/0x23c [zfs]
[] zio_execute+0xf3/0x28d [zfs]
[] vdev_queue_io_done+0xba/0x2d5e [zfs]
[] zio_buf_free+0x337/0x89d [zfs]
[] zio_execute+0xf3/0x28d [zfs]
[] ? default_spin_lock_flags+0x9/0xe
[] __taskq_dispatch+0x792/0x9e4 [spl]
[] ? try_to_wake_up+0x231/0x231
[] ? __taskq_dispatch+0x4bb/0x9e4 [spl]
[] kthread+0x7d/0x85
[] kernel_thread_helper+0x4/0x10
[] ? kthread_worker_fn+0x152/0x152
[] ? gs_change+0x13/0x13
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
CPU 4: hi: 0, btch: 1 usd: 0
CPU 5: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 179
CPU 1: hi: 186, btch: 31 usd: 173
CPU 2: hi: 186, btch: 31 usd: 156
CPU 3: hi: 186, btch: 31 usd: 19
CPU 4: hi: 186, btch: 31 usd: 169
CPU 5: hi: 186, btch: 31 usd: 5
active_anon:182187 inactive_anon:82354 isolated_anon:370
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:1286 unstable:0
free:62 slab_reclaimable:2144 slab_unreclaimable:11575
mapped:1 shmem:0 pagetables:1619 bounce:0
DMA free:0kB min:40kB low:48kB high:60kB active_anon:1524kB inactive_anon:4160kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):116kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:408kB mapped:0kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:320kB kernel_stack:0kB pagetables:28kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2004 2004 2004
DMA32 free:248kB min:5708kB low:7132kB high:8560kB active_anon:727224kB inactive_anon:325256kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):1364kB isolated(file):0kB present:2052308kB mlocked:0kB dirty:0kB writeback:4736kB mapped:4kB shmem:0kB slab_reclaimable:8560kB slab_unreclaimable:45980kB kernel_stack:2080kB pagetables:6448kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
DMA32: 42*4kB 10*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 248kB
1591 total pagecache pages
1591 pages in swap cache
Swap cache stats: add 30513, delete 28922, find 2394/2964
Free swap = 2009312kB
Total swap = 2097148kB
524269 pages RAM
9931 pages reserved
640 pages shared
491401 pages non-shared
z_wr_int/11: page allocation failure: order:0, mode:0x20
Pid: 15299, comm: z_wr_int/11 Tainted: P O 3.2.12-gentoo #8
Call Trace:
[] warn_alloc_failed+0x108/0x11d
[] __alloc_pages_nodemask+0x615/0x669
[] cache_alloc_refill+0x274/0x4b7
[] kmem_cache_alloc+0x59/0x8b
[] scsi_pool_alloc_command+0x27/0x67
[] scsi_host_alloc_command+0x1c/0x68
[] __scsi_get_command+0x15/0x91
[] scsi_get_command+0x37/0xa3
[] scsi_setup_fs_cmnd+0x69/0xbd
[] sd_prep_fn+0x2bc/0x8e4
[] ? get_request+0x263/0x2ef
[] blk_peek_request+0xb8/0x1a6
[] ? part_round_stats+0x4b/0x52
[] scsi_request_fn+0x67/0x3e4
[] __blk_run_queue+0x16/0x18
[] blk_queue_bio+0x257/0x27b
[] generic_make_request+0x97/0xda
[] submit_bio+0xd2/0xdd
[] vdev_cache_stat_fini+0x3da/0x8ba [zfs]
[] ? avl_insert+0xa6/0xa9 [zavl]
[] vdev_cache_stat_fini+0x50f/0x8ba [zfs]
[] zio_interrupt+0x229/0x23c [zfs]
[] zio_nowait+0x116/0xc9f [zfs]
[] ? wake_up_process+0x10/0x12
[] vdev_queue_io_done+0xa0/0x2d5e [zfs]
[] zio_buf_free+0x337/0x89d [zfs]
[] zio_execute+0xf3/0x28d [zfs]
[] ? default_spin_lock_flags+0x9/0xe
[] __taskq_dispatch+0x792/0x9e4 [spl]
[] ? try_to_wake_up+0x231/0x231
[] ? __taskq_dispatch+0x4bb/0x9e4 [spl]
[] kthread+0x7d/0x85
[] kernel_thread_helper+0x4/0x10
[] ? kthread_worker_fn+0x152/0x152
[] ? gs_change+0x13/0x13
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
CPU 4: hi: 0, btch: 1 usd: 0
CPU 5: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 185
CPU 1: hi: 186, btch: 31 usd: 173
CPU 2: hi: 186, btch: 31 usd: 156
CPU 3: hi: 186, btch: 31 usd: 19
CPU 4: hi: 186, btch: 31 usd: 169
CPU 5: hi: 186, btch: 31 usd: 5
active_anon:182187 inactive_anon:82354 isolated_anon:370
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:1286 unstable:0
free:194 slab_reclaimable:2144 slab_unreclaimable:11443
mapped:1 shmem:0 pagetables:1619 bounce:0
DMA free:32kB min:40kB low:48kB high:60kB active_anon:1524kB inactive_anon:4160kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):116kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:408kB mapped:0kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:288kB kernel_stack:0kB pagetables:28kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2004 2004 2004
DMA32 free:744kB min:5708kB low:7132kB high:8560kB active_anon:727224kB inactive_anon:325256kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):1364kB isolated(file):0kB present:2052308kB mlocked:0kB dirty:0kB writeback:4736kB mapped:4kB shmem:0kB slab_reclaimable:8560kB slab_unreclaimable:45484kB kernel_stack:2080kB pagetables:6448kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36kB
DMA32: 130*4kB 24*8kB 2*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 744kB
1591 total pagecache pages
1591 pages in swap cache
Swap cache stats: add 30513, delete 28922, find 2394/2964
Free swap = 2009312kB
Total swap = 2097148kB
524269 pages RAM
9931 pages reserved
640 pages shared
491262 pages non-shared
z_wr_int/11: page allocation failure: order:0, mode:0x20
Pid: 15299, comm: z_wr_int/11 Tainted: P O 3.2.12-gentoo #8
Call Trace:
[] warn_alloc_failed+0x108/0x11d
[] __alloc_pages_nodemask+0x615/0x669
[] cache_alloc_refill+0x274/0x4b7
[] kmem_cache_alloc+0x59/0x8b
[] scsi_pool_alloc_command+0x27/0x67
[] scsi_host_alloc_command+0x1c/0x68
[] __scsi_get_command+0x15/0x91
[] scsi_get_command+0x37/0xa3
[] scsi_setup_fs_cmnd+0x69/0xbd
[] sd_prep_fn+0x2bc/0x8e4
[] ? get_request+0x263/0x2ef
[] blk_peek_request+0xb8/0x1a6
[] ? part_round_stats+0x4b/0x52
[] scsi_request_fn+0x67/0x3e4
[] __blk_run_queue+0x16/0x18
[] blk_queue_bio+0x257/0x27b
[] generic_make_request+0x97/0xda
[] submit_bio+0xd2/0xdd
[] vdev_cache_stat_fini+0x3da/0x8ba [zfs]
[] ? avl_insert+0xa6/0xa9 [zavl]
[] vdev_cache_stat_fini+0x50f/0x8ba [zfs]
[] zio_interrupt+0x229/0x23c [zfs]
[] zio_nowait+0x116/0xc9f [zfs]
[] ? wake_up_process+0x10/0x12
[] vdev_queue_io_done+0xa0/0x2d5e [zfs]
[] zio_buf_free+0x337/0x89d [zfs]
[] zio_execute+0xf3/0x28d [zfs]
[] ? __wake_up+0x3f/0x48
[] __taskq_dispatch+0x792/0x9e4 [spl]
[] ? try_to_wake_up+0x231/0x231
[] ? __taskq_dispatch+0x4bb/0x9e4 [spl]
[] kthread+0x7d/0x85
[] kernel_thread_helper+0x4/0x10
[] ? kthread_worker_fn+0x152/0x152
[] ? gs_change+0x13/0x13
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
CPU 4: hi: 0, btch: 1 usd: 0
CPU 5: hi: 0, btch: 1 usd: 0
DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 185
CPU 1: hi: 186, btch: 31 usd: 173
CPU 2: hi: 186, btch: 31 usd: 156
CPU 3: hi: 186, btch: 31 usd: 19
CPU 4: hi: 186, btch: 31 usd: 169
CPU 5: hi: 186, btch: 31 usd: 5
active_anon:182187 inactive_anon:82354 isolated_anon:370
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:1286 unstable:0
free:194 slab_reclaimable:2144 slab_unreclaimable:11443
mapped:1 shmem:0 pagetables:1619 bounce:0
DMA free:32kB min:40kB low:48kB high:60kB active_anon:1524kB inactive_anon:4160kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):116kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:408kB mapped:0kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:288kB kernel_stack:0kB pagetables:28kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2004 2004 2004
DMA32 free:744kB min:5708kB low:7132kB high:8560kB active_anon:727224kB inactive_anon:325256kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):1364kB isolated(file):0kB present:2052308kB mlocked:0kB dirty:0kB writeback:4736kB mapped:4kB shmem:0kB slab_reclaimable:8560kB slab_unreclaimable:45484kB kernel_stack:2080kB pagetables:6448kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36kB
DMA32: 130*4kB 24*8kB 2*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 744kB
1591 total pagecache pages
1591 pages in swap cache
Swap cache stats: add 30513, delete 28922, find 2394/2964
Free swap = 2009312kB
Total swap = 2097148kB
524269 pages RAM
9931 pages reserved
640 pages shared
491262 pages non-shared
The page allocation failures show that ZONE_DMA is becoming exhausted. Note that mode:0x20 corresponds to GFP_ATOMIC on this kernel, so these are atomic allocations failing because the emergency reserves are gone. I believe the cause is the use of PF_MEMALLOC in ./module/spl/spl-kmem.c, and I think that solution needs to be revisited. The kernel bug tracker has a relevant patch:
https://bugzilla.kernel.org/show_bug.cgi?id=30702
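To make the failure mode concrete, below is a minimal sketch of the pattern at issue. It is not the verbatim spl-kmem.c code and the wrapper name is mine; it only illustrates how temporarily setting PF_MEMALLOC lets an allocation ignore the zone watermarks and dip into the emergency reserves that GFP_ATOMIC callers, such as the SCSI command allocations in the traces above, depend on:

```c
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/vmalloc.h>

/*
 * Illustrative only -- not the actual spl-kmem.c code.  Marking the
 * task PF_MEMALLOC tells the page allocator it may ignore the zone
 * watermarks.  That prevents recursion into reclaim while swapping
 * to a zvol, but with many z_wr_int threads doing this at once it
 * also drains the emergency reserves, including the tiny ZONE_DMA
 * pool, until unrelated GFP_ATOMIC allocations start failing.
 */
static void *
spl_vmalloc_nofs(unsigned long size)
{
	unsigned long pflags = current->flags;
	void *ptr;

	current->flags |= PF_MEMALLOC;
	ptr = __vmalloc(size, GFP_NOFS | __GFP_HIGHMEM, PAGE_KERNEL);

	/* Clear PF_MEMALLOC only if it was not already set on entry. */
	if (!(pflags & PF_MEMALLOC))
		current->flags &= ~PF_MEMALLOC;

	return ptr;
}
```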
It might be best to patch SPL's build system to detect kernels that carry the fix and disable the use of PF_MEMALLOC when it is present. It also seems necessary to push that patch into mainline. I plan to confirm that this fixes the deadlock, but I see no other way it could happen.
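SPL's configure script already probes kernel features by compiling small test programs against the kernel headers, so a check for the fix could follow that pattern. Since the final upstream patch may not export any obvious symbol, the sketch below is purely hypothetical: the PF_MEMALLOC_RESERVE_FIX marker is invented for illustration, and a real probe would key on whatever symbol, prototype, or config option the accepted patch actually introduces.

```c
/*
 * Hypothetical configure-time compile test, in the spirit of SPL's
 * existing build checks.  If this fails to compile, configure would
 * keep using PF_MEMALLOC; if it compiles, the kernel carries the fix
 * and PF_MEMALLOC can be disabled.  PF_MEMALLOC_RESERVE_FIX is an
 * invented placeholder, not a real kernel macro.
 */
#include <linux/sched.h>

void
spl_test_pf_memalloc_fix(void)
{
#ifndef PF_MEMALLOC_RESERVE_FIX
#error "kernel does not carry the PF_MEMALLOC reserve fix"
#endif
}
```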
I agree, patching the SPL to detect the change sounds reasonable to me. However, the hard work is going to be getting the patch accepted by the upstream maintainers. While the patch does work, the VM folks there never completely signed off on it.