Crash on scrub #8664
Comments
I can't be entirely certain based on the screenshot, but I believe this is due to a change in the latest 5.0 kernel related to cgroups. I'd suggest rolling back your kernel to 4.19 while we sort this out.
Still seeing similar behavior on 4.18 (tried 4.19 as well); the only difference now is that dmesg isn't spitting anything out. I'll go back to the 5.0 kernel and try to get better logs, but is there anything in particular that would be useful to look at? My knowledge of these subjects is not even a fraction of yours, so please bear with me!
I should also point out that booting into another system and accessing this pool as a non-root pool works fine. I need to double-check which kernel was running in that test... will try to get that ASAP. Also, scrubbing a different pool from the same 5.0 kernel works fine too. It's weird that it seems to be just this combination.
Then are you saying everything appears to work fine using the 5.0.9-1-ck kernel on Arch Linux as long as you're not using it as a root pool? That would be strange.
If possible, it would be great to get the first stack dump from the console logs.
Hey @behlendorf, that seems to be the case for me... I can scrub it fine with whatever kernel I throw at it, unless it's the root pool. I've been tempted to create a new pool from this dataset and see if the problem moves with the pool, but for the time being this seems like a good test case for a potential issue... not sure though? I was finally able to get more data from my setup by using a serial cable to dump things out... I hope this helps further. I should've mentioned earlier... thanks to you (and all the developers) for all you do... I love the project! Thanks for taking the time to look at this!
Hello, I have the same issue but with different kernel and ZFS versions. I run Gentoo. Until 2 weeks ago, my kernel was 4.14.52 with ZoL 0.7.6. My pools were:
2 weeks ago, I slightly upgraded my hardware and updated the kernel and ZoL. Now the kernel version is 4.19.27 with ZoL 0.7.13. My pools are:
The previous setup worked flawlessly for nearly a year without a reboot, scrubbing the root pool once per week. Since the update, I have no problem scrubbing the boot and data pools but, most of the time, the kernel hangs when I scrub the root pool, always without any message; I can still switch between ttys. Each time, I scrub the boot pool and then the root pool. After each crash, my MariaDB database was corrupted beyond recovery and I had to roll back a ZFS snapshot to restart it. Yesterday evening, after 1 week of uptime, the kernel hung when the root pool scrub started. I rebooted my computer, removed the data pool's ZIL and cache devices, and was able to scrub the root pool without problem. This morning, I re-added the data pool's ZIL and cache devices and still encounter no problem scrubbing the root pool. Server activity is quite low. Hardware is a Xeon L5640 (6c/12t) with 72GB of ECC RAM. At the moment, more than 32GB of RAM is free and the CPU load average is roughly 0.5. Pool capacities are:
NVMe, SSD and HD temperatures are OK. The NVMe drives stay below 47°C during a scrub, the SSDs below 23°C and the HDs below 30°C. No hardware problem seems to exist. Thanks for all your work.
0.7.5 with kernel 4.15. While resuming from S3 sleep, the kernel randomly freezes. Any connection here?
It's difficult to tell right away without seeing any logs. The issue initially reported seems to have to do with memory allocation during the scrub. Do you have any log info?
I also think so, as scrubbing root on a freshly booted computer works for me. Regarding my previous post, I confirm the problem: I disabled root scrubbing and since then my server has not crashed. I do not use suspend/hibernate on my server.
I have the same issue on my 2 servers. Both are using root pools. The pool is newly created. When I run a scrub right after reboot, it runs fine. After some uptime, the issue happens when I run a scrub. Here are the logs (journalctl -f and dmesg -wT) from when the scrub issue happens. The hardware is 100% fine.
Does the same behavior occur if you scrub the pool as a non-root pool? Curious to see if this is tied to it being a root fs as well.
I will try to create an additional non-root pool, do some tests, and write my results here.
@behlendorf FWIW, I do have an interesting data point for this... not sure if it's helpful. If I initiate the scrub from my initramfs, before pivoting to the real root, the scrub behaves normally and proceeds OK. From that point on, I can boot into the kernel and the scrub continues without issue.
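For readers who want to try the same experiment, here is a rough sketch of starting the scrub from the initramfs on a dracut-based system. The pool name rpool, the break point, and whether the rest of the boot continues cleanly all depend on your particular initramfs setup; they are assumptions, not part of the original report.

# Boot with the kernel parameter rd.break=pre-mount to get a shell
# in the initramfs before the root filesystem is mounted, then:
zpool import -N rpool    # import the root pool without mounting its datasets (pool name assumed)
zpool scrub rpool        # start the scrub before the real root is mounted
exit                     # leave the break shell and let the boot continue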
Can you provide specs for your server? CPU and RAM? I've only got 8GB on my machine and, to the point that @behlendorf made about cgroups, it seems suspect that maybe I just don't have enough RAM to initiate the scrub in the root environment. I did try downgrading to older kernels without success, but I'm still wondering if memory restrictions are at play here.
This is my test server: a small J4205 with 8GB of RAM, running the latest Proxmox VE 6.0.5 with ZFS 0.8.1 and zfs_arc_max set to 3GB. The problem began with the new kernel.
Hi, there is now a long thread on the Proxmox forum about this kind of issue. In short:
On the forum, wasteground posted a kernel log of the situation. He wrote this:
@pongraczi thank you for posting the logs. I believe I see what's causing the reported hangs. As a workaround you can try setting zfs_vdev_scheduler=none.
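For reference, a minimal sketch of applying this workaround, assuming the parameter in question is zfs_vdev_scheduler (as suggested by the before/after output later in this thread); the paths and the initramfs-rebuild command are examples, not prescriptions:

# Apply at runtime, if the parameter is writable on your build
echo none > /sys/module/zfs/parameters/zfs_vdev_scheduler

# Make it persistent across reboots via module options
echo "options zfs zfs_vdev_scheduler=none" >> /etc/modprobe.d/zfs.conf
# On root-on-ZFS systems the initramfs usually has to be regenerated
# afterwards, e.g. update-initramfs -u on Debian/Proxmox.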
@behlendorf thank you. I posted it on the Proxmox forum; I expect some people there have such systems and will test it soon.
Just for the record, there are scheduler checks from before and after the change on the Proxmox forum, and it seems,
One of my systems was not affected, even though it is a remote backup server reached over the internet with really slow bandwidth. That one has only hard disks as vdevs; only the log/ZIL devices are on SSD. Please check the Proxmox forum for details, as other users have posted their configs.
Example of an affected config:

root@lm1:~# zpool status
pool: rpool
state: ONLINE
scan: scrub repaired 0B in 0 days 01:08:43 with 0 errors on Sun Jul 14 01:32:44 2019
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sde2 ONLINE 0 0 0
sdf2 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
ata-Samsung_SSD_850_PRO_1TB_S252NXAG805442N ONLINE 0 0 0
ata-Samsung_SSD_850_PRO_1TB_S252NXAG805106B ONLINE 0 0 0
errors: No known data errors
pool: zbackup
state: ONLINE
scan: scrub repaired 0B in 1 days 03:32:58 with 0 errors on Mon Jul 15 03:57:00 2019
config:
NAME STATE READ WRITE CKSUM
zbackup ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M3UZC1F1 ONLINE 0 0 0
ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M7ZESTA2 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M7ZEST6D ONLINE 0 0 0
ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M6AVE83Y ONLINE 0 0 0
errors: No known data errors

The following is the same before and after changing zfs_vdev_scheduler:

root@lm1:~# for i in a b c d e f g h; do cat /sys/block/sd$i/queue/scheduler; done
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none

Before:
root@lm1:~# cat /sys/module/zfs/parameters/zfs_vdev_scheduler
noop

After:
root@lm1:~# cat /sys/module/zfs/parameters/zfs_vdev_scheduler
none
Originally the zfs_vdev_elevator module option was added as a convenience so the requested elevator would be automatically set on the underlying block devices. At the time this was simple because the kernel provided an API function which did exactly this. This API was then removed in the Linux 4.12 kernel, which prompted us to add compatibility code to set the elevator via a usermodehelper.

While well intentioned, this can result in a system hang if the usermodehelper does not return, because it may be called with the spa config lock held as a writer, for example via the zfs_ioc_pool_scan() -> spa_scan() -> spa_scan() -> vdev_reopen() call path. Additionally, there's a reasonable argument that setting the elevator is the responsibility of whatever software is being used to configure the system. Therefore, it's proposed that zfs_vdev_elevator be removed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#8664
Thanks for verifying the proposed workaround. For those interested, this is caused by an interaction between the scrub, which attempts to reopen the devices, and the compatibility code required to set the requested scheduler on those devices (the zfs_vdev_scheduler module option). I've opened #9317, which proposes to remove this option entirely.
Originally the zfs_vdev_elevator module option was added as a convenience so the requested elevator would be automatically set on the underlying block devices. At the time this was simple because the kernel provided an API function which did exactly this. This API was then removed in the Linux 4.12 kernel, which prompted us to add compatibility code to set the elevator via a usermodehelper.

Unfortunately, changing the elevator via a usermodehelper requires reading some userland binaries, most notably modprobe(8) or sh(1), from a ZFS dataset on systems with root-on-ZFS. This can deadlock the system if used during the following call path, because it may need to read directly from disk, if the data is not already cached in the ARC, while holding the spa config lock as a writer:

zfs_ioc_pool_scan() -> spa_scan() -> spa_scan() -> vdev_reopen() -> vdev_elevator_switch() -> call_usermodehelper()

While the usermodehelper waits, sh(1)/modprobe(8) is blocked in the ZIO pipeline trying to read from disk:

INFO: task modprobe:2650 blocked for more than 10 seconds.
Tainted: P OE 5.2.14
modprobe D 0 2650 206 0x00000000
Call Trace:
? __schedule+0x244/0x5f0
schedule+0x2f/0xa0
cv_wait_common+0x156/0x290 [spl]
? do_wait_intr_irq+0xb0/0xb0
spa_config_enter+0x13b/0x1e0 [zfs]
zio_vdev_io_start+0x51d/0x590 [zfs]
? tsd_get_by_thread+0x3b/0x80 [spl]
zio_nowait+0x142/0x2f0 [zfs]
arc_read+0xb2d/0x19d0 [zfs]
...
zpl_iter_read+0xfa/0x170 [zfs]
new_sync_read+0x124/0x1b0
vfs_read+0x91/0x140
ksys_read+0x59/0xd0
do_syscall_64+0x4f/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9

This commit changes how we use the usermodehelper functionality from synchronous (UMH_WAIT_PROC) to asynchronous (UMH_NO_WAIT), which prevents scrubs, and other vdev_elevator_switch() consumers, from triggering the aforementioned issue.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Issue #8664 Closes #9321
@behlendorf maybe you could have a look at my logs at https://forum.proxmox.com/threads/proxmox-v6-servers-freeze-zvol-blocked-for-more-than-120s.57765/post-267351 just to confirm your solution fits my case too? That would make me feel a lot better! Thanks.
Originally the zfs_vdev_elevator module option was added as a convenience so the requested elevator would be automatically set on the underlying block devices. At the time this was simple because the kernel provided an API function which did exactly this. This API was then removed in the Linux 4.12 kernel, which prompted us to add compatibility code to set the elevator via a usermodehelper. While well intentioned, this introduced a bug which could cause a system hang; that issue was subsequently fixed by commit 2a0d418.

In order to avoid future bugs in this area, and to simplify the code, this functionality is being deprecated. A console warning has been added to notify any existing consumers, and the documentation has been updated accordingly. This option will remain for the lifetime of the 0.8.x series for compatibility but is planned to be phased out of master.

Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #8664 Closes #9317
As described in commit f81d5ef the zfs_vdev_elevator module option is being removed. Users who require this functionality should update their systems to set the disk scheduler using a udev rule. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #8664 Closes #9417 Closes #9609
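As an illustration of such a udev rule (the file name and the device match are examples only; adjust them for your disks):

# /etc/udev/rules.d/66-io-scheduler.rules (illustrative path)
cat > /etc/udev/rules.d/66-io-scheduler.rules <<'EOF'
# Set the "none" scheduler on whole SATA/SAS disks when they appear
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="none"
EOF
udevadm control --reload-rules
udevadm trigger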
System information
Describe the problem you're observing
I have a ZFS pool running as my root filesystem, and upon starting a scrub on the pool, the system gets a kernel panic and the tty freezes up.
Describe how to reproduce the problem
Run zpool scrub on the pool.
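For reference, a minimal reproduction sketch, assuming the root pool is named zroot as in the status output below:

zpool scrub zroot      # start the scrub on the root pool
zpool status zroot     # check scrub progress
dmesg -wT              # follow kernel messages in another terminal to catch the hang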
Include any warning/errors/backtraces from the system logs
pool: zroot
state: ONLINE
scan: scrub paused since Tue Apr 23 20:09:21 2019
scrub started on Wed Feb 20 22:25:20 2019
0B scanned, 0B issued, 1.80T total
0B repaired, 0.00% done
config:
I'm trying to get a proper log, but wanted to at least share a picture that I was able to take... the system is not currently set up to save kernel panic info to a file, so once I can get more info on that, I will post it.