
zpool import lock up with "zavl: loading out-of-tree module taints kernel" and "PANIC: zfs: allocating allocated segment(offset=108513914880 size=131072)" in logs #9353

Closed
tlvu opened this issue Sep 24, 2019 · 7 comments
Labels
Type: Question (Issue for discussion)

Comments

tlvu commented Sep 24, 2019

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 16.04 (Xenial)
Linux Kernel 4.4.0-154-generic
Architecture x86_64
ZFS Version v0.6.5.6-0ubuntu26
SPL Version v0.6.5.6-0ubuntu4

Describe the problem you're observing

zpool import <pool name> just never returns. The corresponding /var/log/syslog output is below.

Tried to completely uninstall ZFS (apt remove zfsutils-linux; apt autoremove) and re-install from scratch; same problem. I noticed that after re-installing zfsutils-linux the zfs kernel module was not loaded automatically, and I had to modprobe zfs to force-load it. Hmm, now that I think about it, should I have uninstalled, rebooted, then re-installed?
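
For reference, the uninstall / re-install / manual module load sequence described above would look roughly like this (a sketch based on the description, not an exact transcript):

$ sudo apt remove zfsutils-linux && sudo apt autoremove
$ sudo apt install zfsutils-linux
$ lsmod | grep zfs        # nothing: the module was not loaded automatically
$ sudo modprobe zfs       # force-load the module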

Tried booting with an older kernel that used to work; same problem.

Describe how to reproduce the problem

Not sure how to reproduce. Using a live USB of a newer Ubuntu (18.04), I installed zfsutils-linux and was able to zpool import fine. Then I ran zpool scrub and no problems were found.
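
Roughly, that live-USB check amounts to the following (a sketch based on the description; "tank" is the pool name that appears in later comments):

$ sudo apt install zfsutils-linux     # inside the 18.04 live session
$ sudo zpool import tank              # imports without hanging here
$ sudo zpool scrub tank
$ zpool status -v tank                # scrub finishes with no errors reported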

But if I boot my real system, this is 100% reproducible.

Prior to this event, I had been using ZFS on the same machine without any trouble for more than 2 years straight. I have zfs-auto-snapshot installed and configured ... just in case there is a relationship.

I have ZFS on other Ubuntu machines as well and have never seen this problem.

Given that ZFS refuses to work on my system, sending my pools elsewhere and destroying/recreating them won't help me, I think.

So I am stuck and don't know how to get ZFS working on my system again. Ubuntu 16.04 is old and I can re-install a newer version, but before I do I am really curious how to diagnose this weird problem.

For me, the biggest selling point of ZFS is how solidly it protects data from bit rot and corruption. But if ZFS suddenly dies like this and there is no "easy" way to bring it back, that solid reputation loses a few Brownie points.

Include any warning/errors/backtraces from the system logs

Logs in /var/log/syslog when I execute zpool import:

Sep 23 21:10:15 E6230 kernel: [  318.063545] zavl: loading out-of-tree module taints kernel.
Sep 23 21:10:15 E6230 kernel: [  318.063553] zavl: module license 'CDDL' taints kernel.
Sep 23 21:10:15 E6230 kernel: [  318.063556] Disabling lock debugging due to kernel taint
Sep 23 21:10:15 E6230 systemd-udevd[424]: unknown key 'SYSFS{idVendor}' in /etc/udev/rules.d/60-brother-brscan4-libsane-type1.rules:9
Sep 23 21:10:15 E6230 systemd-udevd[424]: invalid rule '/etc/udev/rules.d/60-brother-brscan4-libsane-type1.rules:9'
Sep 23 21:10:15 E6230 kernel: [  318.089190] SPL: Loaded module v0.6.5.6-0ubuntu4
Sep 23 21:10:15 E6230 kernel: [  318.168546] ZFS: Loaded module v0.6.5.6-0ubuntu26, ZFS pool version 5000, ZFS filesystem version 5
Sep 23 21:10:51 E6230 kernel: [  354.666779] SPL: The /etc/hostid file is not found.
Sep 23 21:10:51 E6230 kernel: [  354.666785] SPL: using hostid 0x00000000
Sep 23 21:10:52 E6230 kernel: [  355.091555] PANIC: zfs: allocating allocated segment(offset=108513914880 size=131072)
Sep 23 21:10:52 E6230 kernel: [  355.091555]
Sep 23 21:10:52 E6230 kernel: [  355.091559] Showing stack for process 28499
Sep 23 21:10:52 E6230 kernel: [  355.091561] CPU: 1 PID: 28499 Comm: z_wr_iss Tainted: P           O    4.4.0-154-generic #181-Ubuntu
Sep 23 21:10:52 E6230 kernel: [  355.091563] Hardware name: Dell Inc. Latitude E6230/0R6V5Y, BIOS A15 08/19/2015
Sep 23 21:10:52 E6230 kernel: [  355.091564]  0000000000000286 78feb98f63b8f309 ffff8800afc4f8b8 ffffffff8140b481
Sep 23 21:10:52 E6230 kernel: [  355.091567]  0000000000000003 0000000040000000 ffff8800afc4f8c8 ffffffffc0c4bd12
Sep 23 21:10:52 E6230 kernel: [  355.091569]  ffff8800afc4f9f0 ffffffffc0c4be7a ffffffffc0e6b6c8 6c6c61203a73667a
Sep 23 21:10:52 E6230 kernel: [  355.091571] Call Trace:
Sep 23 21:10:52 E6230 kernel: [  355.091577]  [<ffffffff8140b481>] dump_stack+0x63/0x82
Sep 23 21:10:52 E6230 kernel: [  355.091585]  [<ffffffffc0c4bd12>] spl_dumpstack+0x42/0x50 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091589]  [<ffffffffc0c4be7a>] vcmn_err+0x6a/0x100 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091617]  [<ffffffffc0cf8d9c>] ? arc_buf_eviction_needed+0x8c/0xd0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091633]  [<ffffffffc0cfe524>] ? dbuf_rele_and_unlock+0x2e4/0x3d0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091637]  [<ffffffffc0c483ab>] ? spl_kmem_cache_free+0x13b/0x1c0 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091663]  [<ffffffffc0d576cc>] zfs_panic_recover+0x6c/0x90 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091686]  [<ffffffffc0d3f8be>] range_tree_add+0x2ce/0x2e0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091703]  [<ffffffffc0d08f81>] ? dmu_read+0x131/0x190 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091727]  [<ffffffffc0d5951e>] space_map_load+0x37e/0x550 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091749]  [<ffffffffc0d3c2a6>] metaslab_load+0x36/0xd0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091770]  [<ffffffffc0d3c4a9>] metaslab_activate+0x89/0xb0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091790]  [<ffffffffc0d3dd15>] metaslab_alloc+0x5d5/0xbd0 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091794]  [<ffffffffc0c48493>] ? spl_kmem_cache_alloc+0x63/0x140 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091797]  [<ffffffff810b12c9>] ? ttwu_do_wakeup+0x19/0xf0
Sep 23 21:10:52 E6230 kernel: [  355.091799]  [<ffffffff810b143d>] ? ttwu_do_activate.constprop.87+0x5d/0x70
Sep 23 21:10:52 E6230 kernel: [  355.091826]  [<ffffffffc0da5b94>] zio_dva_allocate+0x94/0x400 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091831]  [<ffffffffc0c483ab>] ? spl_kmem_cache_free+0x13b/0x1c0 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091857]  [<ffffffffc0da1918>] ? zio_buf_free+0x58/0x60 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091861]  [<ffffffffc0c4916e>] ? taskq_member+0x4e/0x70 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091888]  [<ffffffffc0da352d>] zio_execute+0xcd/0x180 [zfs]
Sep 23 21:10:52 E6230 kernel: [  355.091892]  [<ffffffffc0c4a12b>] taskq_thread+0x22b/0x410 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091894]  [<ffffffff810b2330>] ? wake_up_q+0x70/0x70
Sep 23 21:10:52 E6230 kernel: [  355.091899]  [<ffffffffc0c49f00>] ? taskq_cancel_id+0x140/0x140 [spl]
Sep 23 21:10:52 E6230 kernel: [  355.091901]  [<ffffffff810a65c7>] kthread+0xe7/0x100
Sep 23 21:10:52 E6230 kernel: [  355.091904]  [<ffffffff8185ecc1>] ? __schedule+0x301/0x810
Sep 23 21:10:52 E6230 kernel: [  355.091906]  [<ffffffff810a64e0>] ? kthread_create_on_node+0x1e0/0x1e0
Sep 23 21:10:52 E6230 kernel: [  355.091909]  [<ffffffff81864025>] ret_from_fork+0x55/0x80
Sep 23 21:10:52 E6230 kernel: [  355.091910]  [<ffffffff810a64e0>] ? kthread_create_on_node+0x1e0/0x1e0
behlendorf added the Type: Question (Issue for discussion) label on Sep 24, 2019
behlendorf (Contributor) commented:

@tlvu based on the stack trace it looks like your system unfortunately hit a known bug in the older ZFS release included with Ubuntu 16.04. While it looks like Ubuntu 16.04 never got the fix for this, Ubuntu 18.04 includes a significantly newer version of ZFS which did. If possible I'd encourage you to upgrade your system.

You may be able to resolve the issue by importing the pool under 18.04, mounting all of the ZFS filesystems and volumes, then exporting it. This will allow ZFS to replay all of its internal logs using the updated code and leave the pool in a clean exported state. That should allow you to re-import it under 16.04. Sorry you got bit by this bug!
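
Roughly, that procedure translates to something like the following (a sketch, not exact instructions; "tank" stands for the pool name that appears in later comments):

# On the 18.04 system or live session:
$ sudo zpool import tank
$ sudo zfs mount -a        # make sure every dataset is mounted so its logs get replayed
$ sudo zpool export tank   # leaves the pool in a clean exported state

# Then back on 16.04:
$ sudo zpool import tank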

tlvu commented Sep 25, 2019

@behlendorf unfortunately running zpool export using a newer Ubuntu distro and then zpool import again on my problematic host did not work; the same errors show up again in syslog.

It feels like ZFS just plain does not want to run on my machine. It's not a problem with the data pool.

Anything else I can do to help debug this weirdness before giving up and just upgrading my system?

Can I have a link to this known issue? Maybe I discovered a new issue that is not that known issue?

behlendorf (Contributor) commented:

zpool export using a newer Ubuntu distro and then zpool import again on my problematic host did not work.

You'll want to additionally make sure all of the datasets get mounted, then export the pool. This ensures the per-dataset logs get replayed, which doesn't happen until after the import.

Anything else I can do to help debug

Thanks for offering, but I think we have everything we need. The issue I'm specifically thinking of is #6477, which was fixed in v0.7.2.
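
To check whether a given system is still running a release older than that fix, something along these lines works (a sketch; paths and package names are the stock Ubuntu ones):

$ cat /sys/module/zfs/version    # version of the loaded zfs kernel module
$ modinfo zfs | grep -iw version
$ dpkg -l zfsutils-linux         # version of the userland packages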

tlvu commented Sep 26, 2019

zpool export using a newer Ubuntu distro and then zpool import again on my problematic host did not work.

You'll want to additionally make sure all of the datasets get mounted, then export the pool. This ensures the per-dataset logs get replayed, which doesn't happen until after the import.

Confirmed that all the datasets of my pool were mounted after zpool import on the new Ubuntu, before zpool export.

Thanks for offering, but I think we have everything we need. The issue I'm specifically thinking of is #6477 which was fixed in v0.7.2.

I am not sure that's the same issue as mine.

I think mine has something to do with the tainted kernel; see the line zavl: module license 'CDDL' taints kernel. in the syslog extract in the description of this issue.

tlvu commented Nov 19, 2019

Sorry to resurrect an old thread, but I am a bit desperate here; I really don't want to lose data.

So I upgraded to Ubuntu Bionic 18.04. I was able to import my pool and used it for a few weeks. Then, out of the blue, today I am again unable to import my pool on boot.

Tried using an Ubuntu 19.04 live CD and was still unable to import my pool.

Error in /var/log/syslog when I manually run sudo zpool import tank on the Ubuntu 19.04 live CD:

Nov 19 01:28:58 regolith kernel: [  126.504742] PANIC: zfs: allocating allocated segment(offset=108513914880 size=131072)
Nov 19 01:28:58 regolith kernel: [  126.504742]
Nov 19 01:28:58 regolith kernel: [  126.504746] Showing stack for process 6015
Nov 19 01:28:58 regolith kernel: [  126.504748] CPU: 1 PID: 6015 Comm: z_wr_iss Tainted: P           O      5.0.0-15-generic #16-Ubuntu
Nov 19 01:28:58 regolith kernel: [  126.504749] Hardware name: Dell Inc. Latitude E6230/0R6V5Y, BIOS A15 08/19/2015
Nov 19 01:28:58 regolith kernel: [  126.504749] Call Trace:
Nov 19 01:28:58 regolith kernel: [  126.504756]  dump_stack+0x63/0x8a
Nov 19 01:28:58 regolith kernel: [  126.504764]  spl_dumpstack.cold.2+0x20/0x25 [spl]
Nov 19 01:28:58 regolith kernel: [  126.504768]  vcmn_err.cold.3+0x60/0x94 [spl]
Nov 19 01:28:58 regolith kernel: [  126.504799]  ? dbuf_rele_and_unlock+0x37c/0x500 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504828]  ? zio_destroy+0xbc/0xc0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504830]  ? _cond_resched+0x19/0x30
Nov 19 01:28:58 regolith kernel: [  126.504832]  ? kmem_cache_alloc+0x15f/0x1d0
Nov 19 01:28:58 regolith kernel: [  126.504836]  ? spl_kmem_cache_alloc+0x78/0x780 [spl]
Nov 19 01:28:58 regolith kernel: [  126.504864]  zfs_panic_recover+0x6f/0x90 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504891]  range_tree_add+0x29c/0x2f0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504914]  ? dnode_rele+0x39/0x40 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504942]  space_map_load+0x2bb/0x4f0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504968]  metaslab_load+0x36/0xf0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504993]  metaslab_activate+0x93/0xc0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.504995]  ? _cond_resched+0x19/0x30
Nov 19 01:28:58 regolith kernel: [  126.505020]  metaslab_alloc+0x451/0x1050 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.505048]  zio_dva_allocate+0xa0/0x560 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.505050]  ? mutex_lock+0x12/0x30
Nov 19 01:28:58 regolith kernel: [  126.505054]  ? tsd_hash_search.isra.5+0x47/0xa0 [spl]
Nov 19 01:28:58 regolith kernel: [  126.505057]  ? tsd_get_by_thread+0x2e/0x40 [spl]
Nov 19 01:28:58 regolith kernel: [  126.505061]  ? taskq_member+0x18/0x30 [spl]
Nov 19 01:28:58 regolith kernel: [  126.505088]  zio_execute+0x99/0xf0 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.505091]  taskq_thread+0x2ec/0x4d0 [spl]
Nov 19 01:28:58 regolith kernel: [  126.505093]  ? __switch_to_asm+0x40/0x70
Nov 19 01:28:58 regolith kernel: [  126.505095]  ? wake_up_q+0x80/0x80
Nov 19 01:28:58 regolith kernel: [  126.505122]  ? zio_taskq_member.isra.11.constprop.17+0x70/0x70 [zfs]
Nov 19 01:28:58 regolith kernel: [  126.505125]  kthread+0x120/0x140
Nov 19 01:28:58 regolith kernel: [  126.505128]  ? task_done+0xb0/0xb0 [spl]
Nov 19 01:28:58 regolith kernel: [  126.505130]  ? __kthread_parkme+0x70/0x70
Nov 19 01:28:58 regolith kernel: [  126.505131]  ret_from_fork+0x35/0x40

Is there a way to recover my pool? I have data in there that I really would not want to lose.

$ apt list |grep zfs |grep installed
libzfs2linux/disco,now 0.7.12-1ubuntu5 amd64 [installed,automatic]
zfs-zed/disco,now 0.7.12-1ubuntu5 amd64 [installed,automatic]
zfsutils-linux/disco,now 0.7.12-1ubuntu5 amd64 [installed]

stale bot commented Nov 18, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale (No recent activity for issue) label on Nov 18, 2020

tlvu commented Dec 10, 2020

Just for the record, importing the pool in read-only mode (zpool import -o readonly=on tank) worked and let me recover my data.
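
For anyone else hitting this panic, a minimal read-only recovery sketch along the lines of what worked here (assuming the default mountpoint /tank; the destination /mnt/backup is hypothetical; note that new snapshots cannot be created on a read-only pool, so copy the mounted files or send existing snapshots):

$ sudo zpool import -o readonly=on tank
$ zfs list -r -t filesystem tank            # confirm the datasets are visible and mounted
$ sudo rsync -aHAX /tank/ /mnt/backup/      # copy the data off to another disk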

The stale bot removed the Status: Stale (No recent activity for issue) label on Dec 10, 2020