VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2) #6812

Closed
youzhongyang opened this issue Nov 2, 2017 · 8 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@youzhongyang
Contributor

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 17.04
Linux Kernel 4.13.0-16-generic
Architecture x86_64
ZFS Version 0.7.2-1
SPL Version 0.7.2-1

Describe the problem you're observing

spl_panic() caused the zfs process to hang forever. 'reboot' also hung, so the machine had to be power reset. The same issue also occurred on CentOS 7.4.1708.
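
For context, error 2 is ENOENT, so the panic means the znode being reclaimed no longer had an entry in the filesystem's unlinked-set ZAP object. A paraphrased sketch of the check that fires at zfs_dir.c:724, reconstructed from the panic message rather than copied from the OpenZFS source:

/*
 * Paraphrased sketch of the assertion in zfs_rmnode() (zfs_dir.c:724);
 * not the verbatim OpenZFS source.  zp is the znode being reclaimed and
 * zfsvfs->z_unlinkedobj is the per-filesystem "unlinked set" ZAP.  A
 * return of 2 (ENOENT) from zap_remove_int() means the entry is already
 * gone, which trips the VERIFY and ends up in spl_panic().
 */
VERIFY3U(0, ==, zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj,
    zp->z_id, tx));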

Describe how to reproduce the problem

I don't know of a simple way to reproduce the issue.

Include any warning/errors/backtraces from the system logs

root     11549  2036  0 08:55 ?        00:00:00  \_ /usr/sbin/CRON -f
root     11551 11549  0 08:55 ?        00:00:00      \_ /bin/sh -c env PATH=/usr/sbin:/usr/bin:/bin:/sbin /opt/admin/bin/clean_replication >> /var/log/clean_replication.log 2>&1
root     11561 11551  0 08:55 ?        00:00:00          \_ /usr/bin/perl -w /opt/admin/bin/clean_replication
root     11596 11561  0 08:55 ?        00:00:00              \_ sh -c (systemctl stop recursive_hsreplicate ; sleep 10 ; zfs destroy -r -R clusters/replica/Bmain_PP@Bmain_PP.742966.pass ; systemctl start recursive_hsreplicate) 2>&1
root     11597 11596  0 08:55 ?        00:00:00                  \_ sh -c (systemctl stop recursive_hsreplicate ; sleep 10 ; zfs destroy -r -R clusters/replica/Bmain_PP@Bmain_PP.742966.pass ; systemctl start recursive_hsreplicate) 2>&1
root     11624 11597  0 08:55 ?        00:00:00                      \_ zfs destroy -r -R clusters/replica/Bmain_PP Bmain_PP.742966.pass
root     12845 11624  0 08:55 ?        00:00:01                          \_ /bin/umount -t zfs /replica/Bmain.744194.Bmain_PP.742966.pass

root      6947     1  0 Oct18 ?        00:01:35 python /opt/graphite_mon_redis/filer_throttles/runcfsyscallmonitors.py -c /opt/graphite_mon_redis/filer_throttles/syscallmonitorlog.conf
snmp      6959     1  0 Oct18 ?        00:07:19 /usr/sbin/snmpd -Lsd -Lf /dev/null -u snmp -g snmp -I -smux mteTrigger mteTriggerConf -p /run/snmpd.pid
root      7753     1  1 Oct18 ?        06:07:23 python /opt/graphite_mon_redis/filer_throttles/runclusterthrottles.py -c /opt/graphite_mon_redis/filer_throttles/throttlelog.conf
root      7754     1  0 Oct18 ?        00:01:01 python /opt/graphite_mon_redis/filer_throttles/runclusterthrottleapi.py -c /opt/graphite_mon_redis/filer_throttles/apilog.conf
root      4783     1  0 Oct18 ?        00:01:00 /usr/bin/python /opt/batfsrest/bin/batfsrest.py
root     21327     1  0 Oct25 ?        00:00:04 /usr/bin/python -u /opt/batfsrest/bin/hsreplicate.py
root     21336 21327  0 Oct25 ?        00:02:09  \_ /usr/bin/python -u /opt/batfsrest/bin/repslave.py Bmain_PP
root     14322 21336  0 08:58 ?        00:00:00      \_ zfs recv -Fv clusters/replica/Bmain_PP
root     14323 21336  0 08:58 ?        00:00:00      \_ /opt/batfsrest/bin/lz4 -qq -d /vmgr/zlogs/Bmain_PP/Bmain_PP.zlog000078761_ts1509627495.lz4 -
root     16217     1  0 09:55 ?        00:00:00 /usr/bin/python -u /opt/batfsrest/bin/recursive_hsreplicate.py
root     16233 16217  0 09:55 ?        00:00:00  \_ /usr/bin/python -u /opt/batfsrest/bin/recursive_init_slave.py Bmain
root     16260 16233  0 09:55 ?        00:00:00      \_ /bin/csh /opt/batfsrest/scripts/recursive_hs_install/create_slave.csh Bmain.747831.Bmain_PP.746084.pass Bmain
root     16261 16260  0 09:55 ?        00:00:00          \_ zfs create clusters/vmgr/zlogs/Bmain.747831.Bmain_PP.746084.pass

root@batfs9940:~# cat /proc/11624/stack
[<ffffffff93289386>] do_wait+0x1c6/0x240
[<ffffffff9328a7b9>] kernel_wait4+0x89/0x130
[<ffffffff9328a8f5>] SYSC_wait4+0x95/0xa0
[<ffffffff9328a9ae>] SyS_wait4+0xe/0x10
[<ffffffff93b10bfb>] entry_SYSCALL_64_fastpath+0x1e/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff

root@batfs9940:~# cat /proc/12845/stack
[<ffffffffc0c649ba>] spl_panic+0xfa/0x110 [spl]
[<ffffffffc15dbd04>] zfs_rmnode+0x304/0x340 [zfs]
[<ffffffffc1601700>] zfs_zinactive+0xd0/0xe0 [zfs]
[<ffffffffc15f9089>] zfs_inactive+0x79/0x210 [zfs]
[<ffffffffc1613473>] zpl_evict_inode+0x43/0x60 [zfs]
[<ffffffff93472817>] evict+0xc7/0x1a0
[<ffffffff93472929>] dispose_list+0x39/0x50
[<ffffffff93472a99>] evict_inodes+0x159/0x1a0
[<ffffffff93455274>] generic_shutdown_super+0x44/0x120
[<ffffffff934555d2>] kill_anon_super+0x12/0x20
[<ffffffffc161358a>] zpl_kill_sb+0x1a/0x20 [zfs]
[<ffffffff93455893>] deactivate_locked_super+0x43/0x70
[<ffffffff93455dae>] deactivate_super+0x4e/0x60
[<ffffffff93477a6f>] cleanup_mnt+0x3f/0x80
[<ffffffff93477af2>] __cleanup_mnt+0x12/0x20
[<ffffffff932a4e70>] task_work_run+0x80/0xa0
[<ffffffff932031c4>] exit_to_usermode_loop+0xc4/0xd0
[<ffffffff93203a19>] syscall_return_slowpath+0x59/0x60
[<ffffffff93b10c84>] entry_SYSCALL_64_fastpath+0xa7/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff

root@batfs9940:~# cat /proc/14322/stack
[<ffffffffc0c6696e>] cv_wait_common+0x11e/0x140 [spl]
[<ffffffffc0c669a5>] __cv_wait+0x15/0x20 [spl]
[<ffffffffc15b4d9d>] txg_wait_synced+0xdd/0x130 [zfs]
[<ffffffffc158cea4>] dsl_sync_task+0x184/0x270 [zfs]
[<ffffffffc1567c1c>] dmu_recv_begin+0x16c/0x1c0 [zfs]
[<ffffffffc15e73c1>] zfs_ioc_recv_impl+0xe1/0x10e0 [zfs]
[<ffffffffc15e85b8>] zfs_ioc_recv+0x1f8/0x330 [zfs]
[<ffffffffc15e5b44>] zfsdev_ioctl+0x5d4/0x660 [zfs]
[<ffffffff93468d93>] do_vfs_ioctl+0xa3/0x610
[<ffffffff93469379>] SyS_ioctl+0x79/0x90
[<ffffffff93b10bfb>] entry_SYSCALL_64_fastpath+0x1e/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff

root@batfs9940:~# cat /proc/16261/stack
[<ffffffffc0c6696e>] cv_wait_common+0x11e/0x140 [spl]
[<ffffffffc0c669a5>] __cv_wait+0x15/0x20 [spl]
[<ffffffffc15b4d9d>] txg_wait_synced+0xdd/0x130 [zfs]
[<ffffffffc158cea4>] dsl_sync_task+0x184/0x270 [zfs]
[<ffffffffc155ccd3>] dmu_objset_create+0x63/0x80 [zfs]
[<ffffffffc15e918d>] zfs_ioc_create+0x17d/0x3d0 [zfs]
[<ffffffffc15e577b>] zfsdev_ioctl+0x20b/0x660 [zfs]
[<ffffffff93468d93>] do_vfs_ioctl+0xa3/0x610
[<ffffffff93469379>] SyS_ioctl+0x79/0x90
[<ffffffff93b10bfb>] entry_SYSCALL_64_fastpath+0x1e/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff

Nov  2 08:55:18 batfs9940 kernel: [1275912.384210] VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)
Nov  2 08:55:18 batfs9940 kernel: [1275912.384251] PANIC at zfs_dir.c:724:zfs_rmnode()
Nov  2 08:55:18 batfs9940 kernel: [1275912.384268] Showing stack for process 12845
Nov  2 08:55:18 batfs9940 kernel: [1275912.384271] CPU: 11 PID: 12845 Comm: umount Tainted: P           OE   4.13.0-16-generic #19
Nov  2 08:55:18 batfs9940 kernel: [1275912.384271] Hardware name: Supermicro X8DTU-6+/X8DTU-6+, BIOS 2.1b       11/15/2011
Nov  2 08:55:18 batfs9940 kernel: [1275912.384272] Call Trace:
Nov  2 08:55:18 batfs9940 kernel: [1275912.384280]  dump_stack+0x63/0x8b
Nov  2 08:55:18 batfs9940 kernel: [1275912.384293]  spl_dumpstack+0x42/0x50 [spl]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384296]  spl_panic+0xc8/0x110 [spl]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384461]  ? dmu_buf_rele+0x36/0x40 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384498]  ? zap_unlockdir+0x3f/0x50 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384535]  ? zap_remove_norm+0x76/0xa0 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384572]  ? zap_remove+0x13/0x20 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384608]  zfs_rmnode+0x304/0x340 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384645]  ? zfs_znode_hold_exit+0xf9/0x130 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384681]  zfs_zinactive+0xd0/0xe0 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384718]  zfs_inactive+0x79/0x210 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384720]  ? truncate_pagecache+0x5a/0x70
Nov  2 08:55:18 batfs9940 kernel: [1275912.384757]  zpl_evict_inode+0x43/0x60 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384759]  evict+0xc7/0x1a0
Nov  2 08:55:18 batfs9940 kernel: [1275912.384761]  dispose_list+0x39/0x50
Nov  2 08:55:18 batfs9940 kernel: [1275912.384762]  evict_inodes+0x159/0x1a0
Nov  2 08:55:18 batfs9940 kernel: [1275912.384764]  generic_shutdown_super+0x44/0x120
Nov  2 08:55:18 batfs9940 kernel: [1275912.384765]  kill_anon_super+0x12/0x20
Nov  2 08:55:18 batfs9940 kernel: [1275912.384801]  zpl_kill_sb+0x1a/0x20 [zfs]
Nov  2 08:55:18 batfs9940 kernel: [1275912.384803]  deactivate_locked_super+0x43/0x70
Nov  2 08:55:18 batfs9940 kernel: [1275912.384804]  deactivate_super+0x4e/0x60
Nov  2 08:55:18 batfs9940 kernel: [1275912.384806]  cleanup_mnt+0x3f/0x80
Nov  2 08:55:18 batfs9940 kernel: [1275912.384808]  __cleanup_mnt+0x12/0x20
Nov  2 08:55:18 batfs9940 kernel: [1275912.384812]  task_work_run+0x80/0xa0
Nov  2 08:55:18 batfs9940 kernel: [1275912.384815]  exit_to_usermode_loop+0xc4/0xd0
Nov  2 08:55:18 batfs9940 kernel: [1275912.384816]  syscall_return_slowpath+0x59/0x60
Nov  2 08:55:18 batfs9940 kernel: [1275912.384818]  entry_SYSCALL_64_fastpath+0xa7/0xa9
Nov  2 08:55:18 batfs9940 kernel: [1275912.384819] RIP: 0033:0x7f55ac4adde7
Nov  2 08:55:18 batfs9940 kernel: [1275912.384820] RSP: 002b:00007ffc12d0e428 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Nov  2 08:55:18 batfs9940 kernel: [1275912.384821] RAX: 0000000000000000 RBX: 00005596a7bf0030 RCX: 00007f55ac4adde7
Nov  2 08:55:18 batfs9940 kernel: [1275912.384822] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005596a7bf3b80
Nov  2 08:55:18 batfs9940 kernel: [1275912.384823] RBP: 00005596a7bf3b80 R08: 00005596a7bf3b10 R09: 0000000000000013
Nov  2 08:55:18 batfs9940 kernel: [1275912.384823] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f55ac9b5844
Nov  2 08:55:18 batfs9940 kernel: [1275912.384824] R13: 0000000000000000 R14: 00005596a7bf0210 R15: 00007ffc12d0e6c0
@array42-zz

I also got this error, on a filesystem with deduplication enabled. A scrub was running, and a rollback of one filesystem was also in progress, so there was some baseline load on the pool; otherwise only normal, light disk activity. The removal of one large file then took a very long time. Notably, the file I was removing was also in use the last time the kernel panicked, so maybe there is some sort of corruption tied to that specific file. All other processes finished, but the rm process is stuck; the stack trace is below:

rm -v large-vm-image-with-lots-of-duplicated-blocks

 kernel:[1524182.764781] VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)

 kernel:[1524182.764811] PANIC at zfs_dir.c:724:zfs_rmnode()

And dmesg:

[1524182.764781] VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)
[1524182.764811] PANIC at zfs_dir.c:724:zfs_rmnode()
[1524182.764820] Showing stack for process 15254
[1524182.764823] CPU: 10 PID: 15254 Comm: z_iput Tainted: P          IO    4.13.13-5-pve #1
...
[1524182.764826] Call Trace:
[1524182.764836]  dump_stack+0x63/0x8b
[1524182.764853]  spl_dumpstack+0x42/0x50 [spl]
[1524182.764860]  spl_panic+0xc8/0x110 [spl]
[1524182.764962]  ? dmu_buf_rele+0x36/0x40 [zfs]
[1524182.765021]  ? zap_unlockdir+0x3f/0x50 [zfs]
[1524182.765079]  ? zap_remove_norm+0x76/0xa0 [zfs]
[1524182.765137]  ? zap_remove+0x13/0x20 [zfs]
[1524182.765195]  zfs_rmnode+0x304/0x340 [zfs]
[1524182.765252]  ? zfs_znode_hold_exit+0xf9/0x130 [zfs]
[1524182.765310]  zfs_zinactive+0xd0/0xe0 [zfs]
[1524182.765368]  zfs_inactive+0x79/0x210 [zfs]
[1524182.765372]  ? truncate_pagecache+0x5a/0x70
[1524182.765427]  zpl_evict_inode+0x43/0x60 [zfs]
[1524182.765431]  evict+0xc7/0x1a0
[1524182.765433]  iput+0x1c3/0x220
[1524182.765440]  taskq_thread+0x2a9/0x4d0 [spl]
[1524182.765443]  ? wake_up_q+0x80/0x80
[1524182.765447]  kthread+0x109/0x140
[1524182.765453]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[1524182.765455]  ? kthread_create_on_node+0x70/0x70
[1524182.765456]  ? kthread_create_on_node+0x70/0x70
[1524182.765459]  ret_from_fork+0x1f/0x30

@KeithWard

KeithWard commented Apr 29, 2019

We've just had an occurrence of this issue - details below. Note that we do not have deduplication enabled.

System information

Type Version/Name
Distribution Name CentOS
Distribution Version 6
Linux Kernel 4.4.143-100
Architecture x86_64
ZFS Version 0.7.1-1
SPL Version 0.7.1-1

Describe the problem you're observing

We've just had a crash similar to this, preceded by discovering a directory that we could not remove: rm reported that the directory wasn't empty, despite ls showing that it was.

Describe how to reproduce the problem

The directory shows as empty but cannot be removed:
[root@server01 child]# ls -la baddir/
total 0
[root@server01 childl]# rm -rf baddir/

rm: cannot remove `baddir': Directory not empty

[root@server01 child]# rm -rf /mnt/parent/child/baddir/

rm: cannot remove `/mnt/parent/child/baddir': Directory not empty

[root@server01 childl]# mv baddir/ ..

mv: cannot remove `baddir': Directory not empty

When you then attempt to unmount the filesystem in order to destroy it:
[root@server01 parent]# zfs umount zpool1/parent/child

Message from syslogd@server02 at Apr 29 12:22:39 ...
 kernel:VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)

Message from syslogd@server02 at Apr 29 12:22:39 ...
 kernel:PANIC at zfs_dir.c:724:zfs_rmnode()

Include any warning/errors/backtraces from the system logs

VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)
PANIC at zfs_dir.c:724:zfs_rmnode()
Showing stack for process 6854
CPU: 1 PID: 6854 Comm: umount Tainted: P           O    4.4.143-100.el6.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
 0000000000000000 ffff880386d0b998 ffffffff812e41e3 0000000000000001
 000000013857afe9 ffff880386d0b9e8 ffffffffa0380610 ffff880386d0b9a8
 ffffffffa01c1994 ffff880386d0bb48 ffffffffa01c1bbb 0000000000000000
Call Trace:
 [<ffffffff812e41e3>] dump_stack+0x67/0x94
 [<ffffffffa01c1994>] spl_dumpstack+0x44/0x50 [spl]
 [<ffffffffa01c1bbb>] spl_panic+0xcb/0x110 [spl]
 [<ffffffffa0279d3e>] ? dmu_buf_rele+0xe/0x10 [zfs]
 [<ffffffffa02f9999>] ? zap_unlockdir+0x49/0x70 [zfs]
 [<ffffffffa02fbd32>] ? zap_remove_norm+0x72/0x90 [zfs]
 [<ffffffffa02f56b1>] ? zap_remove_int+0x61/0x90 [zfs]
 [<ffffffffa030619c>] zfs_rmnode+0x2ec/0x320 [zfs]
 [<ffffffffa032a2fc>] ? zfs_znode_hold_exit+0x10c/0x130 [zfs]
 [<ffffffffa032a536>] zfs_zinactive+0xd6/0xe0 [zfs]
 [<ffffffffa0322364>] zfs_inactive+0x74/0x1a0 [zfs]
 [<ffffffff8116d23e>] ? truncate_pagecache+0x5e/0x70
 [<ffffffffa033e88e>] zpl_evict_inode+0x4e/0x70 [zfs]
 [<ffffffff811eef6f>] evict+0xbf/0x1b0
 [<ffffffff811ef0ac>] dispose_list+0x4c/0x60
 [<ffffffff811efebd>] evict_inodes+0x17d/0x190
 [<ffffffff81207f1f>] ? __sync_filesystem+0x3f/0x50
 [<ffffffff811d8528>] generic_shutdown_super+0x48/0x100
 [<ffffffff811d8676>] kill_anon_super+0x16/0x30
 [<ffffffffa033e59e>] zpl_kill_sb+0x1e/0x30 [zfs]
 [<ffffffff811d7bb3>] deactivate_locked_super+0x53/0x90
 [<ffffffff811d8051>] deactivate_super+0x51/0x70
 [<ffffffff811f494f>] cleanup_mnt+0x3f/0x80
 [<ffffffff811f5232>] __cleanup_mnt+0x12/0x20
 [<ffffffff8108157f>] task_work_run+0x5f/0xa0
 [<ffffffff810022c6>] exit_to_usermode_loop+0xe6/0xf0
 [<ffffffff8100289a>] syscall_return_slowpath+0xba/0x180
 [<ffffffff811f67f7>] ? SyS_umount+0xb7/0x110
 [<ffffffff8163cc67>] int_ret_from_sys_call+0x25/0x93

@KeithWard

KeithWard commented Apr 29, 2019

Strange thing about this one - after rebooting, the directory suddenly has contents again:

[root@server01 child]# ls baddir/ -la

total 1835
drwxrwxr-x 11 nobody nobody       20 Apr 26 13:31 .
drwxr-xr-x  3 nobody nobody        3 Apr 29 12:19 ..
-rw-rw----  1 nobody nobody       56 Apr 29 12:03 auto.cnf
drwx------  2 nobody nobody        5 Apr 29 12:05 file2
<snip>

Prior to this issue we had been diagnosing why, after destroying a snapshot, the contents of the directory we had snapshotted had gone missing. Before this crash we believed that was an application issue; we now believe that ZFS got itself into such a state that it didn't know what the actual state of the folder was.

@timemaster67

timemaster67 commented Jul 28, 2019

I had this assertion failure on my system. It's a Debian stretch box running 0.7.12-2+deb10u1~bpo9+1, with dedup not enabled. I ran a command to roll back a file system to a previous snapshot, then ran zfs umount -a not long after. feature@async_destroy was enabled, and the system was probably still rolling back data; then I hit this crash.

jui 19 21:48:12 freedom kernel: VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)
jui 19 21:48:12 freedom kernel: PANIC at zfs_dir.c:724:zfs_rmnode()
jui 19 21:48:12 freedom kernel: Showing stack for process 5702
jui 19 21:48:12 freedom kernel: CPU: 16 PID: 5702 Comm: umount Tainted: P        W  O    4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u3
jui 19 21:48:12 freedom kernel: Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1     03/17/2012
jui 19 21:48:12 freedom kernel:  0000000000000000 ffffffff90f35504 ffffffffc0e5271a ffffc35d8995fd20
jui 19 21:48:12 freedom kernel:  ffffffffc0833f95 ffff9ffd00000000 ffff9ffd00000030 ffffc35d8995fd30
jui 19 21:48:12 freedom kernel:  ffffc35d8995fcd0 2833594649524556 70617a203d3d2030 5f65766f6d65725f
jui 19 21:48:12 freedom kernel: Call Trace:
jui 19 21:48:12 freedom kernel:  [<ffffffff90f35504>] ? dump_stack+0x5c/0x78
jui 19 21:48:12 freedom kernel:  [<ffffffffc0833f95>] ? spl_panic+0xc5/0x100 [spl]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0d7ff52>] ? zap_remove_impl+0xb2/0x120 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0d814bb>] ? zap_remove_norm+0x7b/0xa0 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0d7c5f2>] ? zap_remove_int+0x52/0x70 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0d8ae03>] ? zfs_rmnode+0x2f3/0x330 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0dafad0>] ? zfs_zinactive+0xc0/0xd0 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffffc0da7c97>] ? zfs_inactive+0x77/0x200 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffff90db94bc>] ? unmap_mapping_range+0x7c/0x140
jui 19 21:48:12 freedom kernel:  [<ffffffffc0dc143c>] ? zpl_evict_inode+0x3c/0x50 [zfs]
jui 19 21:48:12 freedom kernel:  [<ffffffff90e28e49>] ? evict+0xb9/0x180
jui 19 21:48:12 freedom kernel:  [<ffffffff90e28f46>] ? dispose_list+0x36/0x50
jui 19 21:48:12 freedom kernel:  [<ffffffff90e290c4>] ? evict_inodes+0x164/0x190
jui 19 21:48:12 freedom kernel:  [<ffffffff90e0e24f>] ? generic_shutdown_super+0x3f/0xf0
jui 19 21:48:12 freedom kernel:  [<ffffffff90e0e57e>] ? kill_anon_super+0xe/0x20
jui 19 21:48:12 freedom kernel:  [<ffffffff90e0e6ea>] ? deactivate_locked_super+0x3a/0x70
jui 19 21:48:12 freedom kernel:  [<ffffffff90e2d6fb>] ? cleanup_mnt+0x3b/0x80
jui 19 21:48:12 freedom kernel:  [<ffffffff90c98e5f>] ? task_work_run+0x7f/0xa0
jui 19 21:48:12 freedom kernel:  [<ffffffff90c03754>] ? exit_to_usermode_loop+0xa4/0xb0
jui 19 21:48:12 freedom kernel:  [<ffffffff90c03bd9>] ? do_syscall_64+0xe9/0x100
jui 19 21:48:12 freedom kernel:  [<ffffffff9121a94e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6

I tried to reboot the system cleanly, but it was done for: lots of hung tasks, so I pretty much had to hard-reset the computer.

jui 19 21:55:21 freedom kernel: INFO: task umount:5702 blocked for more than 120 seconds.
jui 19 21:55:21 freedom kernel:       Tainted: P        W  O    4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u3
jui 19 21:55:21 freedom kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jui 19 21:55:21 freedom kernel: umount          D    0  5702      1 0x00000004
jui 19 21:55:21 freedom kernel:  ffffa00113539000 0000000000000000 ffffa006ec964dc0 ffffa00267c98980
jui 19 21:55:21 freedom kernel:  ffffa00263c55000 ffffc35d8995fb88 ffffffff91215b89 ffffffff91619715
jui 19 21:55:21 freedom kernel:  0065766f6d65725f ffffa00267c98980 0000000000000286 ffffa006ec964dc0
jui 19 21:55:21 freedom kernel: Call Trace:
jui 19 21:55:21 freedom kernel:  [<ffffffff91215b89>] ? __schedule+0x239/0x6f0
jui 19 21:55:21 freedom kernel:  [<ffffffff91216072>] ? schedule+0x32/0x80
jui 19 21:55:21 freedom kernel:  [<ffffffffc0833fbb>] ? spl_panic+0xeb/0x100 [spl]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0d7ff52>] ? zap_remove_impl+0xb2/0x120 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0d814bb>] ? zap_remove_norm+0x7b/0xa0 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0d7c5f2>] ? zap_remove_int+0x52/0x70 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0d8ae03>] ? zfs_rmnode+0x2f3/0x330 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0dafad0>] ? zfs_zinactive+0xc0/0xd0 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffffc0da7c97>] ? zfs_inactive+0x77/0x200 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffff90db94bc>] ? unmap_mapping_range+0x7c/0x140
jui 19 21:55:21 freedom kernel:  [<ffffffffc0dc143c>] ? zpl_evict_inode+0x3c/0x50 [zfs]
jui 19 21:55:21 freedom kernel:  [<ffffffff90e28e49>] ? evict+0xb9/0x180
jui 19 21:55:21 freedom kernel:  [<ffffffff90e28f46>] ? dispose_list+0x36/0x50
jui 19 21:55:21 freedom kernel:  [<ffffffff90e290c4>] ? evict_inodes+0x164/0x190
jui 19 21:55:21 freedom kernel:  [<ffffffff90e0e24f>] ? generic_shutdown_super+0x3f/0xf0
jui 19 21:55:21 freedom kernel:  [<ffffffff90e0e57e>] ? kill_anon_super+0xe/0x20
jui 19 21:55:21 freedom kernel:  [<ffffffff90e0e6ea>] ? deactivate_locked_super+0x3a/0x70
jui 19 21:55:21 freedom kernel:  [<ffffffff90e2d6fb>] ? cleanup_mnt+0x3b/0x80
jui 19 21:55:21 freedom kernel:  [<ffffffff90c98e5f>] ? task_work_run+0x7f/0xa0
jui 19 21:55:21 freedom kernel:  [<ffffffff90c03754>] ? exit_to_usermode_loop+0xa4/0xb0
jui 19 21:55:21 freedom kernel:  [<ffffffff90c03bd9>] ? do_syscall_64+0xe9/0x100
jui 19 21:55:21 freedom kernel:  [<ffffffff9121a94e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6

@behlendorf added the Type: Defect (Incorrect behavior, e.g. crash, hang) label on Jul 29, 2019
@AceSlash

I just had the same issue on 3 different hosts.

First 2 hosts: Ubuntu 18.04.3 with ZoL 0.7.5-1ubuntu16.6
Third host: Debian 9 with ZoL 0.7.12-1~bpo9+1

In all cases we did a rollback of a dataset and the process got stuck on zfs umount.

The error was:

Dec 17 16:12:36 srv-217 kernel: VERIFY3(0 == zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj, zp->z_id, tx)) failed (0 == 2)
Dec 17 16:12:36 srv-217 kernel: PANIC at zfs_dir.c:724:zfs_rmnode()
Dec 17 16:12:36 srv-217 kernel: Showing stack for process 99999
Dec 17 16:12:36 srv-217 kernel: CPU: 19 PID: 99999 Comm: umount Tainted: P           OE    4.15.0-47-generic #50-Ubuntu
Dec 17 16:12:36 srv-217 kernel: Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.7.0 05/23/2018
Dec 17 16:12:36 srv-217 kernel: Call Trace:
Dec 17 16:12:36 srv-217 kernel:  dump_stack+0x63/0x8b
Dec 17 16:12:36 srv-217 kernel:  spl_dumpstack+0x42/0x50 [spl]
Dec 17 16:12:36 srv-217 kernel:  spl_panic+0xc8/0x110 [spl]
Dec 17 16:12:36 srv-217 kernel:  ? dmu_buf_rele+0x36/0x40 [zfs]
Dec 17 16:12:36 srv-217 kernel:  ? zap_unlockdir+0x3f/0x50 [zfs]
Dec 17 16:12:36 srv-217 kernel:  ? zap_remove_norm+0x76/0xa0 [zfs]
Dec 17 16:12:36 srv-217 kernel:  ? zap_remove+0x13/0x20 [zfs]
Dec 17 16:12:36 srv-217 kernel:  zfs_rmnode+0x2e5/0x330 [zfs]
Dec 17 16:12:36 srv-217 kernel:  ? zfs_znode_hold_exit+0xf9/0x130 [zfs]
Dec 17 16:12:36 srv-217 kernel:  zfs_zinactive+0xd0/0xe0 [zfs]
Dec 17 16:12:36 srv-217 kernel:  zfs_inactive+0x7e/0x200 [zfs]
Dec 17 16:12:36 srv-217 kernel:  ? truncate_pagecache+0x5a/0x70
Dec 17 16:12:36 srv-217 kernel:  zpl_evict_inode+0x43/0x60 [zfs]
Dec 17 16:12:36 srv-217 kernel:  evict+0xca/0x1a0
Dec 17 16:12:36 srv-217 kernel:  dispose_list+0x39/0x50
Dec 17 16:12:36 srv-217 kernel:  evict_inodes+0x167/0x190
Dec 17 16:12:36 srv-217 kernel:  generic_shutdown_super+0x44/0x120
Dec 17 16:12:36 srv-217 kernel:  kill_anon_super+0x12/0x20
Dec 17 16:12:36 srv-217 kernel:  zpl_kill_sb+0x1a/0x20 [zfs]
Dec 17 16:12:36 srv-217 kernel:  deactivate_locked_super+0x48/0x80
Dec 17 16:12:36 srv-217 kernel:  deactivate_super+0x40/0x60
Dec 17 16:12:36 srv-217 kernel:  cleanup_mnt+0x3f/0x80
Dec 17 16:12:36 srv-217 kernel:  __cleanup_mnt+0x12/0x20
Dec 17 16:12:36 srv-217 kernel:  task_work_run+0x9d/0xc0
Dec 17 16:12:36 srv-217 kernel:  exit_to_usermode_loop+0xc0/0xd0
Dec 17 16:12:36 srv-217 kernel:  do_syscall_64+0x115/0x130
Dec 17 16:12:36 srv-217 kernel:  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Dec 17 16:12:36 srv-217 kernel: RIP: 0033:0x7f09b7140487
Dec 17 16:12:36 srv-217 kernel: RSP: 002b:00007fff0e29b118 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Dec 17 16:12:36 srv-217 kernel: RAX: 0000000000000000 RBX: 00000000008d95b0 RCX: 00007f09b7140487
Dec 17 16:12:36 srv-217 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000008d9790
Dec 17 16:12:36 srv-217 kernel: RBP: 00000000008d9790 R08: 0000000000000000 R09: 0000000000000013
Dec 17 16:12:36 srv-217 kernel: R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f09b764983c
Dec 17 16:12:36 srv-217 kernel: R13: 0000000000000000 R14: 00000000008e9210 R15: 00007fff0e29b3b0

The servers were completely stuck and nothing could be done to unstick them.
I did capture a stack trace on one of the servers before I had to do an emergency reboot:

root@srv-202:~ # ps -aux|grep 95131
root      95131  0.0  0.0  22080  1184 ?        D    16:10   0:00 /bin/umount -t zfs /var/snap/lxd/common/lxd/storage-pools/default/containers/poseidon

root@srv-202:~ # cat /proc/95131/wchan
zfs_rmnode

root@srv-202:~ # cat /proc/95131/stack
[<ffffffffc079ffbb>] spl_panic+0xeb/0x100 [spl]
[<ffffffffc0d57f52>] zap_remove_impl+0xb2/0x120 [zfs]
[<ffffffffc0d594bb>] zap_remove_norm+0x7b/0xa0 [zfs]
[<ffffffffc0d545f2>] zap_remove_int+0x52/0x70 [zfs]
[<ffffffffc0d62e03>] zfs_rmnode+0x2f3/0x330 [zfs]
[<ffffffffc0d87ad0>] zfs_zinactive+0xc0/0xd0 [zfs]
[<ffffffffc0d7fc97>] zfs_inactive+0x77/0x200 [zfs]
[<ffffffff9b3b8eac>] unmap_mapping_range+0x7c/0x140
[<ffffffffc0d9943c>] zpl_evict_inode+0x3c/0x50 [zfs]
[<ffffffff9b4284a9>] evict+0xb9/0x180
[<ffffffff9b4285a6>] dispose_list+0x36/0x50
[<ffffffff9b428724>] evict_inodes+0x164/0x190
[<ffffffff9b40d8ef>] generic_shutdown_super+0x3f/0xf0
[<ffffffff9b40dc0e>] kill_anon_super+0xe/0x20
[<ffffffff9b40dd7a>] deactivate_locked_super+0x3a/0x70
[<ffffffff9b42cd5b>] cleanup_mnt+0x3b/0x80
[<ffffffff9b298acf>] task_work_run+0x7f/0xa0
[<ffffffff9b203754>] exit_to_usermode_loop+0xa4/0xb0
[<ffffffff9b203bcd>] do_syscall_64+0xdd/0xf0
[<ffffffff9b81924e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
[<ffffffffffffffff>] 0xffffffffffffffff

root@srv-202:~ # cat /proc/95131/syscall
166 0xfa2790 0x0 0x1 0x6b2 0x0 0x13 0x7ffe7f26d1d8 0x7ff3a5c61487

root@srv-202:~ # cat /proc/95131/schedstat
7838547 30094 9

root@srv-202:~ # cat /proc/95131/sched
umount (95131, #threads: 1)
-------------------------------------------------------------------
se.exec_start                                :   25156628145.562495
se.vruntime                                  :   18518630662.134760
se.sum_exec_runtime                          :             7.838547
se.nr_migrations                             :                    1
nr_switches                                  :                    9
nr_voluntary_switches                        :                    6
nr_involuntary_switches                      :                    3
se.load.weight                               :              1048576
se.avg.load_sum                              :             21943687
se.avg.util_sum                              :             13696043
se.avg.load_avg                              :                  439
se.avg.util_avg                              :                  266
se.avg.last_update_time                      :    25156628145562495
policy                                       :                    0
prio                                         :                  120
clock-delta                                  :                   15
mm->numa_scan_seq                            :                    0
numa_pages_migrated                          :                    0
numa_preferred_nid                           :                   -1
total_numa_faults                            :                    0
current_node=0, numa_group_id=0
numa_faults node=0 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=1 task_private=0 task_shared=0 group_private=0 group_shared=0

root@srv-202:~ # cat /proc/95131/status
Name:   umount   
Umask:  0022
State:  D (disk sleep)
Tgid:   95131
Ngid:   0
Pid:    95131
PPid:   95130
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 64
Groups:
NStgid: 95131
NSpid:  95131
NSpgid: 9309
NSsid:  9309
VmPeak:    22080 kB
VmSize:    22080 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:      1184 kB
VmRSS:      1184 kB
RssAnon:             160 kB
RssFile:            1024 kB
RssShmem:              0 kB
VmData:      260 kB
VmStk:       132 kB
VmExe:        24 kB
VmLib:      3180 kB
VmPTE:        64 kB
VmPMD:        12 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
Threads:        1
SigQ:   4/515489 
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp:        0
Speculation_Store_Bypass:       vulnerable
Cpus_allowed:   ffff,ffffffff
Cpus_allowed_list:      0-47
Mems_allowed:   00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        6
nonvoluntary_ctxt_switches:     3

Since we seem to have a way to reproduce this, we will try to track it down on a test system. What more can we do to diagnose the issue?

@behlendorf
Contributor

In all cases we did a rollback of a dataset and the process got stuck on zfs umount.

@AceSlash I believe I understand how this issue can occur. The critical thing here is that an online rollback was performed which makes this corner case possible. I've gone ahead and opened PR #9739 with a proposed fix and explanation.
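
For reference, the general shape of the change being described is roughly the following; this is a minimal sketch of the idea, not the actual PR #9739 diff:

/*
 * Sketch only -- not the exact PR #9739 change.  In zfs_rmnode(), an
 * ENOENT from zap_remove_int() is tolerated: after an online rollback,
 * the entry for this znode may legitimately no longer exist in the
 * rolled-back unlinked-set ZAP, so it should not trip a VERIFY.
 */
int error = zap_remove_int(zfsvfs->z_os, zfsvfs->z_unlinkedobj,
    zp->z_id, tx);
VERIFY(error == 0 || error == ENOENT);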

What can we do more to diagnose the issue?

Unfortunately, I wasn't able to reproduce this issue locally on master with some quick manual testing. If you're able to semi-reliably trigger this it would be very helpful if you could test the proposed fix on your test system.

In order to rigorously test it, you'll need to do a few things.

  1. Manually apply the patch and use ./configure --enable-debug when building ZFS to enable additional internal correctness checks.
  2. Verify the issue can no longer be reproduced with the patched version.
  3. Export the pool and run zdb <pool>; we want to verify that no space was leaked.

@AceSlash

@behlendorf Unfortunately we tried to reproduce it on a test system (identical to the prod system) but it didn't trigger; all rollbacks worked normally.

The thing is that the test system has very low load, so if this is load related it may never trigger there.
All systems where it triggered were running LXD, which uses cgroups to mount ZFS filesystems. Do you think that could be related?

If we can reproduce it, we'll go ahead and try the patch, no problem.

@behlendorf
Contributor

I don't see how it could be directly caused by using LXD. However, it could be timing related, which is why it's tricky to reproduce. The initial reports of this issue indicate that the failure occurred when removing a large file and performing an online rollback at the same time. Is this scenario something that is possible in your production environment?

bgly pushed a commit to buurst/zfs that referenced this issue Jun 25, 2020
If a rollback has occurred while a file is open and unlinked, then when
the file is closed post-rollback it will not exist in the rolled-back
version of the unlinked object.  Therefore, the call to
zap_remove_int() may correctly return ENOENT and should be allowed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#6812 
Closes openzfs#9739
jsai20 pushed a commit to jsai20/zfs that referenced this issue Mar 30, 2021