Fix lockdep recursive locking false positive in dbuf_destroy #8984
Conversation
Looks good. Was this the only issue reported by lockdep? I ask because I thought we still had a couple of other false positives which had not yet been resolved. But if that's not the case, then we should close them. See:
https://github.com/zfsonlinux/zfs/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+%22lockdep%3A%22+
Hi Brian,
No, it's not the only issue. It is the simpler of the two that I have.
I'm having trouble building my fix for the second. It's simple enough, but I haven't figured out how to get atomic_t references to build in the non-SPL part of ZFS.
Here's what I'm trying to do (this is RFC material, so I don't care if Exchange mangles it...):
diff --git a/include/sys/bpobj.h b/include/sys/bpobj.h
index d425e239f..ababcd132 100644
--- a/include/sys/bpobj.h
+++ b/include/sys/bpobj.h
@@ -55,6 +55,7 @@ typedef struct bpobj_phys {
typedef struct bpobj {
kmutex_t bpo_lock;
+ int bpo_id;
objset_t *bpo_os;
uint64_t bpo_object;
int bpo_epb;
diff --git a/module/zfs/bpobj.c b/module/zfs/bpobj.c
index 633801956..a4d00f510 100644
--- a/module/zfs/bpobj.c
+++ b/module/zfs/bpobj.c
@@ -143,6 +143,8 @@ out:
VERIFY3U(0, ==, dmu_object_free(os, obj, tx));
}
+static atomic_t next_bpo_id = ATOMIC_INIT(0);
+
int
bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
{
@@ -172,6 +174,7 @@ bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
bpo->bpo_havecomp = (doi.doi_bonus_size > BPOBJ_SIZE_V0);
bpo->bpo_havesubobj = (doi.doi_bonus_size > BPOBJ_SIZE_V1);
bpo->bpo_phys = bpo->bpo_dbuf->db_data;
+ bpo->bpo_id = atomic_inc_return(&next_bpo_id);
return (0);
}
@@ -818,7 +821,7 @@ int
bpobj_space(bpobj_t *bpo, uint64_t *usedp, uint64_t *compp, uint64_t *uncompp)
{
ASSERT(bpobj_is_open(bpo));
- mutex_enter(&bpo->bpo_lock);
+ mutex_enter_nested(&bpo->bpo_lock, bpo->bpo_id);
*usedp = bpo->bpo_phys->bpo_bytes;
if (bpo->bpo_havecomp) {
The situation (which I have described in much more detail in my local commit) is that lockdep is seeing possible circular locking in a path which involves two different txg_sync_threads. The reasoning that lets me conclude that this is a false positive is that those two txg_sync_threads must be referring to different lock bpo_lock instances. The patch above should be pretty self-explanatory.
My problem is that I can't get this to build in ZFS git. Including <sys/atomic.h> doesn't work in bpobj.c, even though it does in other files, such as zfs_znode.c. The include is inside #ifdef KERNEL there, and I don't see how that matters because the Makefile treats bpobj.c and zfs_znode.c the same.
Any clues gratefully accepted...
Jeff
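For illustration only: one way to sidestep the Linux-specific atomic_t here would be the Solaris-style atomics that ZFS already ships in <sys/atomic.h>. The sketch below is untested and assumes atomic_inc_32_nv() is usable from both the kernel module and the userspace (libzpool) builds.

/*
 * Sketch only: a per-instance ID counter without Linux atomic_t, using the
 * Solaris-style atomics from <sys/atomic.h>. atomic_inc_32_nv() returns the
 * new value, like atomic_inc_return().
 */
#include <sys/atomic.h>

static volatile uint32_t next_bpo_id = 0;

static uint32_t
bpo_alloc_id(void)
{
	return (atomic_inc_32_nv(&next_bpo_id));
}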
On Wed, 2019-07-03 at 09:43 -0700, Brian Behlendorf wrote:
@behlendorf approved this pull request.
Looks good. Was this the only issue reported by lockdep? I ask because I thought we still had a couple of other false positives which had not yet been resolved. But if that's not the case, then we should close them. See:
https://github.com/zfsonlinux/zfs/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+%22lockdep%3A%22+
Would you mind posting the lockdep analysis somewhere? I'd like to take a look.
@jdike you should also be able to silence a false positive by using the
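The suggestion above is cut off in the capture. Purely as an assumption about what was meant: the Linux SPL also provides a MUTEX_NOLOCKDEP mutex type, which opts a single lock out of lockdep tracking at init time. A minimal, untested sketch:

/*
 * Sketch only (assumption about the truncated suggestion above):
 * initializing a mutex with MUTEX_NOLOCKDEP removes it from lockdep's
 * dependency tracking, silencing false positives on that lock.
 */
static kmutex_t example_lock;

static void
example_init(void)
{
	mutex_init(&example_lock, NULL, MUTEX_NOLOCKDEP, NULL);
}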
Hi Brian,
My local commit is below...
Jeff
commit d27d578 (HEAD -> locking)
Author: Jeff Dike <jdike@akamai.com>
Date: Mon Jul 1 11:16:55 2019 -0400
Quiet lockdep circular lock warning
This is a false positive. The full lockdep warning is below. The gist of it is:
the existing dependency chain (in reverse order) is:
-> #3 (&dp->dp_lock){+.+.}:
dsl_pool_dirty_space+0x70/0x2d0 [zfs]
dbuf_dirty+0x778/0x31d0 [zfs]
-> #2 (&db->db_mtx){+.+.}:
dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
bpobj_iterate_impl+0xbe6/0x1410 [zfs]
spa_sync+0x1756/0x29b0 [zfs]
txg_sync_thread+0x6d6/0x1370 [zfs]
-> #1 (&bpo->bpo_lock){+.+.}:
bpobj_space+0x63/0x470 [zfs]
dsl_scan_active+0x340/0x3d0 [zfs]
txg_sync_thread+0x3f2/0x1370 [zfs]
-> #0 (&tx->tx_sync_lock){+.+.}:
__mutex_lock+0xef/0x1380
txg_kick+0x61/0x420 [zfs]
dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
Different instances of txg_sync_thread manage different dsl_pools,
which contain different bpo_lock instances. Stacks #1 and #2
therefore can't be referring to the same bpo_lock instance, so this
can't be an actual loop.
The patch adds an ID field (bpo_id) to struct bpobj which is unique to each instance. This bpo_id is passed as the class to the mutex_enter_nested call in bpobj_space. This effectively makes this locking call invisible to lockdep.
[ 125.297679] ======================================================
[ 125.301361] hardirqs last enabled at (28216159): [<ffffffff885cef12>] handle_mm_fault+0x582/0x7d0
[ 125.302863] WARNING: possible circular locking dependency detected
[ 125.302865] 4.19.29-4.19.0-debug-99d9c44b25c08f51 #1 Tainted: G O
[ 125.309056] hardirqs last disabled at (28216160): [<ffffffff88530f22>] get_page_from_freelist+0x3b2/0x31a0
[ 125.318009] ------------------------------------------------------
[ 125.318011] mdtagent/13103 is trying to acquire lock:
[ 125.324203] softirqs last enabled at (28215144): [<ffffffff89c00631>] __do_softirq+0x631/0x868
[ 125.331591] 00000000679a2c2b (&tx->tx_sync_lock){+.+.}, at: txg_kick+0x61/0x420 [zfs]
[ 125.341249] softirqs last disabled at (28215139): [<ffffffff8813ef80>] irq_exit+0x150/0x170
[ 125.378920]
but task is already holding lock:
[ 125.384762] 00000000ec7e955e (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
[ 125.393679]
which lock already depends on the new lock.
[ 125.401861]
the existing dependency chain (in reverse order) is:
[ 125.409347]
-> #3 (&dp->dp_lock){+.+.}:
[ 125.414710] dsl_pool_dirty_space+0x70/0x2d0 [zfs]
[ 125.420068] dbuf_dirty+0x778/0x31d0 [zfs]
[ 125.424723] dmu_write_uio_dnode+0xfe/0x300 [zfs]
[ 125.429990] dmu_write_uio_dbuf+0xa0/0x100 [zfs]
[ 125.435174] zfs_write+0x18bd/0x1f70 [zfs]
[ 125.439837] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.445452] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.450334] __vfs_write+0x3a2/0x5b0
[ 125.454440] vfs_write+0x15d/0x460
[ 125.458365] ksys_write+0xb1/0x170
[ 125.462295] do_syscall_64+0x9b/0x400
[ 125.466491] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 125.472069]
-> #2 (&db->db_mtx){+.+.}:
[ 125.477337] dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
[ 125.482509] bpobj_iterate_impl+0xbe6/0x1410 [zfs]
[ 125.487863] spa_sync+0x1756/0x29b0 [zfs]
[ 125.492442] txg_sync_thread+0x6d6/0x1370 [zfs]
[ 125.497499] thread_generic_wrapper+0x1ff/0x2a0 [spl]
[ 125.503080] kthread+0x2e7/0x3e0
[ 125.506839] ret_from_fork+0x24/0x50
[ 125.510937]
-> #1 (&bpo->bpo_lock){+.+.}:
[ 125.516464] bpobj_space+0x63/0x470 [zfs]
[ 125.521038] dsl_scan_active+0x340/0x3d0 [zfs]
[ 125.526049] txg_sync_thread+0x3f2/0x1370 [zfs]
[ 125.531106] thread_generic_wrapper+0x1ff/0x2a0 [spl]
[ 125.536684] kthread+0x2e7/0x3e0
[ 125.540439] ret_from_fork+0x24/0x50
[ 125.544535]
-> #0 (&tx->tx_sync_lock){+.+.}:
[ 125.550295] __mutex_lock+0xef/0x1380
[ 125.554529] txg_kick+0x61/0x420 [zfs]
[ 125.558841] dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
[ 125.564715] dmu_tx_assign+0x13b/0xe40 [zfs]
[ 125.569557] zfs_write+0xb5e/0x1f70 [zfs]
[ 125.574130] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.579746] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.584628] __vfs_write+0x3a2/0x5b0
[ 125.588726] vfs_write+0x15d/0x460
[ 125.592654] ksys_write+0xb1/0x170
[ 125.596580] do_int80_syscall_32+0xf1/0x460
[ 125.601287] entry_INT80_compat+0x8a/0xa0
[ 125.605817]
other info that might help us debug this:
[ 125.613825] Chain exists of:
&tx->tx_sync_lock --> &db->db_mtx --> &dp->dp_lock
[ 125.624174] Possible unsafe locking scenario:
[ 125.630099]        CPU0                    CPU1
[ 125.634633]        ----                    ----
[ 125.639166]   lock(&dp->dp_lock);
[ 125.642485]                                lock(&db->db_mtx);
[ 125.648232]                                lock(&dp->dp_lock);
[ 125.654071]   lock(&tx->tx_sync_lock);
[ 125.657823]
*** DEADLOCK ***
[ 125.663743] 3 locks held by mdtagent/13103:
[ 125.667928] #0: 0000000016b80a9c (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xb1/0xe0
[ 125.675591] #1: 00000000d0227133 (sb_writers#12){.+.+}, at: vfs_write+0x332/0x460
[ 125.683167] #2: 00000000ec7e955e (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
[ 125.692517]
stack backtrace:
[ 125.696878] CPU: 28 PID: 13103 Comm: mdtagent Tainted: G O 4.19.29-4.19.0-debug-99d9c44b25c08f51 #1
[ 125.707226] Hardware name: Ciara Technologies 1x16-X7p-L-team/X7P-MB, BIOS P2.12 03/05/2019
[ 125.715578] Call Trace:
[ 125.718038] dump_stack+0x91/0xeb
[ 125.721364] print_circular_bug.isra.16+0x30b/0x5b0
[ 125.726249] ? save_trace+0xd6/0x240
[ 125.729827] __lock_acquire+0x41be/0x4f10
[ 125.733841] ? debug_show_all_locks+0x2d0/0x2d0
[ 125.738373] ? debug_show_all_locks+0x2d0/0x2d0
[ 125.742907] ? kernel_text_address+0x6d/0x100
[ 125.747276] ? __save_stack_trace+0x73/0xd0
[ 125.751468] ? quarantine_put+0xb0/0x150
[ 125.755396] ? lock_acquire+0x153/0x330
[ 125.759242] lock_acquire+0x153/0x330
[ 125.762966] ? txg_kick+0x61/0x420 [zfs]
[ 125.766896] ? lock_contended+0xd60/0xd60
[ 125.770958] ? txg_kick+0x61/0x420 [zfs]
[ 125.774885] __mutex_lock+0xef/0x1380
[ 125.778603] ? txg_kick+0x61/0x420 [zfs]
[ 125.782572] ? txg_kick+0x61/0x420 [zfs]
[ 125.786500] ? __mutex_add_waiter+0x160/0x160
[ 125.790866] ? sched_clock+0x5/0x10
[ 125.794359] ? __mutex_add_waiter+0x160/0x160
[ 125.798721] ? __cv_destroy+0x9e/0x180 [spl]
[ 125.803002] ? cv_destroy_wakeup+0xc0/0xc0 [spl]
[ 125.807628] ? __mutex_unlock_slowpath+0xf3/0x660
[ 125.812384] ? zio_wait+0x419/0x660 [zfs]
[ 125.816442] ? txg_kick+0x61/0x420 [zfs]
[ 125.820412] txg_kick+0x61/0x420 [zfs]
[ 125.824209] dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
[ 125.829563] dmu_tx_assign+0x13b/0xe40 [zfs]
[ 125.833872] ? dmu_tx_count_write+0x42f/0x700 [zfs]
[ 125.838800] zfs_write+0xb5e/0x1f70 [zfs]
[ 125.842816] ? sched_clock+0x5/0x10
[ 125.846358] ? zfs_close+0x1e0/0x1e0 [zfs]
[ 125.850457] ? sched_clock+0x5/0x10
[ 125.853951] ? sched_clock_cpu+0x18/0x170
[ 125.857971] ? check_flags.part.23+0x480/0x480
[ 125.862418] ? __lock_acquire+0xe3b/0x4f10
[ 125.866526] ? __fget+0x2d6/0x3e0
[ 125.869852] ? sched_clock+0x5/0x10
[ 125.873345] ? sched_clock_cpu+0x18/0x170
[ 125.877402] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.882453] ? __lock_acquire+0xe3b/0x4f10
[ 125.886596] ? zpl_read_common_iovec+0x2b0/0x2b0 [zfs]
[ 125.891779] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.896139] __vfs_write+0x3a2/0x5b0
[ 125.899717] ? __fdget_pos+0xb1/0xe0
[ 125.903297] ? kernel_read+0x130/0x130
[ 125.907049] ? __mutex_add_waiter+0x160/0x160
[ 125.911407] ? lock_acquire+0x153/0x330
[ 125.915248] ? lock_acquire+0x153/0x330
[ 125.919089] ? rcu_read_lock_sched_held+0xeb/0x120
[ 125.923887] ? rcu_sync_lockdep_assert+0x76/0xb0
[ 125.928509] ? __sb_start_write+0xf0/0x260
[ 125.932614] vfs_write+0x15d/0x460
[ 125.936022] ksys_write+0xb1/0x170
[ 125.939425] ? __ia32_sys_read+0xb0/0xb0
[ 125.943353] ? poll_select_set_timeout+0xc0/0xc0
[ 125.947974] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 125.952601] ? do_int80_syscall_32+0x17/0x460
[ 125.956960] do_int80_syscall_32+0xf1/0x460
[ 125.961145] entry_INT80_compat+0x8a/0xa0
diff --git a/include/sys/bpobj.h b/include/sys/bpobj.h
index d425e239f..ababcd132 100644
--- a/include/sys/bpobj.h
+++ b/include/sys/bpobj.h
@@ -55,6 +55,7 @@ typedef struct bpobj_phys {
typedef struct bpobj {
kmutex_t bpo_lock;
+ int bpo_id;
objset_t *bpo_os;
uint64_t bpo_object;
int bpo_epb;
diff --git a/module/zfs/bpobj.c b/module/zfs/bpobj.c
index 633801956..a4d00f510 100644
--- a/module/zfs/bpobj.c
+++ b/module/zfs/bpobj.c
@@ -143,6 +143,8 @@ out:
VERIFY3U(0, ==, dmu_object_free(os, obj, tx));
}
+static atomic_t next_bpo_id = ATOMIC_INIT(0);
+
int
bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
{
@@ -172,6 +174,7 @@ bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
bpo->bpo_havecomp = (doi.doi_bonus_size > BPOBJ_SIZE_V0);
bpo->bpo_havesubobj = (doi.doi_bonus_size > BPOBJ_SIZE_V1);
bpo->bpo_phys = bpo->bpo_dbuf->db_data;
+ bpo->bpo_id = atomic_inc_return(&next_bpo_id);
return (0);
}
@@ -818,7 +821,7 @@ int
bpobj_space(bpobj_t *bpo, uint64_t *usedp, uint64_t *compp, uint64_t *uncompp)
{
ASSERT(bpobj_is_open(bpo));
- mutex_enter(&bpo->bpo_lock);
+ mutex_enter_nested(&bpo->bpo_lock, bpo->bpo_id);
*usedp = bpo->bpo_phys->bpo_bytes;
if (bpo->bpo_havecomp) {
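For context on the mutex_enter_nested() call used above: under the Linux SPL it ultimately forwards the subclass to the kernel's mutex_lock_nested(), which is what keeps acquisitions with distinct subclasses from being chained together by lockdep. A simplified sketch of that mapping, assuming the kmutex_t wraps a struct mutex in a field named m_mutex and ignoring the SPL's owner bookkeeping:

/*
 * Simplified sketch (not the actual SPL source): the subclass argument is
 * passed through to mutex_lock_nested(), so lockdep treats acquisitions
 * with different subclasses as belonging to different lock classes.
 */
#define	mutex_enter_nested_sketch(mp, subclass)	\
	mutex_lock_nested(&(mp)->m_mutex, (subclass))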
On Wed, 2019-07-03 at 15:49 -0700, Brian Behlendorf wrote:
Would you mind posting the lockdep analysis somewhere? I'd like to take a look.
@jdike I see, your analysis looks correct to me, and the proposed change should make lockdep happy. That said, now that I can see all the stacks, the one calling txg_kick() concerns me a bit. This shouldn't need to be called under the dp->dp_lock lock; that lock is there to protect dp->dp_dirty_total and dp->dp_dirty_total. What if instead we did something like this, which should break the cycle and conform to the expected ordering (entirely untested).
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 864376c..7da78f7 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,17 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ boolean_t do_kick, do_delay;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ do_kick = (dp->dp_dirty_total > dirty_min_bytes);
+ do_delay = (dp->dp_dirty_total > delay_min_bytes);
mutex_exit(&dp->dp_lock);
- return (rv);
+
+ if (do_kick)
+ txg_kick(dp);
+
+ return (do_delay);
}
void
Hi Brian,
Thanks for looking at my commit.
I'll test your change here and if I don't see any new problems, I'll send a Tested-By.
Jeff
@jdike let's move forward with this PR as-is, since it addresses one lockdep failure. We can follow up with a second PR with the alternate fix proposed above, assuming it passes your testing.
Would you mind rebasing this on master and adding your Signed-off-by?
Codecov Report
@@ Coverage Diff @@
## master #8984 +/- ##
==========================================
+ Coverage 78.64% 78.69% +0.04%
==========================================
Files 402 402
Lines 121004 120976 -28
==========================================
+ Hits 95165 95203 +38
+ Misses 25839 25773 -66
Continue to review full report at Codecov.
Hi Brian,
I have another question. In sa.c, we have:
895 sa = kmem_zalloc(sizeof (sa_os_t), KM_SLEEP);
896 mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
897 sa->sa_master_obj = sa_obj;
898 os->os_sa = sa;
899 mutex_enter(&sa->sa_lock);
900 mutex_exit(&os->os_user_ptr_lock);
901 avl_create(&sa->sa_layout_num_tree, layout_num_compare,
902 sizeof (sa_lot_t), offsetof(sa_lot_t, lot_num_node));
This looks to me like allocating an object (line 895), making it globally visible (line 898), immediately locking it (line 899), and initializing it (line 901 and down). This brings up two questions:
1 - Aren't 898 and 899 racy, i.e., couldn't something else grab the object before it's locked on 899? That being the case, wouldn't it be safer to switch those two lines?
2 - However, the bigger question is why make it visible and then lock it? Why not initialize it and make it visible when you're done, and not lock it at all here?
This is showing up in two more lockdep warnings, and eliminating it will fix both of them.
I'm trying to get through our test suite without any ZFS complaints, and more just keep popping up.
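To illustrate question 2 above, here is a minimal sketch of the suggested reordering (not the actual sa.c code; the remaining initialization and the surrounding os_user_ptr_lock handling are elided):

/*
 * Sketch only: fully initialize the sa_os_t first and publish it in
 * os->os_sa last, so nothing can observe it half-initialized and no
 * lock needs to be held across the setup.
 */
sa_os_t *sa = kmem_zalloc(sizeof (sa_os_t), KM_SLEEP);
mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
sa->sa_master_obj = sa_obj;
avl_create(&sa->sa_layout_num_tree, layout_num_compare,
    sizeof (sa_lot_t), offsetof(sa_lot_t, lot_num_node));
/* ... remaining initialization ... */
os->os_sa = sa;		/* publish only after initialization is complete */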
BTW, I turned your last patch into this:
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 49e5279..4930160 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,14 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ uint64_t dirty;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ dirty = dp->dp_dirty_total;
mutex_exit(&dp->dp_lock);
- return (rv);
+ if (dirty > dirty_min_bytes)
+ txg_kick(dp);
+ return (dirty > delay_min_bytes);
}
void
This looks functionally the same and a bit simpler to me.
Jeff
On 7/8/19 8:55 PM, Brian Behlendorf wrote:
@jdike I see, your analysis looks correct to me, and the proposed change should make lockdep happy.
That said, now that I can see all the stacks, the one calling txg_kick() concerns me a bit. This shouldn't need to be called under the dp->dp_lock lock; that lock is there to protect dp->dp_dirty_total and dp->dp_dirty_total. What if instead we did something like this, which should break the cycle and conform to the expected ordering (entirely untested).
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 864376c..7da78f7 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,17 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ boolean_t do_kick, do_delay;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ do_kick = (dp->dp_dirty_total > dirty_min_bytes);
+ do_delay = (dp->dp_dirty_total > delay_min_bytes);
mutex_exit(&dp->dp_lock);
- return (rv);
+
+ if (do_kick)
+ txg_kick(dp);
+
+ return (do_delay);
}
void
lockdep reports a possible recursive lock in dbuf_destroy.
It is true that dbuf_destroy is acquiring the dn_dbufs_mtx
on one dnode while holding it on another dnode. However,
it is impossible for these to be the same dnode because,
among other things, dbuf_destroy checks MUTEX_HELD before
acquiring the mutex.
This fix defines a class NESTED_SINGLE == 1 and changes
that lock to call mutex_enter_nested with a subclass of
NESTED_SINGLE.
In order to make the userspace code compile,
include/sys/zfs_context.h now defines mutex_enter_nested and
NESTED_SINGLE.
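For reference, a plausible shape of the userspace shim described above (an assumed sketch, not necessarily the exact zfs_context.h code): userspace has no lockdep, so the subclass can simply be ignored and the nested variant can fall back to a plain mutex_enter.

/*
 * Assumed sketch of the userspace definitions: without lockdep the
 * subclass argument is meaningless, so it is dropped.
 */
#define	NESTED_SINGLE	1
#define	mutex_enter_nested(mp, class)	mutex_enter(mp)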
This is the lockdep report:
[ 122.950921] ============================================
[ 122.950921] WARNING: possible recursive locking detected
[ 122.950921] 4.19.29-4.19.0-debug-d69edad5368c1166 #1 Tainted: G O
[ 122.950921] --------------------------------------------
[ 122.950921] dbu_evict/1457 is trying to acquire lock:
[ 122.950921] 0000000083e9cbcf (&dn->dn_dbufs_mtx){+.+.}, at: dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921]
but task is already holding lock:
[ 122.950921] 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[ 122.950921]
other info that might help us debug this:
[ 122.950921] Possible unsafe locking scenario:
[ 122.950921] CPU0
[ 122.950921] ----
[ 122.950921] lock(&dn->dn_dbufs_mtx);
[ 122.950921] lock(&dn->dn_dbufs_mtx);
[ 122.950921]
*** DEADLOCK ***
[ 122.950921] May be due to missing lock nesting notation
[ 122.950921] 1 lock held by dbu_evict/1457:
[ 122.950921] #0: 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[ 122.950921]
stack backtrace:
[ 122.950921] CPU: 0 PID: 1457 Comm: dbu_evict Tainted: G O 4.19.29-4.19.0-debug-d69edad5368c1166 #1
[ 122.950921] Hardware name: Supermicro H8SSL-I2/H8SSL-I2, BIOS 080011 03/13/2009
[ 122.950921] Call Trace:
[ 122.950921] dump_stack+0x91/0xeb
[ 122.950921] __lock_acquire+0x2ca7/0x4f10
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? sched_clock_local+0xd8/0x130
[ 122.950921] ? sched_clock_cpu+0x133/0x170
[ 122.950921] ? arc_buf_destroy+0x21f/0x440 [zfs]
[ 122.950921] ? __lock_acquire+0xe3b/0x4f10
[ 122.950921] ? lock_acquire+0x153/0x330
[ 122.950921] lock_acquire+0x153/0x330
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x1e2/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] __mutex_lock+0xef/0x1380
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? __mutex_add_waiter+0x160/0x160
[ 122.950921] ? sched_clock_cpu+0x133/0x170
[ 122.950921] ? __mutex_unlock_slowpath+0xf3/0x660
[ 122.950921] ? check_flags.part.23+0x480/0x480
[ 122.950921] ? zrl_add_impl+0x7e/0x380 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] dbuf_evict_one+0x1cc/0x3d0 [zfs]
[ 122.950921] dbuf_rele_and_unlock+0xb84/0xd60 [zfs]
[ 122.950921] ? dbuf_evict_one+0x3d0/0x3d0 [zfs]
[ 122.950921] ? dbuf_destroy+0x8c4/0xdb0 [zfs]
[ 122.950921] ? rcu_read_lock_sched_held+0xeb/0x120
[ 122.950921] ? kmem_cache_free+0x1b5/0x1f0
[ 122.950921] ? arc_space_return+0x7e/0x180 [zfs]
[ 122.950921] dnode_evict_dbufs+0x3a6/0x740 [zfs]
[ 122.950921] dmu_objset_evict+0x7a/0x500 [zfs]
[ 122.950921] dsl_dataset_evict_async+0x70/0x480 [zfs]
[ 122.950921] taskq_thread+0x979/0x1480 [spl]
[ 122.950921] ? taskq_thread_should_stop+0x200/0x200 [spl]
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? wake_up_q+0xf0/0xf0
[ 122.950921] ? sched_clock_local+0xd8/0x130
[ 122.950921] ? dsl_dataset_check_quota+0x8a0/0x8a0 [zfs]
[ 122.950921] ? __kthread_parkme+0xad/0x180
[ 122.950921] ? taskq_thread_should_stop+0x200/0x200 [spl]
[ 122.950921] kthread+0x2e7/0x3e0
[ 122.950921] ? kthread_park+0x120/0x120
[ 122.950921] ret_from_fork+0x27/0x50
How Has This Been Tested?
The internal Akamai test suite which elicited this warning was rerun with no repeats and no regressions.
Types of changes
Checklist:
Signed-off-by: Jeff Dike <jdike@akamai.com>