Fix lockdep recursive locking false positive in dbuf_destroy #8984
Conversation
Looks good. Was this the only issue reported by lockdep? I ask because I thought we still had a couple of other false positives which had not yet been resolved. But if that's not the case, then we should close them. See:
https://github.com/zfsonlinux/zfs/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+%22lockdep%3A%22+
Hi Brian,
No, it's not the only issue. It is the simpler of the two that I have.
I'm having trouble building my fix for the second. It's simple enough, but I haven't figured out how to get atomic_t references to build in the non-SPL part of ZFS.
Here's what I'm trying to do (this is RFC material, so I don't care if Exchange mangles it...):
diff --git a/include/sys/bpobj.h b/include/sys/bpobj.h
index d425e239f..ababcd132 100644
--- a/include/sys/bpobj.h
+++ b/include/sys/bpobj.h
@@ -55,6 +55,7 @@ typedef struct bpobj_phys {
typedef struct bpobj {
kmutex_t bpo_lock;
+ int bpo_id;
objset_t *bpo_os;
uint64_t bpo_object;
int bpo_epb;
diff --git a/module/zfs/bpobj.c b/module/zfs/bpobj.c
index 633801956..a4d00f510 100644
--- a/module/zfs/bpobj.c
+++ b/module/zfs/bpobj.c
@@ -143,6 +143,8 @@ out:
VERIFY3U(0, ==, dmu_object_free(os, obj, tx));
}
+static atomic_t next_bpo_id = ATOMIC_INIT(0);
+
int
bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
{
@@ -172,6 +174,7 @@ bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
bpo->bpo_havecomp = (doi.doi_bonus_size > BPOBJ_SIZE_V0);
bpo->bpo_havesubobj = (doi.doi_bonus_size > BPOBJ_SIZE_V1);
bpo->bpo_phys = bpo->bpo_dbuf->db_data;
+ bpo->bpo_id = atomic_inc_return(&next_bpo_id);
return (0);
}
@@ -818,7 +821,7 @@ int
bpobj_space(bpobj_t *bpo, uint64_t *usedp, uint64_t *compp, uint64_t *uncompp)
{
ASSERT(bpobj_is_open(bpo));
- mutex_enter(&bpo->bpo_lock);
+ mutex_enter_nested(&bpo->bpo_lock, bpo->bpo_id);
*usedp = bpo->bpo_phys->bpo_bytes;
if (bpo->bpo_havecomp) {
The situation (which I have described in much more detail in my local commit) is that lockdep is seeing possible circular locking in a path which involves two different txg_sync_threads. The reasoning that lets me conclude that this is a false positive is that those two txg_sync_threads must be referring to different lock bpo_lock instances. The patch above should be pretty self-explanatory.
My problem is that I can't get this to build in ZFS git. Including <sys/atomic.h> doesn't work in bpobj.c, even though it does in other files, such as zfs_znode.c. The include is inside #ifdef KERNEL there, and I don't see how that matters because the Makefile treats bpobj.c and zfs_znode.c the same.
Any clues gratefully accepted...
Jeff
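For illustration only: one way to sidestep the Linux-specific atomic_t here would be the Solaris-style atomics that ZFS already ships in <sys/atomic.h>. The sketch below is untested and assumes atomic_inc_32_nv() is usable from both the kernel module and the userspace (libzpool) builds.

/*
 * Sketch only: a per-instance ID counter without Linux atomic_t, using the
 * Solaris-style atomics from <sys/atomic.h>. atomic_inc_32_nv() returns the
 * new value, like atomic_inc_return().
 */
#include <sys/atomic.h>

static volatile uint32_t next_bpo_id = 0;

static uint32_t
bpo_alloc_id(void)
{
	return (atomic_inc_32_nv(&next_bpo_id));
}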
On Wed, 2019-07-03 at 09:43 -0700, Brian Behlendorf wrote:
@behlendorf approved this pull request.
Looks good. Was this the only issue reported by lockdep? I ask because I thought we still had a couple of other false positives which had not yet been resolved. But if that's not the case, then we should close them. See:
https://github.com/zfsonlinux/zfs/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+%22lockdep%3A%22+
Would you mind posting the lockdep analysis somewhere? I'd like to take a look.
@jdike you should also be able to silence a false positive by using the
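The suggestion above is cut off in the capture. Purely as an assumption about what was meant: the Linux SPL also provides a MUTEX_NOLOCKDEP mutex type, which opts a single lock out of lockdep tracking at init time. A minimal, untested sketch:

/*
 * Sketch only (assumption about the truncated suggestion above):
 * initializing a mutex with MUTEX_NOLOCKDEP removes it from lockdep's
 * dependency tracking, silencing false positives on that lock.
 */
static kmutex_t example_lock;

static void
example_init(void)
{
	mutex_init(&example_lock, NULL, MUTEX_NOLOCKDEP, NULL);
}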
Hi Brian,
My local commit is below...
Jeff
commit d27d578 (HEAD -> locking)
Author: Jeff Dike <jdike@akamai.com>
Date: Mon Jul 1 11:16:55 2019 -0400
Quiet lockdep circular lock warning
This is a false positive. The full lockdep warning is below. The gist of it is:
the existing dependency chain (in reverse order) is:
-> #3 (&dp->dp_lock){+.+.}:
dsl_pool_dirty_space+0x70/0x2d0 [zfs]
dbuf_dirty+0x778/0x31d0 [zfs]
-> #2 (&db->db_mtx){+.+.}:
dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
bpobj_iterate_impl+0xbe6/0x1410 [zfs]
spa_sync+0x1756/0x29b0 [zfs]
txg_sync_thread+0x6d6/0x1370 [zfs]
-> #1 (&bpo->bpo_lock){+.+.}:
bpobj_space+0x63/0x470 [zfs]
dsl_scan_active+0x340/0x3d0 [zfs]
txg_sync_thread+0x3f2/0x1370 [zfs]
-> #0 (&tx->tx_sync_lock){+.+.}:
__mutex_lock+0xef/0x1380
txg_kick+0x61/0x420 [zfs]
dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
Different instances of txg_sync_thread manage different dsl_pools,
which contain different bpo_lock instances. Stacks #1 and #2
therefore can't be referring to the same bpo_lock instance, so this
can't be an actual loop.
The patch adds an ID field (bpo_id) to struct bpobj which is unique to each instance. This bpo_id is passed as the class to the mutex_enter_nested call in bpobj_space. This effectively makes this locking call invisible to lockdep.
[ 125.297679] ======================================================
[ 125.301361] hardirqs last enabled at (28216159): [<ffffffff885cef12>] handle_mm_fault+0x582/0x7d0
[ 125.302863] WARNING: possible circular locking dependency detected
[ 125.302865] 4.19.29-4.19.0-debug-99d9c44b25c08f51 #1 Tainted: G O
[ 125.309056] hardirqs last disabled at (28216160): [<ffffffff88530f22>] get_page_from_freelist+0x3b2/0x31a0
[ 125.318009] ------------------------------------------------------
[ 125.318011] mdtagent/13103 is trying to acquire lock:
[ 125.324203] softirqs last enabled at (28215144): [<ffffffff89c00631>] __do_softirq+0x631/0x868
[ 125.331591] 00000000679a2c2b (&tx->tx_sync_lock){+.+.}, at: txg_kick+0x61/0x420 [zfs]
[ 125.341249] softirqs last disabled at (28215139): [<ffffffff8813ef80>] irq_exit+0x150/0x170
[ 125.378920]
but task is already holding lock:
[ 125.384762] 00000000ec7e955e (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
[ 125.393679]
which lock already depends on the new lock.
[ 125.401861]
the existing dependency chain (in reverse order) is:
[ 125.409347]
-> #3 (&dp->dp_lock){+.+.}:
[ 125.414710] dsl_pool_dirty_space+0x70/0x2d0 [zfs]
[ 125.420068] dbuf_dirty+0x778/0x31d0 [zfs]
[ 125.424723] dmu_write_uio_dnode+0xfe/0x300 [zfs]
[ 125.429990] dmu_write_uio_dbuf+0xa0/0x100 [zfs]
[ 125.435174] zfs_write+0x18bd/0x1f70 [zfs]
[ 125.439837] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.445452] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.450334] __vfs_write+0x3a2/0x5b0
[ 125.454440] vfs_write+0x15d/0x460
[ 125.458365] ksys_write+0xb1/0x170
[ 125.462295] do_syscall_64+0x9b/0x400
[ 125.466491] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 125.472069]
-> #2 (&db->db_mtx){+.+.}:
[ 125.477337] dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
[ 125.482509] bpobj_iterate_impl+0xbe6/0x1410 [zfs]
[ 125.487863] spa_sync+0x1756/0x29b0 [zfs]
[ 125.492442] txg_sync_thread+0x6d6/0x1370 [zfs]
[ 125.497499] thread_generic_wrapper+0x1ff/0x2a0 [spl]
[ 125.503080] kthread+0x2e7/0x3e0
[ 125.506839] ret_from_fork+0x24/0x50
[ 125.510937]
-> #1 (&bpo->bpo_lock){+.+.}:
[ 125.516464] bpobj_space+0x63/0x470 [zfs]
[ 125.521038] dsl_scan_active+0x340/0x3d0 [zfs]
[ 125.526049] txg_sync_thread+0x3f2/0x1370 [zfs]
[ 125.531106] thread_generic_wrapper+0x1ff/0x2a0 [spl]
[ 125.536684] kthread+0x2e7/0x3e0
[ 125.540439] ret_from_fork+0x24/0x50
[ 125.544535]
-> #0 (&tx->tx_sync_lock){+.+.}:
[ 125.550295] __mutex_lock+0xef/0x1380
[ 125.554529] txg_kick+0x61/0x420 [zfs]
[ 125.558841] dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
[ 125.564715] dmu_tx_assign+0x13b/0xe40 [zfs]
[ 125.569557] zfs_write+0xb5e/0x1f70 [zfs]
[ 125.574130] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.579746] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.584628] __vfs_write+0x3a2/0x5b0
[ 125.588726] vfs_write+0x15d/0x460
[ 125.592654] ksys_write+0xb1/0x170
[ 125.596580] do_int80_syscall_32+0xf1/0x460
[ 125.601287] entry_INT80_compat+0x8a/0xa0
[ 125.605817]
other info that might help us debug this:
[ 125.613825] Chain exists of:
&tx->tx_sync_lock --> &db->db_mtx --> &dp->dp_lock
[ 125.624174] Possible unsafe locking scenario:
[ 125.630099]        CPU0                    CPU1
[ 125.634633]        ----                    ----
[ 125.639166]   lock(&dp->dp_lock);
[ 125.642485]                                lock(&db->db_mtx);
[ 125.648232]                                lock(&dp->dp_lock);
[ 125.654071]   lock(&tx->tx_sync_lock);
[ 125.657823]
*** DEADLOCK ***
[ 125.663743] 3 locks held by mdtagent/13103:
[ 125.667928] #0: 0000000016b80a9c (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xb1/0xe0
[ 125.675591] #1: 00000000d0227133 (sb_writers#12){.+.+}, at: vfs_write+0x332/0x460
[ 125.683167] #2: 00000000ec7e955e (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
[ 125.692517]
stack backtrace:
[ 125.696878] CPU: 28 PID: 13103 Comm: mdtagent Tainted: G O 4.19.29-4.19.0-debug-99d9c44b25c08f51 #1
[ 125.707226] Hardware name: Ciara Technologies 1x16-X7p-L-team/X7P-MB, BIOS P2.12 03/05/2019
[ 125.715578] Call Trace:
[ 125.718038] dump_stack+0x91/0xeb
[ 125.721364] print_circular_bug.isra.16+0x30b/0x5b0
[ 125.726249] ? save_trace+0xd6/0x240
[ 125.729827] __lock_acquire+0x41be/0x4f10
[ 125.733841] ? debug_show_all_locks+0x2d0/0x2d0
[ 125.738373] ? debug_show_all_locks+0x2d0/0x2d0
[ 125.742907] ? kernel_text_address+0x6d/0x100
[ 125.747276] ? __save_stack_trace+0x73/0xd0
[ 125.751468] ? quarantine_put+0xb0/0x150
[ 125.755396] ? lock_acquire+0x153/0x330
[ 125.759242] lock_acquire+0x153/0x330
[ 125.762966] ? txg_kick+0x61/0x420 [zfs]
[ 125.766896] ? lock_contended+0xd60/0xd60
[ 125.770958] ? txg_kick+0x61/0x420 [zfs]
[ 125.774885] __mutex_lock+0xef/0x1380
[ 125.778603] ? txg_kick+0x61/0x420 [zfs]
[ 125.782572] ? txg_kick+0x61/0x420 [zfs]
[ 125.786500] ? __mutex_add_waiter+0x160/0x160
[ 125.790866] ? sched_clock+0x5/0x10
[ 125.794359] ? __mutex_add_waiter+0x160/0x160
[ 125.798721] ? __cv_destroy+0x9e/0x180 [spl]
[ 125.803002] ? cv_destroy_wakeup+0xc0/0xc0 [spl]
[ 125.807628] ? __mutex_unlock_slowpath+0xf3/0x660
[ 125.812384] ? zio_wait+0x419/0x660 [zfs]
[ 125.816442] ? txg_kick+0x61/0x420 [zfs]
[ 125.820412] txg_kick+0x61/0x420 [zfs]
[ 125.824209] dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
[ 125.829563] dmu_tx_assign+0x13b/0xe40 [zfs]
[ 125.833872] ? dmu_tx_count_write+0x42f/0x700 [zfs]
[ 125.838800] zfs_write+0xb5e/0x1f70 [zfs]
[ 125.842816] ? sched_clock+0x5/0x10
[ 125.846358] ? zfs_close+0x1e0/0x1e0 [zfs]
[ 125.850457] ? sched_clock+0x5/0x10
[ 125.853951] ? sched_clock_cpu+0x18/0x170
[ 125.857971] ? check_flags.part.23+0x480/0x480
[ 125.862418] ? __lock_acquire+0xe3b/0x4f10
[ 125.866526] ? __fget+0x2d6/0x3e0
[ 125.869852] ? sched_clock+0x5/0x10
[ 125.873345] ? sched_clock_cpu+0x18/0x170
[ 125.877402] zpl_write_common_iovec+0x15e/0x380 [zfs]
[ 125.882453] ? __lock_acquire+0xe3b/0x4f10
[ 125.886596] ? zpl_read_common_iovec+0x2b0/0x2b0 [zfs]
[ 125.891779] zpl_iter_write+0x1c6/0x2a0 [zfs]
[ 125.896139] __vfs_write+0x3a2/0x5b0
[ 125.899717] ? __fdget_pos+0xb1/0xe0
[ 125.903297] ? kernel_read+0x130/0x130
[ 125.907049] ? __mutex_add_waiter+0x160/0x160
[ 125.911407] ? lock_acquire+0x153/0x330
[ 125.915248] ? lock_acquire+0x153/0x330
[ 125.919089] ? rcu_read_lock_sched_held+0xeb/0x120
[ 125.923887] ? rcu_sync_lockdep_assert+0x76/0xb0
[ 125.928509] ? __sb_start_write+0xf0/0x260
[ 125.932614] vfs_write+0x15d/0x460
[ 125.936022] ksys_write+0xb1/0x170
[ 125.939425] ? __ia32_sys_read+0xb0/0xb0
[ 125.943353] ? poll_select_set_timeout+0xc0/0xc0
[ 125.947974] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 125.952601] ? do_int80_syscall_32+0x17/0x460
[ 125.956960] do_int80_syscall_32+0xf1/0x460
[ 125.961145] entry_INT80_compat+0x8a/0xa0
diff --git a/include/sys/bpobj.h b/include/sys/bpobj.h
index d425e239f..ababcd132 100644
--- a/include/sys/bpobj.h
+++ b/include/sys/bpobj.h
@@ -55,6 +55,7 @@ typedef struct bpobj_phys {
typedef struct bpobj {
kmutex_t bpo_lock;
+ int bpo_id;
objset_t *bpo_os;
uint64_t bpo_object;
int bpo_epb;
diff --git a/module/zfs/bpobj.c b/module/zfs/bpobj.c
index 633801956..a4d00f510 100644
--- a/module/zfs/bpobj.c
+++ b/module/zfs/bpobj.c
@@ -143,6 +143,8 @@ out:
VERIFY3U(0, ==, dmu_object_free(os, obj, tx));
}
+static atomic_t next_bpo_id = ATOMIC_INIT(0);
+
int
bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
{
@@ -172,6 +174,7 @@ bpobj_open(bpobj_t *bpo, objset_t *os, uint64_t object)
bpo->bpo_havecomp = (doi.doi_bonus_size > BPOBJ_SIZE_V0);
bpo->bpo_havesubobj = (doi.doi_bonus_size > BPOBJ_SIZE_V1);
bpo->bpo_phys = bpo->bpo_dbuf->db_data;
+ bpo->bpo_id = atomic_inc_return(&next_bpo_id);
return (0);
}
@@ -818,7 +821,7 @@ int
bpobj_space(bpobj_t *bpo, uint64_t *usedp, uint64_t *compp, uint64_t *uncompp)
{
ASSERT(bpobj_is_open(bpo));
- mutex_enter(&bpo->bpo_lock);
+ mutex_enter_nested(&bpo->bpo_lock, bpo->bpo_id);
*usedp = bpo->bpo_phys->bpo_bytes;
if (bpo->bpo_havecomp) {
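For context on the mutex_enter_nested() call used above: under the Linux SPL it ultimately forwards the subclass to the kernel's mutex_lock_nested(), which is what keeps acquisitions with distinct subclasses from being chained together by lockdep. A simplified sketch of that mapping, assuming the kmutex_t wraps a struct mutex in a field named m_mutex and ignoring the SPL's owner bookkeeping:

/*
 * Simplified sketch (not the actual SPL source): the subclass argument is
 * passed through to mutex_lock_nested(), so lockdep treats acquisitions
 * with different subclasses as belonging to different lock classes.
 */
#define	mutex_enter_nested_sketch(mp, subclass)	\
	mutex_lock_nested(&(mp)->m_mutex, (subclass))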
On Wed, 2019-07-03 at 15:49 -0700, Brian Behlendorf wrote:
Would you mind posting the lockdep analysis somewhere? I'd like to take a look.
@jdike I see, your analysis looks correct to me, and the proposed change should make lockdep happy. That said, now that I can see all the stacks, the one calling txg_kick() concerns me a bit. This shouldn't need to be called under the dp->dp_lock lock; that lock is there to protect dp->dp_dirty_total and dp->dp_dirty_total. What if instead we did something like this, which should break the cycle and conform to the expected ordering (entirely untested).
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 864376c..7da78f7 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,17 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ boolean_t do_kick, do_delay;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ do_kick = (dp->dp_dirty_total > dirty_min_bytes);
+ do_delay = (dp->dp_dirty_total > delay_min_bytes);
mutex_exit(&dp->dp_lock);
- return (rv);
+
+ if (do_kick)
+ txg_kick(dp);
+
+ return (do_delay);
}
void
Hi Brian,
Thanks for looking at my commit.
I'll test your change here and if I don't see any new problems, I'll send a Tested-By.
Jeff
@jdike let's move forward with this PR as-is, since it addresses one lockdep failure. We can follow up with a second PR with the alternate fix proposed above, assuming it passes your testing.
Would you mind rebasing this on master and adding your Signed-off-by?
Codecov Report
@@ Coverage Diff @@
## master #8984 +/- ##
==========================================
+ Coverage 78.64% 78.69% +0.04%
==========================================
Files 402 402
Lines 121004 120976 -28
==========================================
+ Hits 95165 95203 +38
+ Misses 25839 25773 -66
Continue to review full report at Codecov.
Hi Brian,
I have another question. In sa.c, we have:
895 sa = kmem_zalloc(sizeof (sa_os_t), KM_SLEEP);
896 mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
897 sa->sa_master_obj = sa_obj;
898 os->os_sa = sa;
899 mutex_enter(&sa->sa_lock);
900 mutex_exit(&os->os_user_ptr_lock);
901 avl_create(&sa->sa_layout_num_tree, layout_num_compare,
902 sizeof (sa_lot_t), offsetof(sa_lot_t, lot_num_node));
This looks to me like allocating an object (line 895), making it globally visible (line 898), immediately locking it (line 899), and initializing it (line 901 and down). This brings up two questions:
1 - Aren't 898 and 899 racy, i.e., couldn't something else grab the object before it's locked on 899? That being the case, wouldn't it be safer to switch those two lines?
2 - However, the bigger question is why make it visible and then lock it? Why not initialize it and make it visible when you're done, and not lock it at all here?
This is showing up in two more lockdep warnings, and eliminating it will fix both of them.
I'm trying to get through our test suite without any ZFS complaints, and more just keep popping up.
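To illustrate question 2 above, here is a minimal sketch of the suggested reordering (not the actual sa.c code; the remaining initialization and the surrounding os_user_ptr_lock handling are elided):

/*
 * Sketch only: fully initialize the sa_os_t first and publish it in
 * os->os_sa last, so nothing can observe it half-initialized and no
 * lock needs to be held across the setup.
 */
sa_os_t *sa = kmem_zalloc(sizeof (sa_os_t), KM_SLEEP);
mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
sa->sa_master_obj = sa_obj;
avl_create(&sa->sa_layout_num_tree, layout_num_compare,
    sizeof (sa_lot_t), offsetof(sa_lot_t, lot_num_node));
/* ... remaining initialization ... */
os->os_sa = sa;		/* publish only after initialization is complete */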
BTW, I turned your last patch into this:
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 49e5279..4930160 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,14 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ uint64_t dirty;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ dirty = dp->dp_dirty_total;
mutex_exit(&dp->dp_lock);
- return (rv);
+ if (dirty > dirty_min_bytes)
+ txg_kick(dp);
+ return (dirty > delay_min_bytes);
}
void
This looks functionally the same and a bit simpler to me.
Jeff
On 7/8/19 8:55 PM, Brian Behlendorf wrote:
@jdike I see, your analysis looks correct to me, and the proposed change should make lockdep happy.
That said, now that I can see all the stacks, the one calling txg_kick() concerns me a bit. This shouldn't need to be called under the dp->dp_lock lock; that lock is there to protect dp->dp_dirty_total and dp->dp_dirty_total. What if instead we did something like this, which should break the cycle and conform to the expected ordering (entirely untested).
diff --git a/module/zfs/dsl_pool.c b/module/zfs/dsl_pool.c
index 864376c..7da78f7 100644
--- a/module/zfs/dsl_pool.c
+++ b/module/zfs/dsl_pool.c
@@ -888,14 +888,17 @@ dsl_pool_need_dirty_delay(dsl_pool_t *dp)
zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
uint64_t dirty_min_bytes =
zfs_dirty_data_max * zfs_dirty_data_sync_percent / 100;
- boolean_t rv;
+ boolean_t do_kick, do_delay;
mutex_enter(&dp->dp_lock);
- if (dp->dp_dirty_total > dirty_min_bytes)
- txg_kick(dp);
- rv = (dp->dp_dirty_total > delay_min_bytes);
+ do_kick = (dp->dp_dirty_total > dirty_min_bytes);
+ do_delay = (dp->dp_dirty_total > delay_min_bytes);
mutex_exit(&dp->dp_lock);
- return (rv);
+
+ if (do_kick)
+ txg_kick(dp);
+
+ return (do_delay);
}
void
lockdep reports a possible recursive lock in dbuf_destroy.
It is true that dbuf_destroy is acquiring the dn_dbufs_mtx
on one dnode while holding it on another dnode. However,
it is impossible for these to be the same dnode because,
among other things, dbuf_destroy checks MUTEX_HELD before
acquiring the mutex.
This fix defines a class NESTED_SINGLE == 1 and changes
that lock to call mutex_enter_nested with a subclass of
NESTED_SINGLE.
In order to make the userspace code compile,
include/sys/zfs_context.h now defines mutex_enter_nested and
NESTED_SINGLE.
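For reference, a plausible shape of the userspace shim described above (an assumed sketch, not necessarily the exact zfs_context.h code): userspace has no lockdep, so the subclass can simply be ignored and the nested variant can fall back to a plain mutex_enter.

/*
 * Assumed sketch of the userspace definitions: without lockdep the
 * subclass argument is meaningless, so it is dropped.
 */
#define	NESTED_SINGLE	1
#define	mutex_enter_nested(mp, class)	mutex_enter(mp)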
This is the lockdep report:
[ 122.950921] ============================================
[ 122.950921] WARNING: possible recursive locking detected
[ 122.950921] 4.19.29-4.19.0-debug-d69edad5368c1166 #1 Tainted: G O
[ 122.950921] --------------------------------------------
[ 122.950921] dbu_evict/1457 is trying to acquire lock:
[ 122.950921] 0000000083e9cbcf (&dn->dn_dbufs_mtx){+.+.}, at: dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921]
but task is already holding lock:
[ 122.950921] 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[ 122.950921]
other info that might help us debug this:
[ 122.950921] Possible unsafe locking scenario:
[ 122.950921] CPU0
[ 122.950921] ----
[ 122.950921] lock(&dn->dn_dbufs_mtx);
[ 122.950921] lock(&dn->dn_dbufs_mtx);
[ 122.950921]
*** DEADLOCK ***
[ 122.950921] May be due to missing lock nesting notation
[ 122.950921] 1 lock held by dbu_evict/1457:
[ 122.950921] #0: 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[ 122.950921]
stack backtrace:
[ 122.950921] CPU: 0 PID: 1457 Comm: dbu_evict Tainted: G O 4.19.29-4.19.0-debug-d69edad5368c1166 #1
[ 122.950921] Hardware name: Supermicro H8SSL-I2/H8SSL-I2, BIOS 080011 03/13/2009
[ 122.950921] Call Trace:
[ 122.950921] dump_stack+0x91/0xeb
[ 122.950921] __lock_acquire+0x2ca7/0x4f10
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? sched_clock_local+0xd8/0x130
[ 122.950921] ? sched_clock_cpu+0x133/0x170
[ 122.950921] ? arc_buf_destroy+0x21f/0x440 [zfs]
[ 122.950921] ? __lock_acquire+0xe3b/0x4f10
[ 122.950921] ? lock_acquire+0x153/0x330
[ 122.950921] lock_acquire+0x153/0x330
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x1e2/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] __mutex_lock+0xef/0x1380
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] ? __mutex_add_waiter+0x160/0x160
[ 122.950921] ? sched_clock_cpu+0x133/0x170
[ 122.950921] ? __mutex_unlock_slowpath+0xf3/0x660
[ 122.950921] ? check_flags.part.23+0x480/0x480
[ 122.950921] ? zrl_add_impl+0x7e/0x380 [zfs]
[ 122.950921] ? dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] dbuf_destroy+0x3c0/0xdb0 [zfs]
[ 122.950921] dbuf_evict_one+0x1cc/0x3d0 [zfs]
[ 122.950921] dbuf_rele_and_unlock+0xb84/0xd60 [zfs]
[ 122.950921] ? dbuf_evict_one+0x3d0/0x3d0 [zfs]
[ 122.950921] ? dbuf_destroy+0x8c4/0xdb0 [zfs]
[ 122.950921] ? rcu_read_lock_sched_held+0xeb/0x120
[ 122.950921] ? kmem_cache_free+0x1b5/0x1f0
[ 122.950921] ? arc_space_return+0x7e/0x180 [zfs]
[ 122.950921] dnode_evict_dbufs+0x3a6/0x740 [zfs]
[ 122.950921] dmu_objset_evict+0x7a/0x500 [zfs]
[ 122.950921] dsl_dataset_evict_async+0x70/0x480 [zfs]
[ 122.950921] taskq_thread+0x979/0x1480 [spl]
[ 122.950921] ? taskq_thread_should_stop+0x200/0x200 [spl]
[ 122.950921] ? debug_show_all_locks+0x2d0/0x2d0
[ 122.950921] ? wake_up_q+0xf0/0xf0
[ 122.950921] ? sched_clock_local+0xd8/0x130
[ 122.950921] ? dsl_dataset_check_quota+0x8a0/0x8a0 [zfs]
[ 122.950921] ? __kthread_parkme+0xad/0x180
[ 122.950921] ? taskq_thread_should_stop+0x200/0x200 [spl]
[ 122.950921] kthread+0x2e7/0x3e0
[ 122.950921] ? kthread_park+0x120/0x120
[ 122.950921] ret_from_fork+0x27/0x50
How Has This Been Tested?
The internal Akamai test suite which elicited this warning was rerun with no repeats and no regressions.
Types of changes
Checklist:
Signed-off-by: Jeff Dike <jdike@akamai.com>