Fix lockdep circular locking false positive involving sa_lock #9110
Conversation
module/zfs/sa.c
Outdated
@@ -1017,8 +1020,12 @@ sa_setup(objset_t *os, uint64_t sa_obj, sa_attr_reg_t *reg_attrs, int count,
	mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
Instead of creating a subclass per lock, why don't we go ahead and disable lockdep checking for the sa_lock? The following change should be sufficient.
index 4999fef34..f718e7662 100644
--- a/module/zfs/sa.c
+++ b/module/zfs/sa.c
@@ -1014,7 +1014,7 @@ sa_setup(objset_t *os, uint64_t sa_obj, sa_attr_reg_t *reg_attrs, int count,
 	}
 	sa = kmem_zalloc(sizeof (sa_os_t), KM_SLEEP);
-	mutex_init(&sa->sa_lock, NULL, MUTEX_DEFAULT, NULL);
+	mutex_init(&sa->sa_lock, NULL, MUTEX_NOLOCKDEP, NULL);
 	sa->sa_master_obj = sa_obj;
 	os->os_sa = sa;
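For background, here is a minimal sketch of how a MUTEX_NOLOCKDEP-style init type can be honored by a mutex wrapper: the type is remembered at init time, and lockdep tracking is switched off around acquisitions of locks created with that type. The kmutex_sketch_* names are hypothetical and this is not the actual SPL implementation, only an illustration of the mechanism.

#include <linux/mutex.h>
#include <linux/lockdep.h>

typedef enum {
	KMUTEX_SKETCH_DEFAULT,
	KMUTEX_SKETCH_NOLOCKDEP,
} kmutex_sketch_type_t;

typedef struct kmutex_sketch {
	struct mutex		km_mutex;
	kmutex_sketch_type_t	km_type;	/* remembered from init */
} kmutex_sketch_t;

static inline void
kmutex_sketch_init(kmutex_sketch_t *mp, kmutex_sketch_type_t type)
{
	mutex_init(&mp->km_mutex);
	mp->km_type = type;
}

static inline void
kmutex_sketch_enter(kmutex_sketch_t *mp)
{
#ifdef CONFIG_LOCKDEP
	if (mp->km_type == KMUTEX_SKETCH_NOLOCKDEP)
		lockdep_off();	/* stop recording dependencies in this task */
#endif
	mutex_lock(&mp->km_mutex);
#ifdef CONFIG_LOCKDEP
	if (mp->km_type == KMUTEX_SKETCH_NOLOCKDEP)
		lockdep_on();	/* resume normal lockdep tracking */
#endif
}

lockdep_off()/lockdep_on() only suspend dependency recording for the current task; the lock itself still behaves normally, it just stops feeding edges into lockdep's dependency graph.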
Hi Brian,
Thank you - that is a lot more pleasant.
Jeff
Codecov Report
@@            Coverage Diff             @@
##           master    #9110      +/-   ##
==========================================
+ Coverage   79.31%   79.65%   +0.33%
==========================================
  Files         400      279     -121
  Lines      121942    80846   -41096
==========================================
- Hits        96716    64394   -32322
+ Misses      25226    16452    -8774
Continue to review full report at Codecov.
Remove sa_lock from lockdep coverage
Motivation and Context
There are two different deadlock scenarios, but they share a common link, which is
thread 1 holding sa_lock and trying to get zap->zap_rwlock:
zap_lockdir_impl+0x858/0x16c0 [zfs]
zap_lockdir+0xd2/0x100 [zfs]
zap_lookup_norm+0x7f/0x100 [zfs]
zap_lookup+0x12/0x20 [zfs]
sa_setup+0x902/0x1380 [zfs]
zfsvfs_init+0x3d6/0xb20 [zfs]
zfsvfs_create+0x5dd/0x900 [zfs]
zfs_domount+0xa3/0xe20 [zfs]
and thread 2 trying to get sa_lock, either in sa_setup:
sa_setup+0x742/0x1380 [zfs]
zfsvfs_init+0x3d6/0xb20 [zfs]
zfsvfs_create+0x5dd/0x900 [zfs]
zfs_domount+0xa3/0xe20 [zfs]
or in sa_build_index:
sa_build_index+0x13d/0x790 [zfs]
sa_handle_get_from_db+0x368/0x500 [zfs]
zfs_znode_sa_init.isra.0+0x24b/0x330 [zfs]
zfs_znode_alloc+0x3da/0x1a40 [zfs]
zfs_zget+0x39a/0x6e0 [zfs]
zfs_root+0x101/0x160 [zfs]
zfs_domount+0x91f/0xea0 [zfs]
From there, there are different locking paths back to something holding zap->zap_rwlock.
The deadlock scenarios involve multiple different ZFS filesystems being mounted. sa_lock is common to these scenarios, and the sa struct involved is private to a mount. Therefore, these must be referring to different sa_lock instances and these deadlocks can't occur in practice.
The fix, from Brian Behlendorf, is to remove sa_lock from lockdep coverage by initializing it with MUTEX_NOLOCKDEP.
Signed-off-by: Jeff Dike <jdike@akamai.com>
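For readers less familiar with lockdep, the sketch below (hypothetical names and simplified call paths, not ZFS code) shows the class-versus-instance behavior behind this kind of false positive: lockdep keys a lock's class to the site where it is initialized, so every mount's sa_lock shares one class, and acquisition orders observed on different mounts are merged into a single dependency graph.

#include <linux/mutex.h>
#include <linux/rwsem.h>

struct mount_state {
	struct mutex		sa_lock;	/* one instance per mount */
	struct rw_semaphore	zap_rwlock;	/* stand-in for zap->zap_rwlock */
};

static void
mount_state_init(struct mount_state *ms)
{
	/*
	 * Every mount runs these same lines, so every sa_lock (and every
	 * zap_rwlock) ends up in a single lockdep class.
	 */
	mutex_init(&ms->sa_lock);
	init_rwsem(&ms->zap_rwlock);
}

/* Mount A, thread 1: takes sa_lock, then zap_rwlock (sa_setup()-like order). */
static void
thread1(struct mount_state *a)
{
	mutex_lock(&a->sa_lock);
	down_read(&a->zap_rwlock);
	up_read(&a->zap_rwlock);
	mutex_unlock(&a->sa_lock);
}

/* Mount B, thread 2: takes zap_rwlock, then sa_lock (zfs_root()-like order). */
static void
thread2(struct mount_state *b)
{
	down_read(&b->zap_rwlock);
	mutex_lock(&b->sa_lock);
	mutex_unlock(&b->sa_lock);
	up_read(&b->zap_rwlock);
}

/*
 * Lockdep records sa_lock -> zap_rwlock from thread1 and
 * zap_rwlock -> sa_lock from thread2.  Because a->sa_lock and b->sa_lock
 * share a class, it reports a possible circular dependency even though the
 * two orderings involve different mounts and cannot actually deadlock.
 */

With MUTEX_NOLOCKDEP, sa_lock acquisitions stop contributing edges to that graph, which is why the warning disappears without changing any real locking behavior.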
How Has This Been Tested?
Soaked for multiple days on multiple boxes with our test suite. No remaining lockdep warnings after this change.
Types of changes
Checklist:
Signed-off-by.