Skip to content

Commit

Permalink
Fix zpool on zvol lock inversion deadlock
Browse files Browse the repository at this point in the history
In all but one case the spa_namespace_lock is taken before the
bdev->bd_mutex lock.  But Linux __blkdev_get() function calls
fops->open() with the bdev->bd_mutex lock held and we must
somehow still safely acquire the spa_namespace_lock.

To avoid a potential lock inversion deadlock we preemptively
try to take the spa_namespace_lock().  Normally it will not
be contended and this is safe because spa_open_common() handles
the case where the caller already holds the spa_namespace_lock.

When it is contended we risk a lock inversion if we were to
block waiting for the lock.  Luckily, the __blkdev_get()
function allows us to return -ERESTARTSYS which will result in
bdev->bd_mutex being dropped, reacquired, and fops->open() being
called again.  This process can be repeated safely until both
locks are acquired.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #612
  • Loading branch information
behlendorf committed Dec 20, 2012
1 parent d5446cf commit 65d5608
Showing 1 changed file with 28 additions and 0 deletions.
28 changes: 28 additions & 0 deletions module/zfs/zvol.c
Original file line number Diff line number Diff line change
Expand Up @@ -891,11 +891,39 @@ zvol_first_open(zvol_state_t *zv)
{
objset_t *os;
uint64_t volsize;
int locked = 0;
int error;
uint64_t ro;

/*
* In all other cases the spa_namespace_lock is taken before the
* bdev->bd_mutex lock. But in this case the Linux __blkdev_get()
* function calls fops->open() with the bdev->bd_mutex lock held.
*
* To avoid a potential lock inversion deadlock we preemptively
* try to take the spa_namespace_lock(). Normally it will not
* be contended and this is safe because spa_open_common() handles
* the case where the caller already holds the spa_namespace_lock.
*
* When it is contended we risk a lock inversion if we were to
* block waiting for the lock. Luckily, the __blkdev_get()
* function allows us to return -ERESTARTSYS which will result in
* bdev->bd_mutex being dropped, reacquired, and fops->open() being
* called again. This process can be repeated safely until both
* locks are acquired.
*/
if (!mutex_owned(&spa_namespace_lock)) {
locked = mutex_tryenter(&spa_namespace_lock);
if (!locked)
return (-ERESTARTSYS);
}

/* lie and say we're read-only */
error = dmu_objset_own(zv->zv_name, DMU_OST_ZVOL, 1, zvol_tag, &os);

if (locked)
mutex_exit(&spa_namespace_lock);

if (error)
return (-error);

Expand Down

0 comments on commit 65d5608

Please sign in to comment.