Fix zpl_mount() deadlock
Commit 93b43af inadvertently introduced the following scenario which
can result in a deadlock.  This issue was most easily reproduced with
LXD containers using a ZFS storage backend, but should be reproducible
under any workload which frequently mounts and unmounts filesystems.

```
-- THREAD A --
spa_sync()
  spa_sync_upgrades()
    rrw_enter(&dp->dp_config_rwlock, RW_WRITER, FTAG); <- Waiting on B

-- THREAD B --
mount_fs()
  zpl_mount()
    zpl_mount_impl()
      dmu_objset_hold()
        dmu_objset_hold_flags()
          dsl_pool_hold()
            dsl_pool_config_enter()
              rrw_enter(&dp->dp_config_rwlock, RW_READER, tag);
    sget()
      sget_userns()
        grab_super()
          down_write(&s->s_umount); <- Waiting on C

-- THREAD C --
cleanup_mnt()
  deactivate_super()
    down_write(&s->s_umount);
    deactivate_locked_super()
      zpl_kill_sb()
        kill_anon_super()
          generic_shutdown_super()
            sync_filesystem()
              zpl_sync_fs()
                zfs_sync()
                  zil_commit()
                    txg_wait_synced() <- Waiting on A
```
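
The three traces above form a circular wait: A waits on B, B waits on C, and C waits on A. As a rough illustration only, the sketch below models that wait-for graph in plain user-space C and checks it for a cycle. The names `has_deadlock`, `waits_before`, and `waits_after` are hypothetical and are not part of ZFS or the kernel; `waits_after` assumes that once thread B drops the pool config lock before calling sget(), thread A is no longer blocked.

```c
#include <assert.h>
#include <stdbool.h>

enum { THREAD_A, THREAD_B, THREAD_C, NTHREADS };

/*
 * waits_on[i] is the index of the thread that thread i is blocked on,
 * or -1 if thread i is runnable.  Returns true if following the edges
 * from any thread leads back to that same thread (a circular wait).
 */
bool
has_deadlock(const int waits_on[NTHREADS])
{
	for (int start = 0; start < NTHREADS; start++) {
		int cur = start;

		/* A cycle through NTHREADS nodes needs at most NTHREADS hops. */
		for (int hops = 0; hops < NTHREADS; hops++) {
			cur = waits_on[cur];
			if (cur < 0)
				break;		/* reached a runnable thread */
			assert(cur < NTHREADS);
			if (cur == start)
				return (true);
		}
	}
	return (false);
}

/* Scenario from the traces: A -> B -> C -> A. */
const int waits_before[NTHREADS] = { THREAD_B, THREAD_C, THREAD_A };

/* Assumed post-fix scenario: A is runnable, so C and then B can finish. */
const int waits_after[NTHREADS] = { -1, THREAD_C, THREAD_A };
```

Under these assumptions, `has_deadlock(waits_before)` reports a cycle while `has_deadlock(waits_after)` does not, since every chain of waiters ends at the runnable thread A.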

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7691
behlendorf committed Jul 9, 2018
1 parent 94370f5 commit 29aa053
Showing 2 changed files with 11 additions and 1 deletion.
1 change: 1 addition & 0 deletions include/sys/zfs_vfsops.h
@@ -32,6 +32,7 @@
 #include <sys/zil.h>
 #include <sys/sa.h>
 #include <sys/rrwlock.h>
+#include <sys/dsl_dataset.h>
 #include <sys/zfs_ioctl.h>

 #ifdef __cplusplus
11 changes: 10 additions & 1 deletion module/zfs/zpl_super.c
@@ -271,8 +271,17 @@ zpl_mount_impl(struct file_system_type *fs_type, int flags, zfs_mnt_t *zm)
 	if (err)
 		return (ERR_PTR(-err));

+	/*
+	 * The dsl pool lock must be released prior to calling zpl_sget().
+	 * Otherwise it is possible to block on the semaphore in grab_super(),
+	 * which is held by deactivate_super() waiting on spa_sync(), and in
+	 * turn the sync is blocked on zpl_mount_impl() holding the dsl pool
+	 * lock.  Only the dataset lock needs to be held over the zpl_sget().
+	 */
+	dsl_pool_rele(dmu_objset_pool(os), FTAG);
 	s = zpl_sget(fs_type, zpl_test_super, set_anon_super, flags, os);
-	dmu_objset_rele(os, FTAG);
+	dsl_dataset_rele(dmu_objset_ds(os), FTAG);

 	if (IS_ERR(s))
 		return (ERR_CAST(s));
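
The ordering in this hunk can be sketched in user-space C with simple hold counters. This is only a model under stated assumptions: `struct mock_objset` and the `mock_*` functions are hypothetical stand-ins, not ZFS APIs, and the counters merely represent the pool config reader hold and the dataset hold that dmu_objset_hold() acquires.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an objset and the two holds taken on it. */
struct mock_objset {
	int pool_holds;		/* models the reader hold on dp_config_rwlock */
	int dataset_holds;	/* models the hold on the dataset itself */
};

/* Models dmu_objset_hold(): acquires both the pool and dataset holds. */
void
mock_objset_hold(struct mock_objset *os)
{
	os->pool_holds++;
	os->dataset_holds++;
}

/* Models dsl_pool_rele(): drops only the pool config hold. */
void
mock_pool_rele(struct mock_objset *os)
{
	assert(os->pool_holds > 0);
	os->pool_holds--;
}

/* Models dsl_dataset_rele(): drops only the dataset hold. */
void
mock_dataset_rele(struct mock_objset *os)
{
	assert(os->dataset_holds > 0);
	os->dataset_holds--;
}

/*
 * Models the potentially blocking zpl_sget() call.  It is "safe" when
 * the pool hold has been dropped and the dataset is still held.
 */
bool
mock_sget(const struct mock_objset *os)
{
	return (os->pool_holds == 0 && os->dataset_holds > 0);
}

/* The ordering used by the patch; returns true if sget ran safely. */
bool
mock_mount_fixed(struct mock_objset *os)
{
	bool safe;

	mock_objset_hold(os);
	mock_pool_rele(os);	/* release the pool lock before blocking */
	safe = mock_sget(os);
	mock_dataset_rele(os);	/* dataset hold is no longer needed */
	return (safe);
}
```

Starting from a zeroed `struct mock_objset`, `mock_mount_fixed()` returns true and leaves both counters at zero, matching the intent that only the dataset hold spans the blocking call.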
