Fix deadlocks in DMU #726

Closed
wants to merge 4 commits into from

Commits on Jun 29, 2012

  1. Revert "Fix ASSERTION(!dsl_pool_sync_context(tx->tx_pool))"

    Commit eec8164 worked around an issue involving direct reclaim through
    the use of PF_MEMALLOC.  Since we are reworking things to use
    KM_PUSHPAGE so that swap works, we revert this patch in favor of the
    use of KM_PUSHPAGE in the affected areas.
    
    Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#726
    ryao committed Jun 29, 2012
    7fdf111
  2. Revert "Fix zpl_writepage() deadlock"

    Commit cfc9a5c, which was meant to fix deadlocks in zpl_writepage(),
    relied on PF_MEMALLOC.  That had the effect of disabling the direct
    reclaim path on all allocations originating from calls to this
    function, but it failed to address the actual cause of those
    deadlocks.  As a result, the same deadlocks were still observed with
    swap on zvols, though not with swap on the loop device, which
    exercises this code.
    
    The use of PF_MEMALLOC also had the side effect of permitting
    allocations to be made from ZONE_DMA in instances that did not require
    it.  This contributes to the possibility of panics caused by depletion
    of pages from ZONE_DMA.
    
    As such, we revert this patch in favor of a proper fix for both issues.
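
The side effect described above comes from PF_MEMALLOC being a per-thread flag rather than a per-allocation one. The following is a minimal user-space sketch of that idea, not kernel code: `model_task`, `MODEL_PF_MEMALLOC`, `model_alloc()` and `model_writepage_old()` are hypothetical names, and the save/set/restore pattern is an assumption about how such a workaround is typically written, not a copy of the reverted commit.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-thread flag word (current->flags). */
#define MODEL_PF_MEMALLOC 0x1u

struct model_task {
    unsigned flags;
};

/* Counts how many allocations ran with and without the thread flag set. */
struct model_stats {
    int reserve_eligible; /* allocations allowed to dip into reserves */
    int ordinary;         /* allocations under normal limits */
};

/* Model allocator: it only looks at the thread flag, so it cannot tell
 * which callers actually needed the special treatment. */
static void model_alloc(const struct model_task *t, struct model_stats *s)
{
    assert(t != 0 && s != 0);
    if (t->flags & MODEL_PF_MEMALLOC)
        s->reserve_eligible++;
    else
        s->ordinary++;
}

/* Assumed shape of the old workaround: set the flag around the whole
 * call, so every nested allocation inherits it. */
static void model_writepage_old(struct model_task *t, struct model_stats *s)
{
    unsigned saved = t->flags & MODEL_PF_MEMALLOC;

    t->flags |= MODEL_PF_MEMALLOC;
    model_alloc(t, s); /* the allocation the workaround targeted */
    model_alloc(t, s); /* an unrelated nested allocation, also affected */
    if (!saved)
        t->flags &= ~MODEL_PF_MEMALLOC;
}
```

In this model both allocations made inside `model_writepage_old()` are counted as reserve-eligible, including the one that did not need it, which is the kind of over-broad effect the commit message attributes to PF_MEMALLOC.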
    
    Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#726
    ryao committed Jun 29, 2012
    77d8b3e
  3. Revert "Disable direct reclaim for z_wr_* threads"

    The reverted commit (21ade34) used PF_MEMALLOC to prevent a memory
    reclaim deadlock.  However, commit 49be0cc eliminated the invocation
    of __cv_init(), which was the cause of the deadlock.
    PF_MEMALLOC has the side effect of permitting pages from ZONE_DMA
    to be allocated.  The use of PF_MEMALLOC was found to cause stability
    problems when doing swap on zvols. Since this technique is known to
    cause problems and no longer fixes anything, we revert it.
    
    Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#726
    ryao committed Jun 29, 2012
    1dfb3c3
  4. Switch KM_SLEEP to KM_PUSHPAGE

    Differences between how paging is done on Solaris and Linux can cause
    deadlocks if KM_SLEEP is used in any of the following contexts.
    
      * The z_wr_* threads
      * The txg_sync thread
      * The zvol write/discard threads
      * The zpl_putpage() VFS callback
    
    This is because KM_SLEEP allows direct reclaim, which may result in
    the VM calling back into the filesystem or block layer to write out
    pages.  If a lock is held over this operation, the potential exists to
    deadlock the system.  To ensure forward progress, all memory
    allocations in these contexts must use KM_PUSHPAGE, which prevents any
    I/O from being performed to satisfy the memory allocation.
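
The reasoning above can be expressed as a small user-space model. This is only a sketch under stated assumptions: the `MODEL_*` bits and both helper functions are hypothetical, they are not the real Linux GFP values or SPL definitions, and the mapping of KM_SLEEP and KM_PUSHPAGE onto those bits simply restates the behavior described in this commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability bits; NOT the real GFP flag values. */
enum {
    MODEL_MAY_SLEEP = 1 << 0, /* caller may block waiting for memory */
    MODEL_MAY_IO    = 1 << 1, /* reclaim may start block I/O */
    MODEL_MAY_FS    = 1 << 2  /* reclaim may call into the filesystem */
};

/* Assumed mapping for illustration: KM_SLEEP permits everything, while
 * KM_PUSHPAGE may still sleep but never performs I/O to free memory. */
#define MODEL_KM_SLEEP    (MODEL_MAY_SLEEP | MODEL_MAY_IO | MODEL_MAY_FS)
#define MODEL_KM_PUSHPAGE (MODEL_MAY_SLEEP)

/* True if direct reclaim for this allocation could write pages out
 * through the filesystem or block layer. */
static bool model_reclaim_may_write_back(int flags)
{
    return (flags & (MODEL_MAY_IO | MODEL_MAY_FS)) != 0;
}

/* True if the allocation can deadlock: the caller holds a lock that the
 * writeback path also needs, and reclaim is allowed to trigger writeback. */
static bool model_can_deadlock(int flags, bool holds_writeback_lock)
{
    /* Reject bits this model does not define. */
    assert((flags & ~(MODEL_MAY_SLEEP | MODEL_MAY_IO | MODEL_MAY_FS)) == 0);
    return holds_writeback_lock && model_reclaim_may_write_back(flags);
}
```

Under this model, `model_can_deadlock(MODEL_KM_SLEEP, true)` is true while `model_can_deadlock(MODEL_KM_PUSHPAGE, true)` is false, which mirrors why the contexts listed above switch flags instead of relying on a thread-wide setting.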
    
    Previously, this behavior was achieved by setting PF_MEMALLOC on the
    thread.  However, that resulted in unexpected side effects such as the
    exhaustion of pages in ZONE_DMA.  This approach touches more of the zfs
    code, but it is more consistent with the right way to handle these cases
    under Linux.
    
    This patch lays the groundwork for being able to safely revert the
    following commits which used PF_MEMALLOC:
    
      21ade34 Disable direct reclaim for z_wr_* threads
      cfc9a5c Fix zpl_writepage() deadlock
      eec8164 Fix ASSERTION(!dsl_pool_sync_context(tx->tx_pool))
    
    Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#726
    ryao committed Jun 29, 2012
    0aee477