spl-kmem: Enforce architecture-specific barriers around clear_bit()
The comment above the Linux 3.17 kernel's clear_bit() states:

/**
 * clear_bit - Clears a bit in memory
 * @nr: Bit to clear
 * @addr: Address to start counting from
 *
 * clear_bit() is atomic and may not be reordered.  However, it does
 * not contain a memory barrier, so if it is used for locking purposes,
 * you should call smp_mb__before_atomic() and/or smp_mb__after_atomic()
 * in order to ensure changes are visible on other processors.
 */

This comment does not make sense in the context of x86 because x86 maps these
operations to barrier(), which is a compiler barrier. However, it does make
sense to me when I consider architectures that reorder around atomic
instructions. In such situations, a processor is allowed to execute the
wake_up_bit() before clear_bit() and we have a race. There are a few
architectures that suffer from this issue:

http://lxr.free-electrons.com/source/arch/arm/include/asm/barrier.h?v=3.16#L83
http://lxr.free-electrons.com/source/arch/arm64/include/asm/barrier.h?v=3.16#L102
http://lxr.free-electrons.com/source/arch/mips/include/asm/barrier.h?v=3.16#L199
http://lxr.free-electrons.com/source/arch/powerpc/include/asm/barrier.h?v=3.16#L88
http://lxr.free-electrons.com/source/arch/s390/include/asm/barrier.h?v=3.16#L32
http://lxr.free-electrons.com/source/arch/tile/include/asm/barrier.h?v=3.16#L83
https://en.wikipedia.org/wiki/Memory_ordering

In such situations, the waiting processor would wake up, see that the bit is
still set, and go back to sleep, while the processor responsible for the
wake-up will assume it did its job and continue.
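To make the failure mode concrete, here is a minimal sketch of the waker's
side, assuming kernel context; demo_flags and DEMO_BIT_GROWING are
hypothetical stand-ins for skc->skc_flags and the KMC_BIT_* flags:

#include <linux/bitops.h>
#include <linux/wait.h>

static unsigned long demo_flags;	/* stand-in for skc->skc_flags */
#define DEMO_BIT_GROWING	0	/* stand-in for KMC_BIT_GROWING */

static void demo_waker_broken(void)
{
	/*
	 * clear_bit() is atomic but is not a memory barrier, so a
	 * weakly-ordered CPU may make the wake-up visible before the
	 * store that clears the bit; the waiter then re-checks the
	 * bit, still sees it set, and sleeps with no wake-up coming.
	 */
	clear_bit(DEMO_BIT_GROWING, &demo_flags);
	wake_up_bit(&demo_flags, DEMO_BIT_GROWING);
}

static void demo_waker_fixed(void)
{
	clear_bit_unlock(DEMO_BIT_GROWING, &demo_flags);
	smp_mb__after_atomic();	/* order the clear before the wake-up */
	wake_up_bit(&demo_flags, DEMO_BIT_GROWING);
}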

https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/bitops.h#L100

It is important to note that
smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}() were replaced by
smp_mb__{before,after}_atomic() in recent kernels:

torvalds/linux@febdbfe

Some compatibility code was added upstream to bridge the transition in the
interim, although it does not interact well with -Werror:

http://www.spinics.net/lists/backports/msg02669.html
http://lxr.free-electrons.com/source/include/linux/bitops.h?v=3.16#L48

In addition, the kernel's own code paths use clear_bit_unlock() in situations
where clear_bit() is used for unlocking. That helper adds
smp_mb__before_atomic() before the clear, which I assume is for Alpha.
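For reference, the generic fallback for clear_bit_unlock()
(include/asm-generic/bitops/lock.h; the exact form varies by kernel version)
is roughly:

#define clear_bit_unlock(nr, addr)	\
do {					\
	smp_mb__before_atomic();	\
	clear_bit(nr, addr);		\
} while (0)

so on architectures without a native unlock primitive it is exactly the
clear_bit() pattern with the release barrier in front.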

This patch implements a wrapper that maps smp_mb__{before,after}_atomic() to
smp_mb__{before,after}_clear_bit() on older kernels and changes our code to
leverage it in a manner consistent with the mainline kernel.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao committed Nov 3, 2014
1 parent 291a4b8 commit 340b095

module/spl/spl-kmem.c: 21 additions & 4 deletions
diff --git a/module/spl/spl-kmem.c b/module/spl/spl-kmem.c
@@ -42,6 +42,19 @@
 #undef kmem_cache_alloc
 #undef kmem_cache_free
 
+/*
+ * Linux 3.17 replaced smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}()
+ * with smp_mb__{before,after}_atomic() because they were redundant. This is
+ * only used inside our SLAB allocator, so we implement an internal wrapper
+ * here to give us smp_mb__{before,after}_atomic() on older kernels.
+ */
+#ifndef smp_mb__before_atomic
+#define smp_mb__before_atomic(x)	smp_mb__before_clear_bit(x)
+#endif
+
+#ifndef smp_mb__after_atomic
+#define smp_mb__after_atomic(x)	smp_mb__after_clear_bit(x)
+#endif
 
 /*
  * Cache expiration was implemented because it was part of the default Solaris
@@ -1584,8 +1597,10 @@ spl_cache_grow_work(void *data)
 	}
 
 	atomic_dec(&skc->skc_ref);
+	smp_mb__before_atomic();
 	clear_bit(KMC_BIT_GROWING_HIGH, &skc->skc_flags);
 	clear_bit(KMC_BIT_DEADLOCKED, &skc->skc_flags);
+	smp_mb__after_atomic();
 	wake_up_all(&skc->skc_waitq);
 	spin_unlock(&skc->skc_lock);
 
@@ -1643,7 +1658,8 @@ spl_cache_grow(spl_kmem_cache_t *skc, int flags, void **obj)
 		    &skc->skc_partial_list);
 		spin_unlock(&skc->skc_lock);
 
-		clear_bit(KMC_BIT_GROWING, &skc->skc_flags);
+		clear_bit_unlock(KMC_BIT_GROWING, &skc->skc_flags);
+		smp_mb__after_atomic();
 		wake_up_bit(&skc->skc_flags, KMC_BIT_GROWING);
 	} else {
 		spl_wait_on_bit(&skc->skc_flags, KMC_BIT_GROWING,
@@ -1666,7 +1682,8 @@ spl_cache_grow(spl_kmem_cache_t *skc, int flags, void **obj)
 		ska = kmalloc(sizeof(*ska),
 		    kmem_flags_convert(flags | KM_NOSLEEP));
 		if (ska == NULL) {
-			clear_bit(KMC_BIT_GROWING_HIGH, &skc->skc_flags);
+			clear_bit_unlock(KMC_BIT_GROWING_HIGH, &skc->skc_flags);
+			smp_mb__after_atomic();
 			wake_up_all(&skc->skc_waitq);
 			SRETURN(-ENOMEM);
 		}
@@ -2135,8 +2152,8 @@ spl_kmem_cache_reap_now(spl_kmem_cache_t *skc, int count)
 	}
 
 	spl_slab_reclaim(skc, count, 1);
-	clear_bit(KMC_BIT_REAPING, &skc->skc_flags);
-	smp_mb__after_clear_bit();
+	clear_bit_unlock(KMC_BIT_REAPING, &skc->skc_flags);
+	smp_mb__after_atomic();
 	wake_up_bit(&skc->skc_flags, KMC_BIT_REAPING);
 out:
 	atomic_dec(&skc->skc_ref);
