Linux: Make zfs_prune() fair on NUMA systems
Previous code evicted nr_to_scan items from each NUMA node.  This
not only multiplied the eviction by the number of nodes, but could
exhaust the smaller ones, evicting inodes used by active workload
and requiring their immediate recreation.  This patch spreads the
requested eviction between all NUMA nodes proportionally to their
evictable counts, which should be closer to expected LRU logic.
See kernel's super_cache_scan() as a similar logic example.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes openzfs#16397
amotin authored and lundman committed Sep 4, 2024
1 parent 7977567 commit 6cd9b86
Showing 1 changed file with 13 additions and 5 deletions.
18 changes: 13 additions & 5 deletions module/os/linux/zfs/zfs_vfsops.c
@@ -1264,14 +1264,22 @@ zfs_prune(struct super_block *sb, unsigned long nr_to_scan, int *objects)
 	defined(SHRINK_CONTROL_HAS_NID) && \
 	defined(SHRINKER_NUMA_AWARE)
 	if (shrinker->flags & SHRINKER_NUMA_AWARE) {
+		long tc = 1;
+		for_each_online_node(sc.nid) {
+			long c = shrinker->count_objects(shrinker, &sc);
+			if (c == 0 || c == SHRINK_EMPTY)
+				continue;
+			tc += c;
+		}
 		*objects = 0;
 		for_each_online_node(sc.nid) {
+			long c = shrinker->count_objects(shrinker, &sc);
+			if (c == 0 || c == SHRINK_EMPTY)
+				continue;
+			if (c > tc)
+				tc = c;
+			sc.nr_to_scan = mult_frac(nr_to_scan, c, tc) + 1;
 			*objects += (*shrinker->scan_objects)(shrinker, &sc);
-			/*
-			 * reset sc.nr_to_scan, modified by
-			 * scan_objects == super_cache_scan
-			 */
-			sc.nr_to_scan = nr_to_scan;
 		}
 	} else {
 		*objects = (*shrinker->scan_objects)(shrinker, &sc);