Several sorted scrub optimizations.
 - Reduce size and comparison complexity of the q_exts_by_size B-tree.
The previous code used two 64-bit divisions and many other operations
to compare two B-tree elements, which created enormous overhead.  This
implementation moves the math to the upper level and stores the score
in the B-tree elements themselves.  Since all that we need to store in
that B-tree is the extent score and offset, those can fit into a single
8-byte value instead of the 24 bytes of a q_exts_by_addr element, and
can be compared with a single operation.
 - Better decouple the secondary tree logic from the main range_tree by
moving rt_btree_ops and related functions into dsl_scan.c as
ext_size_ops.  Those functions are too small to worry about code
duplication, and range_tree does not need to know details such as
rt_btree_compare.
 - Instead of accounting the number of pending bytes per pool, which
needs an atomic operation on a global variable for every block, account
the number of non-empty per-vdev queues, which changes much more rarely.
 - When an extent scan is interrupted by the end of a TXG, continue it
in the next TXG instead of selecting the next best extent.  This avoids
leaving one truncated (and so likely no longer the best) extent each
TXG.

On top of some other optimizations, this saves about 1.5 minutes out of
10 when scrubbing a pool of 12 SSDs storing 1.5TB of 4KB zvol blocks.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
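
The first item in the message above describes storing a precomputed score and an offset in one 8-byte B-tree element so that two elements compare with a single integer comparison. The following is a minimal sketch of that idea only. The bit split (24-bit score, 40-bit offset) and the names `ext_key_pack`, `ext_key_score`, `ext_key_offset` and `ext_key_compare` are hypothetical and are not taken from the commit; the actual layout in dsl_scan.c may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, not taken from the commit):
 * the upper 24 bits hold the extent score and the lower 40 bits hold
 * the extent offset, so keys sort by score first and offset second.
 */
#define	EXT_OFFSET_BITS	40
#define	EXT_OFFSET_MASK	((UINT64_C(1) << EXT_OFFSET_BITS) - 1)
#define	EXT_SCORE_MAX	((UINT64_C(1) << (64 - EXT_OFFSET_BITS)) - 1)

/* Pack a precomputed score and an offset into one 8-byte key. */
static inline uint64_t
ext_key_pack(uint64_t score, uint64_t offset)
{
	assert(score <= EXT_SCORE_MAX);
	assert(offset <= EXT_OFFSET_MASK);
	return ((score << EXT_OFFSET_BITS) | offset);
}

/* Recover the score from a packed key. */
static inline uint64_t
ext_key_score(uint64_t key)
{
	return (key >> EXT_OFFSET_BITS);
}

/* Recover the offset from a packed key. */
static inline uint64_t
ext_key_offset(uint64_t key)
{
	return (key & EXT_OFFSET_MASK);
}

/*
 * Comparator in the usual (const void *, const void *) shape.
 * No division or multi-field logic: one unsigned comparison.
 */
static int
ext_key_compare(const void *x, const void *y)
{
	const uint64_t a = *(const uint64_t *)x;
	const uint64_t b = *(const uint64_t *)y;

	return ((a > b) - (a < b));
}
```

The point of the design is that the expensive score calculation happens once, when an element is inserted, and never inside the comparator, which the tree calls many times per lookup.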
amotin committed Jun 21, 2022
1 parent d51f4ea commit 9e67075
Showing 4 changed files with 155 additions and 201 deletions.
2 changes: 1 addition & 1 deletion include/sys/dsl_scan.h
@@ -155,7 +155,7 @@ typedef struct dsl_scan {
dsl_scan_phys_t scn_phys; /* on disk representation of scan */
dsl_scan_phys_t scn_phys_cached;
avl_tree_t scn_queue; /* queue of datasets to scan */
-uint64_t scn_bytes_pending; /* outstanding data to issue */
+uint64_t scn_queues_pending; /* outstanding data to issue */
} dsl_scan_t;

typedef struct dsl_scan_io_queue dsl_scan_io_queue_t;
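
The rename from `scn_bytes_pending` to `scn_queues_pending` reflects the accounting change described in the commit message: the shared counter is touched only when a per-vdev queue changes between empty and non-empty, not for every block. A minimal sketch of that pattern follows, assuming C11 atomics and assuming each queue is already serialized by its own lock (not shown). The types `scan_t` and `io_queue_t` and the functions `queue_insert` and `queue_remove` are hypothetical stand-ins, not the code in dsl_scan.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the pool-wide scan state. */
typedef struct scan {
	atomic_uint_fast64_t queues_pending;	/* non-empty queues */
} scan_t;

/* Hypothetical stand-in for one per-vdev I/O queue. */
typedef struct io_queue {
	scan_t *scan;
	uint64_t nblocks;	/* protected by the queue's own lock */
} io_queue_t;

/* Add one block; touch the shared counter only on empty -> non-empty. */
static void
queue_insert(io_queue_t *q)
{
	if (q->nblocks++ == 0)
		atomic_fetch_add(&q->scan->queues_pending, 1);
}

/* Remove one block; touch the shared counter only on non-empty -> empty. */
static void
queue_remove(io_queue_t *q)
{
	assert(q->nblocks > 0);
	if (--q->nblocks == 0)
		atomic_fetch_sub(&q->scan->queues_pending, 1);
}
```

With this scheme a queue that receives thousands of blocks causes one atomic increment and, once drained, one atomic decrement, and a non-zero counter still answers the question "is any scan I/O outstanding?".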
15 changes: 2 additions & 13 deletions include/sys/range_tree.h
@@ -64,11 +64,7 @@ typedef struct range_tree {
uint8_t rt_shift;
uint64_t rt_start;
const range_tree_ops_t *rt_ops;

-/* rt_btree_compare should only be set if rt_arg is a b-tree */
void *rt_arg;
-int (*rt_btree_compare)(const void *, const void *);

uint64_t rt_gap; /* allowable inter-segment gap */

/*
@@ -278,9 +274,9 @@ rs_set_fill(range_seg_t *rs, range_tree_t *rt, uint64_t fill)

typedef void range_tree_func_t(void *arg, uint64_t start, uint64_t size);

-range_tree_t *range_tree_create_impl(const range_tree_ops_t *ops,
+range_tree_t *range_tree_create_gap(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift,
-int (*zfs_btree_compare) (const void *, const void *), uint64_t gap);
+uint64_t gap);
range_tree_t *range_tree_create(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift);
void range_tree_destroy(range_tree_t *rt);
@@ -316,13 +312,6 @@ void range_tree_remove_xor_add_segment(uint64_t start, uint64_t end,
void range_tree_remove_xor_add(range_tree_t *rt, range_tree_t *removefrom,
range_tree_t *addto);

-void rt_btree_create(range_tree_t *rt, void *arg);
-void rt_btree_destroy(range_tree_t *rt, void *arg);
-void rt_btree_add(range_tree_t *rt, range_seg_t *rs, void *arg);
-void rt_btree_remove(range_tree_t *rt, range_seg_t *rs, void *arg);
-void rt_btree_vacate(range_tree_t *rt, void *arg);
-extern const range_tree_ops_t rt_btree_ops;

#ifdef __cplusplus
}
#endif