kvserver: don't use ClearRange point deletes with estimated MVCC stats
`ClearRange` avoids dropping a Pebble range tombstone if the amount of
data that's deleted is small (<=512 KB), instead dropping point
deletions. It uses MVCC statistics to determine this. However, when
clearing an entire range, it will rely on the existing range MVCC stats
rather than computing them.

These range statistics can be highly inaccurate -- in some cases so
inaccurate that they even become negative. This in turn can cause
`ClearRange` to submit a huge write batch, which gets rejected by Raft
with `command too large`.

This patch avoids dropping point deletes if the statistics are estimated
(which is only the case when clearing an entire range). Alternatively,
it could do a full stats recomputation in this case, but entire range
deletions seem likely to be large and/or rare enough that dropping a
range tombstone is fine.
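
The decision described above can be sketched in isolation. This is a minimal illustration, not the code in `cmd_clear_range.go`: the `rangeStats` type, its `TotalBytes` field, and both helper functions are hypothetical stand-ins, and the threshold is assumed to be the 512 KB mentioned earlier. It shows how an estimated, negative total passes the old size check and is rejected by the new one.

```go
package main

import "fmt"

// thresholdBytes is assumed to match the 512 KB cutoff from the commit
// message; the real constant is ClearRangeBytesThreshold.
const thresholdBytes int64 = 512 << 10

// rangeStats is a hypothetical, stripped-down stand-in for the MVCC stats
// delta, holding only what the decision needs.
type rangeStats struct {
	ContainsEstimates int64 // non-zero when the stats are estimates
	TotalBytes        int64 // stands in for statsDelta.Total()
}

// usePointDeletesOld models the pre-patch rule: trust the total as-is.
func usePointDeletesOld(s rangeStats) bool {
	return s.TotalBytes < thresholdBytes
}

// usePointDeletesNew models the post-patch rule: only exact stats may
// select point deletes; estimated stats always get a range tombstone.
func usePointDeletesNew(s rangeStats) bool {
	return s.ContainsEstimates == 0 && s.TotalBytes < thresholdBytes
}

func main() {
	// A large range whose estimated stats have drifted negative.
	drifted := rangeStats{ContainsEstimates: 1, TotalBytes: -4096}
	fmt.Println(usePointDeletesOld(drifted)) // true: huge batch of point deletes
	fmt.Println(usePointDeletesNew(drifted)) // false: range tombstone instead

	// A genuinely small span with exact stats still uses point deletes.
	small := rangeStats{ContainsEstimates: 0, TotalBytes: 1024}
	fmt.Println(usePointDeletesNew(small)) // true
}
```

The cost of the new rule is an occasional unnecessary range tombstone when an estimated total happens to be accurate and small, which is the trade-off the paragraph above accepts.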

Release note (bug fix): Fixed a bug where deleting data via schema
changes (e.g. when dropping an index or table) could fail with a
"command too large" error.
erikgrinaker committed Jan 11, 2022
1 parent 51af1ca commit b621c07
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions pkg/kv/kvserver/batcheval/cmd_clear_range.go
@@ -97,8 +97,15 @@ func ClearRange(
 	// If the total size of data to be cleared is less than
 	// clearRangeBytesThreshold, clear the individual values with an iterator,
 	// instead of using a range tombstone (inefficient for small ranges).
-	if total := statsDelta.Total(); total < ClearRangeBytesThreshold {
-		log.VEventf(ctx, 2, "delta=%d < threshold=%d; using non-range clear", total, ClearRangeBytesThreshold)
+	//
+	// However, don't do this if the stats contain estimates -- this can only
+	// happen when we're clearing an entire range and we're using the existing
+	// range stats. We've seen cases where these estimates are wildly inaccurate
+	// (even negative), and it's better to drop an unnecessary range tombstone
+	// than to submit a huge write batch that'll get rejected by Raft.
+	if statsDelta.ContainsEstimates == 0 && statsDelta.Total() < ClearRangeBytesThreshold {
+		log.VEventf(ctx, 2, "delta=%d < threshold=%d; using non-range clear",
+			statsDelta.Total(), ClearRangeBytesThreshold)
 		iter := readWriter.NewMVCCIterator(storage.MVCCKeyAndIntentsIterKind, storage.IterOptions{
 			LowerBound: from,
 			UpperBound: to,