Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
74602: kvserver: add `WholeRowsOfSize` request parameter for scans r=erikgrinaker a=erikgrinaker **roachpb: rename `TargetBytesAllowEmpty` to `AllowEmpty`** The `TargetBytesAllowEmpty` parameter currently only applies to `TargetBytes` limits, but will shortly apply to `MaxSpanRequestKeys` as well (when the scan API becomes SQL row-aware and may discard the first row if it's incomplete). This patch therefore renames the parameter to `AllowEmpty`, to make it limit-agnostic. There are no behavioral changes. Release note: None **storage: add benchmark support for SQL row data** Release note: None **kvserver: add `WholeRowsOfSize` request parameter for scans** This patch adds a `WholeRowsOfSize` parameter to request headers, and a corresponding `ScanWholeRows` version gate, which prevents scan requests from including partial SQL rows at the end of the result. See `api.proto` for details. This implementation requires the maximum row size to be plumbed down from SQL. This is a performance optimization which allows use of a fixed-size ring buffer to keep track of the last KV byte offsets in `pebbleMVCCScanner`. Two alternative implementations were attempted: keeping track of all KV byte offsets while scanning, and decoding every key's row prefix while scanning to track the last row's starting offset. Both of these alternatives predictably had significant overhead (~10%). However, this patch comes with a 2-3% performance penalty for scans even when `WholeRowsOfSize` is disabled, due to code restructuring and additional checks. Efforts to recover this performance have stalled. ``` name old time/op new time/op delta MVCCScan_Pebble/rows=1/versions=1/valueSize=64-24 4.25µs ± 1% 4.26µs ± 1% ~ (p=0.406 n=16+16) MVCCScan_Pebble/rows=1/versions=10/valueSize=64-24 6.10µs ± 1% 6.13µs ± 1% ~ (p=0.065 n=16+14) MVCCScan_Pebble/rows=100/versions=1/valueSize=64-24 36.4µs ± 1% 37.3µs ± 1% +2.57% (p=0.000 n=16+15) MVCCScan_Pebble/rows=100/versions=10/valueSize=64-24 122µs ± 2% 123µs ± 1% ~ (p=0.068 n=16+13) MVCCScan_Pebble/rows=10000/versions=1/valueSize=64-24 2.29ms ± 1% 2.36ms ± 1% +3.08% (p=0.000 n=16+16) MVCCScan_Pebble/rows=10000/versions=10/valueSize=64-24 9.15ms ± 1% 9.22ms ± 1% +0.75% (p=0.000 n=16+16) ``` Enabling `WholeRowsOfSize` has an additional 0-1.5% penalty for large scans, as seen below. However, notice that single-key scans with `versions=10` can have penalties as high as 7%. This is because rows may omit column families that are all `NULL`, so in order to know whether the single key is a complete row or not the scan must continue to the next key. This additional scan can be relatively expensive for single-key scans if there is a lot of MVCC garbage. However, if the last row contains the final column family when the limit is hit, this penalty can be avoided: compare how `rows=3/columnFamilies=3` and `rows=1/columnFamilies=3` change from `versions=1` to `versions=10`. ``` name old time/op new time/op delta MVCCScan_PebbleSQLRows/rows=1/columnFamilies=1/versions=1/valueSize=64-24 4.38µs ± 1% 4.40µs ± 1% ~ (p=0.075 n=16+14) MVCCScan_PebbleSQLRows/rows=1/columnFamilies=1/versions=10/valueSize=64-24 6.22µs ± 1% 6.18µs ± 1% -0.66% (p=0.004 n=14+15) MVCCScan_PebbleSQLRows/rows=1/columnFamilies=3/versions=1/valueSize=64-24 4.39µs ± 1% 4.33µs ± 1% -1.20% (p=0.000 n=16+16) MVCCScan_PebbleSQLRows/rows=1/columnFamilies=3/versions=10/valueSize=64-24 6.16µs ± 1% 6.54µs ± 1% +6.12% (p=0.000 n=15+16) MVCCScan_PebbleSQLRows/rows=1/columnFamilies=10/versions=1/valueSize=64-24 4.38µs ± 1% 4.35µs ± 1% -0.64% (p=0.000 n=16+14) MVCCScan_PebbleSQLRows/rows=1/columnFamilies=10/versions=10/valueSize=64-24 6.08µs ± 1% 6.50µs ± 1% +6.88% (p=0.000 n=16+16) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=1/versions=1/valueSize=64-24 5.35µs ± 2% 5.36µs ± 1% ~ (p=0.277 n=16+15) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=1/versions=10/valueSize=64-24 9.33µs ± 1% 9.33µs ± 1% ~ (p=0.904 n=16+16) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=3/versions=1/valueSize=64-24 5.34µs ± 1% 5.46µs ± 1% +2.14% (p=0.000 n=14+16) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=3/versions=10/valueSize=64-24 9.32µs ± 2% 9.45µs ± 2% +1.41% (p=0.000 n=16+16) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=10/versions=1/valueSize=64-24 5.32µs ± 1% 5.18µs ± 1% -2.50% (p=0.000 n=16+15) MVCCScan_PebbleSQLRows/rows=3/columnFamilies=10/versions=10/valueSize=64-24 9.22µs ± 1% 9.45µs ± 1% +2.54% (p=0.000 n=16+15) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=1/versions=1/valueSize=64-24 35.9µs ± 1% 35.7µs ± 1% -0.70% (p=0.001 n=15+15) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=1/versions=10/valueSize=64-24 117µs ± 1% 117µs ± 1% ~ (p=0.323 n=16+16) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=3/versions=1/valueSize=64-24 36.7µs ± 1% 37.0µs ± 1% +0.82% (p=0.000 n=15+16) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=3/versions=10/valueSize=64-24 118µs ± 1% 119µs ± 1% +0.89% (p=0.000 n=16+16) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=10/versions=1/valueSize=64-24 35.9µs ± 0% 36.3µs ± 2% +1.26% (p=0.000 n=13+15) MVCCScan_PebbleSQLRows/rows=100/columnFamilies=10/versions=10/valueSize=64-24 116µs ± 1% 117µs ± 1% +0.77% (p=0.000 n=15+16) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=1/versions=1/valueSize=64-24 2.41ms ± 1% 2.41ms ± 1% ~ (p=0.094 n=16+16) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=1/versions=10/valueSize=64-24 9.11ms ± 1% 9.10ms ± 1% ~ (p=0.822 n=14+16) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=3/versions=1/valueSize=64-24 2.52ms ± 1% 2.53ms ± 1% ~ (p=0.591 n=14+15) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=3/versions=10/valueSize=64-24 9.28ms ± 1% 9.33ms ± 1% +0.54% (p=0.010 n=16+16) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=10/versions=1/valueSize=64-24 2.44ms ± 1% 2.46ms ± 1% +0.96% (p=0.000 n=15+13) MVCCScan_PebbleSQLRows/rows=10000/columnFamilies=10/versions=10/valueSize=64-24 9.13ms ± 1% 9.18ms ± 1% +0.59% (p=0.006 n=15+16) ``` Resolves #73618. Release note: None 75115: sql: fix call of `FindIndexWithName` r=postamar a=otan In e89093f, we added `tableDesc.FindIndexWithName(name.String())`. However, `name.String()` can return a `EncodeRestrictedSQLIdent` version of the string. This fixes the call to use `string(name)`. I couldn't produce an error with this at the moment, mostly because there is some other check preventing it from happening from the combinations I've tried. Release note: None 75139: kvserver: add `SSTTimestamp` parameter for `AddSSTable` r=dt a=erikgrinaker This patch adds an `SSTTimestamp` parameter for `AddSSTable`. When set, the client promises that all MVCC timestamps in the given SST are equal to `SSTTimestamp`. When used together with `WriteAtRequestTimestamp`, this can avoid the cost of rewriting the SST timestamps if the `SSTTimestamp` already equals the request `Timestamp`. Release note: None Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com> Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
- Loading branch information