
release-23.1: kvstreamer: improve avg response estimation for partial responses #104134

Merged
merged 1 commit into from
Jun 2, 2023

Conversation

@blathers-crl blathers-crl bot commented May 31, 2023

Backport 1/1 commits from #103602 on behalf of @yuzefovich.

/cc @cockroachdb/release


This commit fixes how the avg response estimator handles partial
responses. A "partial response" occurs when a single ScanRequest is
evaluated across multiple BatchRequests because the full response is
large enough to exceed the TargetBytes budget in use. For example, if
a ScanRequest needs to fetch 100KiB of data but the Streamer only
gives a 1KiB TargetBytes budget, then the ScanRequest will be
evaluated across 100 BatchRequests, each containing a single "partial"
response of 1KiB.

Previously, the avg response estimator treated these partial responses
independently, so in that example it would settle on an estimate of
1KiB. This is very suboptimal: the goal of the estimator is to guess
the footprint of the whole response to a single-range request. As a
result, the streamer could keep paginating the ScanResponse in very
small batches (with the estimate never increasing), leading to high
latency for query evaluation.

This commit fixes the problem by adjusting what the estimator includes
in the denominator of the average computation: only "non-resume"
responses are counted. When we receive the first paginated response to
a request, we increment the response counter, but we don't increment
it for subsequent "resume" responses. This lets the estimate "catch
up" to the footprint of a big ScanResponse that is being paginated
across multiple BatchRequests. In the example above, the estimate
grows exponentially, so instead of performing 100 BatchRequests, we
now do only 7.

This requires tracking whether we have already received a response to
a particular ScanRequest; to keep the space usage of that tracking
small, a simple utility bitmap package is introduced.
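A minimal bitmap of this kind might look as follows. This is a hypothetical sketch; the utility package introduced by the commit may expose a different API. The idea is to trade a `[]bool`'s byte per entry for a bit per entry:

```go
package main

import "fmt"

// bitmap stores one bit per index packed into uint64 words, so
// tracking n requests takes roughly n/8 bytes instead of the n bytes
// a []bool would need.
type bitmap []uint64

func newBitmap(n int) bitmap { return make(bitmap, (n+63)/64) }

// set marks index i as having received its first response.
func (b bitmap) set(i int) { b[i/64] |= 1 << (i % 64) }

// isSet reports whether index i has already received a response,
// i.e. whether the next response for it is a "resume" response.
func (b bitmap) isSet(i int) bool { return b[i/64]&(1<<(i%64)) != 0 }

func main() {
	seen := newBitmap(100)
	seen.set(42)
	fmt.Println(seen.isSet(42), seen.isSet(43)) // true false
}
```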

Impact on TPCH (average over 50 runs of `tpchvec/perf`):
```
Q1:	before: 3.21s	after: 3.23s	 0.79%
Q2:	before: 3.38s	after: 3.16s	 -6.42%
Q3:	before: 2.56s	after: 2.55s	 -0.11%
Q4:	before: 1.76s	after: 1.61s	 -8.41%
Q5:	before: 2.55s	after: 2.47s	 -3.37%
Q6:	before: 4.64s	after: 4.65s	 0.21%
Q7:	before: 5.89s	after: 5.56s	 -5.57%
Q8:	before: 1.09s	after: 1.07s	 -1.33%
Q9:	before: 5.61s	after: 5.55s	 -1.05%
Q10:	before: 2.21s	after: 2.09s	 -5.47%
Q11:	before: 0.97s	after: 0.94s	 -2.64%
Q12:	before: 4.88s	after: 4.43s	 -9.31%
Q13:	before: 1.15s	after: 1.01s	 -11.92%
Q14:	before: 0.45s	after: 0.45s	 1.16%
Q15:	before: 2.53s	after: 2.51s	 -0.58%
Q16:	before: 0.92s	after: 0.90s	 -2.38%
Q17:	before: 0.24s	after: 0.24s	 -0.58%
Q18:	before: 2.03s	after: 2.02s	 -0.46%
Q19:	before: 0.48s	after: 0.48s	 -0.99%
Q20:	before: 9.74s	after: 9.71s	 -0.38%
Q21:	before: 5.05s	after: 5.01s	 -0.81%
Q22:	before: 0.58s	after: 0.57s	 -0.38%
```

Addresses: #82164.
Fixes: #103586.

Release note: None


Release justification: bug fix.

@blathers-crl blathers-crl bot requested a review from a team as a code owner May 31, 2023 10:54
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-23.1-103602 branch from 481994e to 30f9c3f Compare May 31, 2023 10:54
@blathers-crl blathers-crl bot requested a review from cucaroach May 31, 2023 10:54
Author

blathers-crl bot commented May 31, 2023

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria below are satisfied:
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-23.1-103602 branch from 954d6f4 to ec84897 Compare May 31, 2023 10:54
@blathers-crl blathers-crl bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels May 31, 2023
@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@DrewKimball DrewKimball left a comment


:lgtm:

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @michae2, and @yuzefovich)

@yuzefovich yuzefovich merged commit cd51cf8 into release-23.1 Jun 2, 2023
@yuzefovich yuzefovich deleted the blathers/backport-release-23.1-103602 branch June 2, 2023 20:00