
sql: change index backfill merger to use batch api #77055

Merged
merged 1 commit into from
Mar 3, 2022

Conversation

rhu713
Contributor

@rhu713 rhu713 commented Feb 25, 2022

Use the Batch API instead of txn.Scan() to limit the number of bytes per
batch response in the index backfill merger.

Fixes #76685.

Release note: None
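
The core of the change is capping how many bytes a single scan response may carry and resuming where the previous batch left off. The sketch below shows that pattern in self-contained form; the types and names are hypothetical stand-ins (CockroachDB's real implementation sets a byte target on the batch request header and follows resume spans rather than slice indexes):

```go
package main

import "fmt"

type kv struct {
	key, value string
}

// scanBatch returns KVs starting at index start, stopping once targetBytes
// would be exceeded (always returning at least one KV, mirroring
// byte-target semantics), plus the resume index for the next batch.
func scanBatch(kvs []kv, start int, targetBytes int) ([]kv, int) {
	var out []kv
	bytes := 0
	for i := start; i < len(kvs); i++ {
		sz := len(kvs[i].key) + len(kvs[i].value)
		if len(out) > 0 && bytes+sz > targetBytes {
			return out, i // resume here in the next batch
		}
		out = append(out, kvs[i])
		bytes += sz
	}
	return out, len(kvs) // fully scanned
}

func main() {
	kvs := []kv{{"a", "1111"}, {"b", "2222"}, {"c", "3333"}}
	for start := 0; start < len(kvs); {
		var batch []kv
		batch, start = scanBatch(kvs, start, 10)
		fmt.Println(len(batch))
	}
}
```

Compared to a plain txn.Scan(), this keeps the memory held by any one response bounded regardless of how large the merged index span is.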

@rhu713 rhu713 requested review from a team and stevendanna and removed request for a team February 25, 2022 20:21

@rhu713 rhu713 requested a review from a team February 25, 2022 20:22
Contributor

@chengxiong-ruan chengxiong-ruan left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)

```go
settings.TenantWritable,
"bulkio.index_backfill.merge_batch_bytes",
"the max number of bytes we merge between temporary and adding indexes in a single batch",
16<<10,
```
Contributor

Where did this number come from? It feels small. Ideally what we'd do is both have this be larger and pull this much out of the memory monitor before issuing it.

At the very least, comment how this number was chosen.

Collaborator

@stevendanna stevendanna Feb 28, 2022


If we want to choose based on some prior art, we use 16 MiB over in changefeeds:

```go
const targetBytesPerScan = 16 << 20 // 16 MiB
```

That said, changefeeds run those scans in parallel, which we don't do yet here.

I opened an issue for the memory monitor here: #77123

Contributor Author

Done, I'll use 16 MiB here; from looking at indexes in the demo, it should be larger than 1000 KVs. Also added the memory monitor and pulled merge_batch_bytes out of it before sending the request.
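
The reserve-then-release interaction described here can be illustrated with a toy budget type. This is a deliberately simplified stand-in for CockroachDB's memory-monitor accounts (the real mon.BytesMonitor API differs; all names below are illustrative):

```go
package main

import (
	"errors"
	"fmt"
)

// budget is a toy stand-in for a memory-monitor account: reserve bytes
// before issuing a request, release them once the response is consumed.
type budget struct {
	limit, used int64
}

func (b *budget) grow(n int64) error {
	if b.used+n > b.limit {
		return errors.New("memory budget exceeded")
	}
	b.used += n
	return nil
}

func (b *budget) shrink(n int64) { b.used -= n }

func main() {
	const mergeBatchBytes = 16 << 20 // 16 MiB, as settled on in the review
	acct := &budget{limit: 64 << 20}

	// Reserve the batch's worst-case response size before sending it.
	if err := acct.grow(mergeBatchBytes); err != nil {
		fmt.Println("reservation failed:", err)
		return
	}
	// ... issue the batch request, process the response ...
	acct.shrink(mergeBatchBytes)
	fmt.Println("used after release:", acct.used)
}
```

Reserving the worst case up front means a burst of large merge batches fails fast with an accounting error instead of overrunning the node's memory.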

Comment on lines 324 to 325:

```proto
optional int64 chunk_size = 6 [(gogoproto.nullable) = false];
optional int64 chunk_bytes = 7 [(gogoproto.nullable) = false];
```
Contributor

In retrospect, I don't think it's a great idea to put either of these in the spec. Instead, can we just have each processor read from the setting per batch? That way you could change it at runtime without a pause-resume cycle. Just a suggestion.

Contributor Author

Done, implemented reading the setting at the beginning of each batch.
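
The benefit of reading the setting per batch rather than baking it into the processor spec is that an operator can change it mid-flight. A minimal sketch of that shape, using an atomic as a hypothetical stand-in for a cluster setting:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// mergeBatchBytes stands in for the cluster setting; re-reading it at the
// top of every batch (instead of snapshotting it into the job spec) lets a
// runtime change take effect without a pause/resume cycle.
var mergeBatchBytes atomic.Int64

func runBatches(n int) []int64 {
	sizes := make([]int64, 0, n)
	for i := 0; i < n; i++ {
		// Re-read the setting for each batch.
		sizes = append(sizes, mergeBatchBytes.Load())
	}
	return sizes
}

func main() {
	mergeBatchBytes.Store(16 << 20)
	fmt.Println(runBatches(2))
	mergeBatchBytes.Store(32 << 20) // a SET CLUSTER SETTING mid-flight
	fmt.Println(runBatches(1))
}
```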

```go
sourceKV := &kvs[i]
destKeys := make([]roachpb.Key, len(resp.Rows))
for i := range resp.Rows {
	sourceKV := &resp.Rows[i]
```
Contributor

Drive-by: it feels sort of wasteful to allocate a new byte slice here for each key in destKeys. We could amortize the allocations by using a single buffer, always appending to it, and then sub-slicing it.

Contributor Author

Done, changed the implementation to use a single buffer slice.
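
The amortization suggested above can be sketched as follows. This is an illustrative standalone version (buildKeys and its byte-slice signature are hypothetical; the real code builds roachpb.Key values):

```go
package main

import "fmt"

// buildKeys copies every source key into one shared buffer and hands out
// sub-slices of it, so the whole batch costs one backing allocation instead
// of one allocation per key.
func buildKeys(srcs [][]byte) [][]byte {
	n := 0
	for _, s := range srcs {
		n += len(s)
	}
	buf := make([]byte, 0, n) // single backing allocation
	out := make([][]byte, len(srcs))
	for i, s := range srcs {
		start := len(buf)
		buf = append(buf, s...)
		// Three-index slice caps capacity so a later append to one key
		// cannot clobber its neighbor's bytes.
		out[i] = buf[start:len(buf):len(buf)]
	}
	return out
}

func main() {
	keys := buildKeys([][]byte{[]byte("idx/1"), []byte("idx/2")})
	fmt.Printf("%s %s\n", keys[0], keys[1])
}
```

Because the buffer's capacity is computed up front, the appends never reallocate, so the returned sub-slices all stay valid views into one array.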

@rhu713 rhu713 force-pushed the ib-merge-api branch 3 times, most recently from af1f309 to a657e1c Compare March 1, 2022 14:17
Use the Batch API instead of txn.Scan() to limit the number of bytes per
batch response in the index backfill merger.

Fixes cockroachdb#76685.

Release justification: scan API change without changing functionality

Release note: None

@rhu713
Contributor Author

rhu713 commented Mar 3, 2022

bors r+

@craig craig bot merged commit 960f2b4 into cockroachdb:master Mar 3, 2022
@craig
Contributor

craig bot commented Mar 3, 2022

Build succeeded:


Successfully merging this pull request may close these issues.

sql: avoid use of txn.Scan API in IndexBackfillMerger
5 participants