sql: change index backfill merger to use batch api #77055
Conversation
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)
pkg/sql/backfill.go
settings.TenantWritable,
"bulkio.index_backfill.merge_batch_bytes",
"the max number of bytes we merge between temporary and adding indexes in a single batch",
16<<10,
Where did this number come from? It feels small. Ideally what we'd do is both have this be larger and pull this much out of the memory monitor before issuing it.
At the very least, comment how this number was chosen.
If we want to choose based on some prior art, we use 16 MB over in changefeeds:
const targetBytesPerScan = 16 << 20 // 16 MiB
I opened an issue for the memory monitor here: #77123
Done, I'll use 16 MiB here. From looking at indexes in the demo, it should be larger than 1000 KVs' worth. Also added the memory monitor and pulled the merge_batch_bytes amount out of it before sending the request.
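For illustration, a rough sketch of what that could look like, assuming CockroachDB's settings.RegisterByteSizeSetting and mon.BoundAccount APIs; the mergeBatchBytes variable and the reserveMergeBatch helper are illustrative names, not the PR's actual code:

package backfill // illustrative package name

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/settings"
	"github.com/cockroachdb/cockroach/pkg/util/mon"
)

// mergeBatchBytes bounds how many bytes a single merge batch may process.
// 16 MiB follows the changefeed targetBytesPerScan prior art mentioned above.
var mergeBatchBytes = settings.RegisterByteSizeSetting(
	settings.TenantWritable,
	"bulkio.index_backfill.merge_batch_bytes",
	"the max number of bytes we merge between temporary and adding indexes in a single batch",
	16<<20, // 16 MiB
)

// reserveMergeBatch pulls the per-batch byte budget out of the memory
// monitor's bound account before the request is sent, so concurrent mergers
// cannot collectively exceed the memory budget.
func reserveMergeBatch(ctx context.Context, acc *mon.BoundAccount, sv *settings.Values) (int64, error) {
	n := mergeBatchBytes.Get(sv)
	if err := acc.Grow(ctx, n); err != nil {
		return 0, err
	}
	return n, nil
}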
optional int64 chunk_size = 6 [(gogoproto.nullable) = false];
optional int64 chunk_bytes = 7 [(gogoproto.nullable) = false];
In retrospect, I don't think it's a great idea to put either of these in the spec. Instead, can we just have each of the processors read from the setting per batch? That way you could change it at runtime without a pause-resume cycle. Just a suggestion.
Done, implemented reading the setting at the beginning of each batch.
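A minimal sketch of that pattern, assuming the two limits live in package-level settings as in the snippet above (mergeBatchSize and mergeBatchBytes are assumed names) and that runBatch performs one bounded merge batch:

// mergeLoop re-reads the limits from the cluster settings at the top of every
// iteration, so changing the settings takes effect on the next batch without
// pausing and resuming the job.
func mergeLoop(
	ctx context.Context,
	sv *settings.Values,
	runBatch func(ctx context.Context, maxKeys, maxBytes int64) (done bool, err error),
) error {
	for {
		maxKeys := mergeBatchSize.Get(sv)   // e.g. bulkio.index_backfill.merge_batch_size
		maxBytes := mergeBatchBytes.Get(sv) // e.g. bulkio.index_backfill.merge_batch_bytes
		done, err := runBatch(ctx, maxKeys, maxBytes)
		if err != nil || done {
			return err
		}
	}
}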
sourceKV := &kvs[i]
destKeys := make([]roachpb.Key, len(resp.Rows))
for i := range resp.Rows {
	sourceKV := &resp.Rows[i]
Drive-by: it feels sort of lame to allocate a new byte slice here for each key in destKeys. We could amortize the allocations by using a single buffer slice, always appending to it, and then sub-slicing it.
Done, changed the implementation to use a single buffer slice.
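Roughly, the amortized pattern looks like the following (a sketch, not the merger's exact code; rewriteKey stands in for whatever rewrites a temporary-index key into the adding index's keyspace, and the kv/roachpb packages are the same ones used in the snippet above). Offsets are recorded first and the sub-slices are taken only after the buffer stops growing, so an append-triggered reallocation cannot invalidate earlier keys:

// makeDestKeys builds all destination keys in one shared backing buffer and
// returns sub-slices into it, avoiding a per-key allocation.
func makeDestKeys(rows []kv.KeyValue, rewriteKey func(roachpb.Key) []byte) []roachpb.Key {
	offsets := make([]int, len(rows)+1)
	var buf []byte
	for i := range rows {
		sourceKV := &rows[i]
		buf = append(buf, rewriteKey(sourceKV.Key)...)
		offsets[i+1] = len(buf)
	}
	destKeys := make([]roachpb.Key, len(rows))
	for i := range destKeys {
		destKeys[i] = roachpb.Key(buf[offsets[i]:offsets[i+1]])
	}
	return destKeys
}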
Use Batch API instead of txn.Scan() in order to limit the number of bytes per
batch response in the index backfill merger.

Fixes cockroachdb#76685.

Release justification: scan API change without changing functionality

Release note: None
bors r+
Build succeeded:
Use Batch API instead of txn.Scan() in order to limit the number of bytes per
batch response in the index backfill merger.
Fixes #76685.
Release note: None
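As a condensed sketch of the shape of this change (assuming the kv client's Batch API with Header.MaxSpanRequestKeys/TargetBytes and ResumeSpan; scanInChunks and process are illustrative names, not the PR's actual function):

// scanInChunks replaces an unbounded txn.Scan with a sequence of batches
// whose responses are limited by MaxSpanRequestKeys/TargetBytes, resuming
// from the returned ResumeSpan until the whole span has been read.
func scanInChunks(
	ctx context.Context,
	txn *kv.Txn,
	sp roachpb.Span,
	maxKeys, maxBytes int64,
	process func([]kv.KeyValue) error,
) error {
	for {
		b := txn.NewBatch()
		b.Header.MaxSpanRequestKeys = maxKeys
		b.Header.TargetBytes = maxBytes
		b.Scan(sp.Key, sp.EndKey)
		if err := txn.Run(ctx, b); err != nil {
			return err
		}
		res := b.Results[0]
		if err := process(res.Rows); err != nil {
			return err
		}
		if res.ResumeSpan == nil {
			return nil
		}
		sp = *res.ResumeSpan
	}
}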