distsqlrun: bump index joiner batch size from 100 to 10k #38622

asubiotto · 2019-07-02T20:11:52Z

To amortize the cost of looking up rows, we create a batch of 100 rows
to use in one lookup request. Since the relationship is 1:1 in the index
joiner (we're doing a lookup on the primary index), the result size is
the same size as the request batch.

This commit increases the size of this batch to 10k, increasing the
result size for each lookup to 10k. This results in some significant
performance gains: e.g. TPC-H query 6 drops to a fifth of its original
runtime on a scale factor 10 dataset due to amortizing lookups. Note that
this comes with increased memory usage per request. However, the
tableReader limits results for its scans to 10k as well, and there is no
good reason to allow normal scans to use more memory than index joiner
lookups. In the absence of proper accounting for KV responses, the
strategy of allowing index lookups to use the same resources and have
the same limitations as normal scans makes sense.
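
As a rough illustration of the amortization argument above, the sketch below counts how many lookup requests a 1:1 index join would issue at each batch size. It is not the distsqlrun code: the constant names, the helper functions, and the one-million-row input are assumptions made up for this example.

```go
package main

import "fmt"

// Hypothetical constants for this sketch; the names do not come from the
// CockroachDB source. The values are the old and new batch sizes from the PR.
const (
	oldBatchSize = 100
	newBatchSize = 10000
)

// numLookupRequests returns how many lookup requests are needed to fetch
// `rows` rows when each request carries at most `batchSize` keys.
// Because the index joiner lookup is 1:1, each request also returns at most
// `batchSize` rows.
func numLookupRequests(rows, batchSize int) int {
	if rows <= 0 || batchSize <= 0 {
		return 0
	}
	return (rows + batchSize - 1) / batchSize // ceiling division
}

// splitIntoBatches groups keys into consecutive batches of at most batchSize
// keys, mirroring how input rows would be accumulated before one lookup.
func splitIntoBatches(keys []int, batchSize int) [][]int {
	if batchSize <= 0 {
		return nil
	}
	var batches [][]int
	for start := 0; start < len(keys); start += batchSize {
		end := start + batchSize
		if end > len(keys) {
			end = len(keys)
		}
		batches = append(batches, keys[start:end])
	}
	return batches
}

func main() {
	const rows = 1000000 // illustrative input size, not a measured workload

	fmt.Println(numLookupRequests(rows, oldBatchSize)) // prints 10000
	fmt.Println(numLookupRequests(rows, newBatchSize)) // prints 100
}
```

The request count falls by the same factor as the batch size grows, while the rows held per response grow by that factor, which is the memory trade-off the description mentions.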

Release note: None

asubiotto · 2019-07-09T14:02:22Z

friendly ping

jordanlewis · 2019-07-09T14:12:49Z

LGTM. It makes me kinda nervous that we have no way to test this. Make sure to pay attention to the nightlies and so on once this goes in.

asubiotto · 2019-07-09T14:25:39Z

bors r=jordanlewis

craig · 2019-07-09T14:42:55Z

Build failed (retrying...)

GitHub CI (Cockroach)

craig · 2019-07-09T15:10:37Z

Build failed (retrying...)

GitHub CI (Cockroach)

craig · 2019-07-09T15:41:05Z

Build failed (retrying...)

GitHub CI (Cockroach)

38622: distsqlrun: bump index joiner batch size from 100 to 10k r=jordanlewis a=asubiotto

Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>

craig · 2019-07-09T16:24:02Z

Build succeeded

GitHub CI (Cockroach)

asubiotto requested review from jordanlewis and a team July 2, 2019 20:11

craig bot merged commit a45912c into cockroachdb:master Jul 9, 2019

asubiotto deleted the ijbs branch July 15, 2019 19:34

asubiotto mentioned this pull request Aug 8, 2019

distsql: change join reader batch size to be specified in bytes #39471

Closed

knz mentioned this pull request Nov 10, 2019

User-facing changes in 19.2 that were not picked up in release notes cockroachdb/docs#5819

Closed
