Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: support spill offset into disk when spilling #34212

Merged
merged 11 commits into from
May 5, 2022

Conversation

wshwsh12
Copy link
Contributor

@wshwsh12 wshwsh12 commented Apr 25, 2022

What problem does this PR solve?

Issue Number: ref #33877
close #35634

Problem Summary:

What is changed and how it works?

Support spilling offsets of each row in ListInDisk, which reduces the memory usage for spilling.

Benchmark:

Master:
BenchmarkListInDiskAdd
BenchmarkListInDiskAdd-8   	 2148649	       526.5 ns/op
BenchmarkListInDiskGetRow
BenchmarkListInDiskGetRow-8   	  403568	      2814 ns/op

This PR:
BenchmarkListInDiskAdd
BenchmarkListInDiskAdd-8   	 1991194	       576.6 ns/op
BenchmarkListInDiskGetRow
BenchmarkListInDiskGetRow-8   	  318237	      3696 ns/op

Heap Pprof: Sort 1e8 rows and spill. We can see the memory usage of chunkInDisk.WriteTo disappeared

Master:
Showing nodes accounting for 1760.58MB, 98.76% of 1782.60MB total
Dropped 212 nodes (cum <= 8.91MB)
Showing top 10 nodes out of 34
      flat  flat%   sum%        cum   cum%
  762.96MB 42.80% 42.80%   762.96MB 42.80%  github.com/pingcap/tidb/util/chunk.(*SortedRowContainer).Sort
  733.70MB 41.16% 83.96%   733.70MB 41.16%  github.com/pingcap/tidb/util/chunk.(*chunkInDisk).WriteTo
  204.52MB 11.47% 95.43%   204.52MB 11.47%  github.com/pingcap/tidb/store/copr.(*copIteratorWorker).handleCopResponse

This PR:
Showing nodes accounting for 1041.51MB, 97.18% of 1071.69MB total
Dropped 238 nodes (cum <= 5.36MB)
Showing top 20 nodes out of 33
      flat  flat%   sum%        cum   cum%
  762.95MB 71.19% 71.19%   762.95MB 71.19%  github.com/pingcap/tidb/util/chunk.(*SortedRowContainer).Sort
  204.52MB 19.08% 90.28%   204.52MB 19.08%  github.com/pingcap/tidb/store/copr.(*copIteratorWorker).handleCopResponse
   74.03MB  6.91% 97.18%    74.03MB  6.91%  google.golang.org/grpc.(*parser).recvMsg
         0     0% 97.18%    74.03MB  6.91%  github.com/pingcap/kvproto/pkg/tikvpb.(*tikvBatchCommandsClient).Recv

Performance Test: Sort 1e8 rows and set memory quota to 1GB, tidb will spill data to disk. This pr will use more disk space.

Master:
[tidb]> desc analyze  select * from v10kw big_table order by big_table.a;
+------------------------------+--------------+-----------+-----------+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
| id                           | estRows      | actRows   | task      | access object    | execution info                                                                                                                                                                                                                                                                                                                      | operator info             | memory  | disk    |
+------------------------------+--------------+-----------+-----------+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
| Sort_9                       | 100000000.00 | 100000000 | root      |                  | time:6m5.5s, loops:97658                                                                                                                                                                                                                                                                                                            | test.base_table.a         | 1.24 GB | 2.88 GB |
| └─Limit_12                   | 100000000.00 | 100000000 | root      |                  | time:23.9s, loops:97720                                                                                                                                                                                                                                                                                                             | offset:0, count:100000000 | N/A     | N/A     |
|   └─TableReader_16           | 100000000.00 | 100000554 | root      |                  | time:23.9s, loops:97719, cop_task: {num: 107, max: 1.47s, min: 198.9ms, avg: 682.2ms, p95: 1.26s, max_proc_keys: 1436388, p95_proc_keys: 1340635, tot_proc: 1m2.7s, tot_wait: 1.06s, rpc_num: 107, rpc_time: 1m13s, copr_cache_hit_ratio: 0.00}                                                                                     | data:Limit_15             | 93.9 MB | N/A     |
|     └─Limit_15               | 100000000.00 | 100248665 | cop[tikv] |                  | tikv_task:{proc max:1.15s, min:173ms, p80:763ms, p95:872ms, iters:98390, tasks:107}, scan_detail: {total_process_keys: 100248665, total_process_keys_size: 3709200605, total_keys: 100248772, rocksdb: {delete_skipped_count: 0, key_skipped_count: 100248665, block: {cache_hit_count: 76095, read_count: 0, read_byte: 0 Bytes}}} | offset:0, count:100000000 | N/A     | N/A     |
|       └─TableFullScan_14     | 100000000.00 | 100248665 | cop[tikv] | table:base_table | tikv_task:{proc max:1.15s, min:173ms, p80:763ms, p95:872ms, iters:98390, tasks:107}                                                                                                                                                                                                                                                 | keep order:false          | N/A     | N/A     |
+------------------------------+--------------+-----------+-----------+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
5 rows in set (6 min 5.717 sec)

This PR:
[tidb]> desc analyze  select * from v10kw big_table order by big_table.a;
+------------------------------+--------------+-----------+-----------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
| id                           | estRows      | actRows   | task      | access object    | execution info                                                                                                                                                                                                                                                                                                                         | operator info             | memory  | disk    |
+------------------------------+--------------+-----------+-----------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
| Sort_9                       | 100000000.00 | 100000000 | root      |                  | time:7m36.8s, loops:97658                                                                                                                                                                                                                                                                                                              | test.base_table.a         | 1.20 GB | 3.55 GB |
| └─Limit_12                   | 100000000.00 | 100000000 | root      |                  | time:26.9s, loops:97720                                                                                                                                                                                                                                                                                                                | offset:0, count:100000000 | N/A     | N/A     |
|   └─TableReader_16           | 100000000.00 | 100000554 | root      |                  | time:26.9s, loops:97719, cop_task: {num: 107, max: 2.18s, min: 196.4ms, avg: 947.8ms, p95: 2.09s, max_proc_keys: 1436388, p95_proc_keys: 1340635, tot_proc: 1m3.5s, tot_wait: 220ms, rpc_num: 107, rpc_time: 1m41.4s, copr_cache_hit_ratio: 0.00}                                                                                      | data:Limit_15             | 97.7 MB | N/A     |
|     └─Limit_15               | 100000000.00 | 100248665 | cop[tikv] |                  | tikv_task:{proc max:1.26s, min:168ms, p80:757ms, p95:974ms, iters:98390, tasks:107}, scan_detail: {total_process_keys: 100248665, total_process_keys_size: 3709200605, total_keys: 100248772, rocksdb: {delete_skipped_count: 0, key_skipped_count: 100248665, block: {cache_hit_count: 200, read_count: 75895, read_byte: 122.1 MB}}} | offset:0, count:100000000 | N/A     | N/A     |
|       └─TableFullScan_14     | 100000000.00 | 100248665 | cop[tikv] | table:base_table | tikv_task:{proc max:1.26s, min:168ms, p80:757ms, p95:974ms, iters:98390, tasks:107}                                                                                                                                                                                                                                                    | keep order:false          | N/A     | N/A     |
+------------------------------+--------------+-----------+-----------+------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------+---------+---------+
5 rows in set (7 min 37.166 sec)

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Apr 25, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • XuHuaiyu
  • hawkingrei

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 25, 2022
@sre-bot
Copy link
Contributor

sre-bot commented Apr 25, 2022

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 25, 2022
@wshwsh12 wshwsh12 force-pushed the spill-offset-for-list-in-disk branch from c572458 to 0bb0cc5 Compare April 25, 2022 07:28
offsetFile tempFileWithIOWrapper
}

type tempFileWithIOWrapper struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diskFileReaderWriter ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment for this definition

offWrite int64
fieldTypes []*types.FieldType
numRowForEachChunk []int
numRowPrefixSum []int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment for this

offWrite int64
fieldTypes []*types.FieldType
numRowForEachChunk []int
numRowPrefixSum []int

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useless empty line

return underlying
}

func (l *tempFileWithIOWrapper) getSelectionReader(off int64) *io.SectionReader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SectionReader ? not SelectionReader

return 0, err
}
if n != 8 {
return 0, errors2.New("Can not get offset from disk")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file spilled is broken, can not get data offset from the disk.

func (chk *chunkInDisk) getOffsetsOfRows() offsetsOfRows { return chk.offsetsOfRows }

// WriteTo serializes the offsetsOfRow, and writes to w.
func (off offsetsOfRows) WriteTo(w io.Writer) (written int64, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this function to line line262?

@@ -145,25 +143,47 @@ type listInDiskWriteDisk struct {
ListInDisk
}

func (l *tempFileWithIOWrapper) flushForTest() (err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems flush is not an appropriate name for now?

}

type tempFileWithIOWrapper struct {
// diskFileReaderWriter represents a Reader and a Writer for the temporary disk file,
// without considering the detail of checksum and encryption.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cipherWriter is only enabled when ...

@@ -33,16 +33,17 @@ import (
type ListInDisk struct {
fieldTypes []*types.FieldType
numRowForEachChunk []int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numRowsOfEachChunk
rowNumOfEachChunkFirstRow
totalNumRows

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 5, 2022
@wshwsh12
Copy link
Contributor Author

wshwsh12 commented May 5, 2022

/merge

1 similar comment
@XuHuaiyu
Copy link
Contributor

XuHuaiyu commented May 5, 2022

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: b636337

1 similar comment
@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: b636337

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 5, 2022
@wshwsh12
Copy link
Contributor Author

wshwsh12 commented May 5, 2022

/run-mysql-test

@wshwsh12
Copy link
Contributor Author

wshwsh12 commented May 5, 2022

/run-mysql-test

@ti-chi-bot ti-chi-bot merged commit ad6d620 into pingcap:master May 5, 2022
@sre-bot
Copy link
Contributor

sre-bot commented May 5, 2022

TiDB MergeCI notify

CI Name Result Duration Compare with Parent commit
idc-jenkins-ci/integration-cdc-test 🟢 all 34 tests passed 31 min Existing passed
idc-jenkins-ci-tidb/integration-common-test 🟢 all 11 tests passed 12 min Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2 🟢 all 28 tests passed 10 min Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 🟢 all 26 tests passed 9 min 28 sec Existing passed
idc-jenkins-ci-tidb/common-test 🟢 all 12 tests passed 8 min 24 sec Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test 🟢 all 1 tests passed 7 min 14 sec Existing passed
idc-jenkins-ci-tidb/integration-ddl-test 🟢 all 6 tests passed 7 min 3 sec Existing passed
idc-jenkins-ci-tidb/tics-test 🟢 all 1 tests passed 5 min 31 sec Existing passed
idc-jenkins-ci-tidb/mybatis-test 🟢 all 1 tests passed 3 min 57 sec Existing passed
idc-jenkins-ci-tidb/plugin-test 🟢 build success, plugin test success 4min Existing passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use non-extra-memory spilling disk implementation to avoid offsets memory usages
5 participants